开发者

Date arithmetic using integer values

Problem

String concatenation is slowing down a query:

date(extract(YEAR FROM m.taken)||'-1-1') d1,
date(extract(YEAR FROM m.taken)||'-1-31') d2

This is realized in code as part of a string, which follows (where the p_ variables are integers, provided as input by end users):

date(extract(YEAR FROM m.taken)||''-'||p_month1||'-'||p_day1||''') d1,
date(extract(YEAR FROM m.taken)||''-'||p_month2||'-'||p_day2||''') d2

This part of the query runs in 3.2 seconds with the dates, and 1.5 seconds without, leading me to believe there is ample room for improvement.

The query's total run time is under 10 seconds; am looking to bring the entire query down to about 2 or 3 seconds. A hardware upgrade has already happened. ;-)

Version

PostgreSQL 8.4.4.

Question

What is a开发者_开发知识库 better way to create the date (presumably without concatenation)?

Update

This looks promising: PGTYPESdate_mdyjul

Many thanks!


Sadly, I dont think there is other way to build a date without texts concatenation.

Yes, frankly, I dislike the aproach Postgresql have here. It seems that most date manipulation must be made by extracting date fields as integers, casting them as text, appending them to more texts to create a textual representation of a date, and then telling postgres to parse that text as date... This smells bad to me, I instictively feel that building a date by parsing a string should be only done from textual inputs. But, I think, postgresql ties too strongly the data types handling with their textual representations. And so, for example, if I want to build a date from three integer values (D,M,Y) I MUST (if I'm not mistaken) build a string and make PG parse it. I feel so unclean doing this...

Rant aside, I doubt that this can slow down much your performance.


Wow. I'm surprised, but using the functions from this page - specifically the one to build a date value from three integers - which really do nothing more expose the internal C date functions, really is a lot faster. Benchmarking for me show that creating the dates that way was much faster.

First one is the implementation of the "dateserial" function:

postgres=# select to_date(a,1,3) 
postgres-# from generate_series(100,1000000) as v(a);

Time: 1365.851 ms

postgres=# select (a::text||'-01-03')::date from 
postgres-# generate_series(100,1000000) as v(a);

Time: 3454.224 ms

Full Solution

Edit dateserial.c:

#include "postgres.h"
#include "utils/date.h"
#include "utils/nabstime.h"

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

Datum dateserial(PG_FUNCTION_ARGS);

PG_FUNCTION_INFO_V1 (dateserial);

Datum
dateserial(PG_FUNCTION_ARGS) {
  int32 p_year = PG_GETARG_INT32(0);
  int32 p_month = PG_GETARG_INT32(1);
  int32 p_day = PG_GETARG_INT32(2);

  PG_RETURN_DATEADT( date2j( p_year, p_month, p_day ) - POSTGRES_EPOCH_JDATE );
}

Edit Makefile:

MODULES = dateserial
PGXS := $(shell pg_config --pgxs)
include $(PGXS)

Edit inst.sh (optional):

#!/bin/bash

make clean && make && strip *.so && make install && /etc/init.d/postgresql-8.4 restart

Run bash inst.sh.

Create a SQL function dateserial:

CREATE OR REPLACE FUNCTION dateserial(integer, integer, integer)
  RETURNS date AS
'$libdir/dateserial', 'dateserial'
  LANGUAGE 'c' IMMUTABLE STRICT
  COST 1;
ALTER FUNCTION dateserial(integer, integer, integer) OWNER TO postgres;

Test the function:

SELECT dateserial( 2007::int, 5, 5 )


Another alternative would be to create a function index on the concatenation. This works in more general cases where there isn't a better data type available.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜