Date arithmetic using integer values
Problem
String concatenation is slowing down a query:
date(extract(YEAR FROM m.taken)||'-1-1') d1,
date(extract(YEAR FROM m.taken)||'-1-31') d2
This is realized in code as part of a string, which follows (where the p_
variables are integers, provided as input by end users):
date(extract(YEAR FROM m.taken)||''-'||p_month1||'-'||p_day1||''') d1,
date(extract(YEAR FROM m.taken)||''-'||p_month2||'-'||p_day2||''') d2
This part of the query runs in 3.2 seconds with the dates, and 1.5 seconds without, leading me to believe there is ample room for improvement.
The query's total run time is under 10 seconds; am looking to bring the entire query down to about 2 or 3 seconds. A hardware upgrade has already happened. ;-)
Version
PostgreSQL 8.4.4.
Question
What is a开发者_开发知识库 better way to create the date (presumably without concatenation)?
Update
This looks promising: PGTYPESdate_mdyjul
Many thanks!
Sadly, I dont think there is other way to build a date without texts concatenation.
Yes, frankly, I dislike the aproach Postgresql have here. It seems that most date manipulation must be made by extracting date fields as integers, casting them as text, appending them to more texts to create a textual representation of a date, and then telling postgres to parse that text as date... This smells bad to me, I instictively feel that building a date by parsing a string should be only done from textual inputs. But, I think, postgresql ties too strongly the data types handling with their textual representations. And so, for example, if I want to build a date from three integer values (D,M,Y) I MUST (if I'm not mistaken) build a string and make PG parse it. I feel so unclean doing this...
Rant aside, I doubt that this can slow down much your performance.
Wow. I'm surprised, but using the functions from this page - specifically the one to build a date value from three integers - which really do nothing more expose the internal C date functions, really is a lot faster. Benchmarking for me show that creating the dates that way was much faster.
First one is the implementation of the "dateserial" function:
postgres=# select to_date(a,1,3)
postgres-# from generate_series(100,1000000) as v(a);
Time: 1365.851 ms
postgres=# select (a::text||'-01-03')::date from
postgres-# generate_series(100,1000000) as v(a);
Time: 3454.224 ms
Full Solution
Edit dateserial.c
:
#include "postgres.h"
#include "utils/date.h"
#include "utils/nabstime.h"
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif
Datum dateserial(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1 (dateserial);
Datum
dateserial(PG_FUNCTION_ARGS) {
int32 p_year = PG_GETARG_INT32(0);
int32 p_month = PG_GETARG_INT32(1);
int32 p_day = PG_GETARG_INT32(2);
PG_RETURN_DATEADT( date2j( p_year, p_month, p_day ) - POSTGRES_EPOCH_JDATE );
}
Edit Makefile
:
MODULES = dateserial
PGXS := $(shell pg_config --pgxs)
include $(PGXS)
Edit inst.sh
(optional):
#!/bin/bash
make clean && make && strip *.so && make install && /etc/init.d/postgresql-8.4 restart
Run bash inst.sh
.
Create a SQL function dateserial
:
CREATE OR REPLACE FUNCTION dateserial(integer, integer, integer)
RETURNS date AS
'$libdir/dateserial', 'dateserial'
LANGUAGE 'c' IMMUTABLE STRICT
COST 1;
ALTER FUNCTION dateserial(integer, integer, integer) OWNER TO postgres;
Test the function:
SELECT dateserial( 2007::int, 5, 5 )
Another alternative would be to create a function index on the concatenation. This works in more general cases where there isn't a better data type available.
精彩评论