How to set value of composite variable field using dynamic SQL
Given this type:
-- Just for testing purposes:
CREATE TYPE testType as (name text)
I can get the value of a field dynamically with this function:
CREATE OR REPLACE FUNCTION get_field(object anyelement, field text) RETURNS text as
$BODY$
DECLARE
value text;
BEGIN
EXECUTE 'SELECT $1."' || field || '"'
USING object
INTO value;
return value;
END;
$BODY$
LANGUAGE plpgsql
Calling get_field('(david)':开发者_StackOverflow社区:testType, 'name')
works as expected returning "david".
But how can I set a value of a field in a composite type? I've tried these functions:
CREATE OR REPLACE FUNCTION set_field_try1(object anyelement, field text, value text)
RETURNS anyelement
as
$BODY$
DECLARE
value text;
BEGIN
EXECUTE '$1."' || field || '" := $2'
USING object, value;
return object;
END;
$BODY$
LANGUAGE plpgsql
CREATE OR REPLACE FUNCTION set_field_try2(object anyelement, field text, value text)
RETURNS anyelement
as
$BODY$
DECLARE
value text;
BEGIN
EXECUTE 'SELECT $1 INTO $2."' || field || '"'
USING value, object;
return object;
END;
$BODY$
LANGUAGE plpgsql
CREATE OR REPLACE FUNCTION set_field_try3(object anyelement, field text, value text)
RETURNS anyelement
as
$BODY$
DECLARE
value text;
BEGIN
EXECUTE 'BEGIN $1."' || field || '" := $2; SELECT $1; END;'
INTO object
USING value, object;
return object;
END;
$BODY$
LANGUAGE plpgsql
and some variations.
Calling set_field_tryX
doesn't work. I always get "ERROR: syntax error at or near...".
How can I accomplish this?
Notes:
- The parameter is
anyelement
and the field can be any field in the composite type. I can't just use object.name. - I'm concerned about SQL injection. Any advice in this would be appreciated but it is not my question.
Faster with hstore
Since Postgres 9.0, with the additional module hstore
installed in your database there is a very simple and fast solution with the #=
operator that ...
replace[s] fields in
record
with matching values fromhstore
.
To install the module:
CREATE EXTENSION hstore;
Examples:
SELECT my_record #= '"field"=>"value"'::hstore; -- with string literal
SELECT my_record #= hstore(field, value); -- with values
Values have to be cast to text
and back, obviously.
Example plpgsql functions with more details:
- Endless loop in trigger function
- Assign to NEW by key in a Postgres trigger
Now works with json
/ jsonb
, too!
There are similar solutions with json
(pg 9.3+) or jsonb
(pg 9.4+)
SELECT json_populate_record (my_record, json_build_object('key', 'new-value');
The functionality was undocumented, but it's official since Postgres 13. The manual:
However, if base isn't NULL then the values it contains will be used for unmatched columns.
So you can take any existing row and fill arbitrary fields (overwriting what's in it).
Major advantages of json
vs hstore
:
- works with stock Postgres so you don't need an additional module.
- also works for nested array and composite types.
Minor disadvantage: a bit slower.
See @Geir's added answer for details.
Without hstore
and json
If you are on an older version or cannot install the additional module hstore
or cannot assume it's installed, here is an improved version of what I posted previously. Still slower than the hstore
operator, though:
CREATE OR REPLACE FUNCTION f_setfield(INOUT _comp_val anyelement
, _field text, _val text)
RETURNS anyelement
LANGUAGE plpgsql STABLE AS
$func$
BEGIN
EXECUTE 'SELECT ' || array_to_string(ARRAY(
SELECT CASE WHEN attname = _field
THEN '$2'
ELSE '($1).' || quote_ident(attname)
END AS fld
FROM pg_catalog.pg_attribute
WHERE attrelid = pg_typeof(_comp_val)::text::regclass
AND attnum > 0
AND attisdropped = FALSE
ORDER BY attnum
), ',')
USING _comp_val, _val
INTO _comp_val;
END
$func$;
Call:
CREATE TEMP TABLE t( a int, b text); -- Composite type for testing
SELECT f_setfield(NULL::t, 'a', '1');
Notes
An explicit cast of the value
_val
to the target data type is not necessary, a string literal in the dynamic query would be coerced automatically, obviating the subquery onpg_type
. But I took it one step further:Replace
quote_literal(_val)
with direct value insertion via theUSING
clause. Saves one function call and two casts, and is safer anyway.text
is coerced to the target type automatically in modern PostgreSQL. (Did not test with versions before 9.1.)array_to_string(ARRAY())
is faster thanstring_agg()
.No variables needed, no
DECLARE
. Fewer assignments.No subquery in the dynamic SQL.
($1).field
is faster.pg_typeof(_comp_val)::text::regclass
does the same as
(SELECT typrelid FROM pg_catalog.pg_type WHERE oid = pg_typeof($1)::oid)
for valid composite types, just faster.
This last modification is built on the assumption thatpg_type.typname
is always identical to the associatedpg_class.relname
for registered composite types, and the double cast can replace the subquery. I ran this test in a big database to verify, and it came up empty as expected:
SELECT *
FROM pg_catalog.pg_type t
JOIN pg_namespace n ON n.oid = t.typnamespace
WHERE t.typrelid > 0 -- exclude non-composite types
AND t.typrelid IS DISTINCT FROM
(quote_ident(n.nspname ) || '.' || quote_ident(typname))::regclass
- The use of an
INOUT
parameter obviates the need for an explicitRETURN
. This is just a notational shortcut. Pavel won't like it, he prefers an explicitRETURN
statement ...
Everything put together this is twice as fast as the previous version.
Original (outdated) answer:
The result is a version that's ~ 2.25 times faster. But I probably couldn't have done it without building on Pavel's second version.
In addition, this version avoids most of the casting to text and back by doing everything within a single query, so it should be much less error prone.
Tested with PostgreSQL 9.0 and 9.1.
CREATE FUNCTION f_setfield(_comp_val anyelement, _field text, _val text)
RETURNS anyelement
LANGUAGE plpgsql STABLE AS
$func$
DECLARE
_list text;
BEGIN
_list := (
SELECT string_agg(x.fld, ',')
FROM (
SELECT CASE WHEN a.attname = $2
THEN quote_literal($3) || '::'|| (SELECT quote_ident(typname)
FROM pg_catalog.pg_type
WHERE oid = a.atttypid)
ELSE quote_ident(a.attname)
END AS fld
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = (SELECT typrelid
FROM pg_catalog.pg_type
WHERE oid = pg_typeof($1)::oid)
AND a.attnum > 0
AND a.attisdropped = false
ORDER BY a.attnum
) x
);
EXECUTE 'SELECT ' || _list || ' FROM (SELECT $1.*) x'
USING $1
INTO $1;
RETURN $1;
END
$func$;
I wrote a second version of setfield function. It work on postgres 9.1 I didn't test it on older versions. It's not a miracle (from performance view), but it is more robust and about 8 times faster than the previous.
CREATE OR REPLACE FUNCTION public.setfield2(anyelement, text, text)
RETURNS anyelement
LANGUAGE plpgsql
AS $function$
DECLARE
_name text;
_values text[];
_value text;
_attnum int;
BEGIN
FOR _name, _attnum
IN SELECT a.attname, a.attnum
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = (SELECT typrelid
FROM pg_type
WHERE oid = pg_typeof($1)::oid)
AND a.attnum > 0
LOOP
IF _name = $2 THEN
_value := $3;
ELSE
EXECUTE 'SELECT (($1).' || quote_ident(_name) || ')::text' INTO _value USING $1;
END IF;
_values[_attnum] := COALESCE('"' || replace(replace(_value, '"', '""'), '''', '''''') || '"', '');
END LOOP;
EXECUTE 'SELECT (' || quote_ident(pg_typeof($1)::text) || ' ''(' || array_to_string(_values,',') || ')'').*' INTO $1;
RETURN $1;
END;
$function$;
UPDATE/caution: Erwin points out that this is currently undocumented, and the manual indicates it should not be possible to alter records this way.
Use hstore or Pavel's solution instead.
This simple json based solution is almost as fast as hstore, and requires only Postgres 9.3 or newer. This should be a good option if you can't use the hstore extension, and the performance difference should be negligible. Benchmarks: https://stackoverflow.com/a/28673542/1914376
a) We can either do it inline by cast/concat. Json function requires Postgres 9.3:
SELECT json_populate_record(
record
, ('{"'||'key'||'":"'||'new-value'||'"}')::json
);
b) or inline by using functions from Postgres 9.4.
SELECT json_populate_record (
record
,json_object(ARRAY['key', 'new-value'])
);
Note: I chose json_object(ARRAY[key,value]) since it was a bit faster than json_build_object(key,value):
To hide the casting details you can use a) in a function, with little overhead.
CREATE FUNCTION x.setfield_json(in_element anyelement, key text, value text)
RETURNS anyelement AS
$BODY$
SELECT json_populate_record( in_element, ('{"'||key||'":"'||value||'"}')::json);
$BODY$ LANGUAGE sql;
"SELECT INTO" outside plpgsql (in dynamic SQL context) has different sense than you expect - it store a result of query to table.
Modification of any field is possible, but not simple
CREATE OR REPLACE FUNCTION public.setfield(a anyelement, text, text)
RETURNS anyelement
LANGUAGE plpgsql
AS $function$
begin
create temp table aux as select $1.*;
execute 'update aux set ' || quote_ident($2) || ' = ' || quote_literal($3);
select into $1 * from aux;
drop table aux;
return $1;
end;
$function$
But this code is not very effective - is not possible to write this well in plpgsql. You can find some C library, that should to do.
Test setup and benchmarks v2
Erwin encouraged to reproduce his benchmark in this thread (https://stackoverflow.com/a/7782839/1914376), so I modified his code with synthetic test data and added both the hstore solution and the json-solution from my answer (and a json solution by Pavel found in another thread) The benchmark is now run as one query, making it easier to capture the results.
DROP SCHEMA IF EXISTS x CASCADE;
CREATE SCHEMA x;
-- Pavel 1:
--------------------------------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION x.setfield(anyelement, text, text)
RETURNS anyelement
LANGUAGE plpgsql
AS $function$
begin
create temp table aux as select $1.*;
execute 'update aux set ' || quote_ident($2) || ' = ' || quote_literal($3);
select into $1 * from aux;
drop table aux;
return $1;
end;
$function$;
-- Pavel 2 (with patches)
--------------------------------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION x.setfield2(anyelement, text, text)
RETURNS anyelement
LANGUAGE plpgsql
AS $function$
DECLARE
_name text;
_values text[];
_value text;
_attnum int;
BEGIN
FOR _name, _attnum
IN SELECT a.attname, a.attnum
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = (SELECT typrelid
FROM pg_type
WHERE oid = pg_typeof($1)::oid)
LOOP
IF _name = $2 THEN
_value := $3;
ELSE
EXECUTE 'SELECT (($1).' || quote_ident(_name) || ')::text' INTO _value USING $1;
END IF;
_values[_attnum] := COALESCE('"' || replace(replace(_value, '"', '""'), '''', '''''') || '"', '');
END LOOP;
EXECUTE 'SELECT (' || pg_typeof($1)::text || '''(' || array_to_string(_values,',') || ')'').*' INTO $1;
RETURN $1;
END;
$function$;
-- Erwin 1
--------------------------------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION x.setfield3(anyelement, text, text)
RETURNS anyelement
AS $body$
DECLARE
_list text;
BEGIN
_list := (
SELECT string_agg(x.fld, ',')
FROM (
SELECT CASE WHEN a.attname = $2
THEN quote_literal($3)
ELSE quote_ident(a.attname)
END AS fld
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = (SELECT typrelid
FROM pg_type
WHERE oid = pg_typeof($1)::oid)
ORDER BY a.attnum
) x
);
EXECUTE '
SELECT ' || _list || '
FROM (SELECT $1.*) x'
USING $1
INTO $1;
RETURN $1;
END;
$body$ LANGUAGE plpgsql;
-- Erwin 2
--------------------------------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION x.setfield4(INOUT _comp_val anyelement
, _field text, _val text)
RETURNS anyelement AS
$func$
BEGIN
EXECUTE 'SELECT ' || array_to_string(ARRAY(
SELECT CASE WHEN attname = _field
THEN '$2'
ELSE '($1).' || quote_ident(attname)
END AS fld
FROM pg_catalog.pg_attribute
WHERE attrelid = pg_typeof(_comp_val)::text::regclass
AND attnum > 0
AND attisdropped = FALSE
ORDER BY attnum
), ',')
USING _comp_val, _val
INTO _comp_val;
END
$func$ LANGUAGE plpgsql;
-- Pavel 3: json. (Postgres 9.4)
-- Found here: https://stackoverflow.com/a/28284491/1914376
--------------------------------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION x.setfield5(r anyelement, fn text, val text,OUT result anyelement)
RETURNS anyelement
LANGUAGE plpgsql
AS $function$
declare jo json;
begin
jo := (select json_object(array_agg(key),
array_agg(case key when fn then val
else value end))
from json_each_text(row_to_json(r)));
result := json_populate_record(r, jo);
end;
$function$;
-- Json. Use built-in json functions (Postgres 9.3)
-- This is available from 9.3 since we create json by casting
-- instead of using json_object/json_build_object only available from 9.4
--------------------------------------------------------------------------------------------------
CREATE FUNCTION x.setfield_json(in_element anyelement, key text, value text)
RETURNS anyelement AS
$BODY$
SELECT json_populate_record( in_element, ('{"'||key||'":"'||value||'"}')::json);
$BODY$ LANGUAGE sql;
--------------------------------------------------------------------------------------------------
-- Test setup
--------------------------------------------------------------------------------------------------
-- composite type for tests.
CREATE TYPE x.t_f as (
id int
,company text
,sort text
,log_up timestamp
,log_upby smallint
);
-- Create temp table with synthetic test data
DROP TABLE IF EXISTS tmp_f;
CREATE TEMP table tmp_f AS
SELECT ROW(i, 'company'||i, NULL, NULL, NULL)::x.t_f AS f
FROM generate_series(1, 5000) S(i);
-- Run the benchmark
DO $$ DECLARE start_time timestamptz; test_count integer; test_description TEXT; BEGIN
test_count := 200;
test_description := 'setfield, Pavel 1: temptable';
start_time := clock_timestamp();
PERFORM x.setfield (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;
test_count := 5000;
test_description := 'setfield2, Pavel 2: reflection';
start_time := clock_timestamp();
PERFORM x.setfield2 (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;
test_count := 5000;
test_description := 'setfield3, Erwin 1: reflection';
start_time := clock_timestamp();
PERFORM x.setfield3 (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;
test_count := 5000;
test_description := 'setfield4, Erwin 2: reflection';
start_time := clock_timestamp();
PERFORM x.setfield4 (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;
test_count := 5000;
test_description := 'setfield5, Pavel 3: json (PG 9.4)';
start_time := clock_timestamp();
PERFORM x.setfield5 (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;
test_count := 5000;
test_description := 'setfield_json, Geir 1: casting (PG 9.3)';
start_time := clock_timestamp();
PERFORM x.setfield_json (f, 'company','new-value-'||md5(random()::text)) FROM tmp_f LIMIT test_count;
RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;
--json_object(ARRAY(key,value]) is actually faster than json_build_object(key, value)
test_count := 5000;
test_description := 'no function/inlined: json_object (PG 9.4)';
start_time := clock_timestamp();
PERFORM json_populate_record( f, json_object(ARRAY['company', 'new-value'||md5(random()::text)] )) FROM tmp_f LIMIT test_count;
RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;
test_count := 5000;
test_description := 'no function/inlined: hstore (PG 9.0)';
start_time := clock_timestamp();
PERFORM f #= hstore('company', 'new-value'||md5(random()::text)) FROM tmp_f LIMIT test_count;
RAISE NOTICE 'Test took: % ms (for % rows) Name: %', extract(MILLISECONDS FROM (clock_timestamp() - start_time))::INTEGER, test_count, test_description;
END; $$;
Test results on 9.4.1, win32, i5-4300U
NOTICE: Test took: 1138 ms (for 200 rows) Name: setfield, Pavel 1: temptable
NOTICE: Test took: 652 ms (for 5000 rows) Name: setfield2, Pavel 2: reflection
NOTICE: Test took: 364 ms (for 5000 rows) Name: setfield3, Erwin 1: reflection
NOTICE: Test took: 275 ms (for 5000 rows) Name: setfield4, Erwin 2: reflection
NOTICE: Test took: 192 ms (for 5000 rows) Name: setfield5, Pavel 3: json (PG 9.4)
NOTICE: Test took: 23 ms (for 5000 rows) Name: setfield_json, Geir 1: casting (PG 9.3)
NOTICE: Test took: 25 ms (for 5000 rows) Name: no function/inlined: json_object (PG 9.4)
NOTICE: Test took: 14 ms (for 5000 rows) Name: no function/inlined: hstore (PG 9.0)
Update March 2015:
Largely outdated now. Consider the new benchmark by @Geir with faster variants.
Test setup and benchmarks
I took the three solutions presented (by Oct. 16th, 2011) and ran a test on PostgreSQL 9.0. You find the complete setup below. Only test data are not included as I used a real life database (not synthetic data). It's all encapsulated in its own schema for non-intrusive use.
I would like to encourage anybody who wants to reproduce the test. Maybe with postgres 9.1? And add your results here? :)
-- DROP SCHEMA x CASCADE;
CREATE SCHEMA x;
-- Pavel 1
CREATE OR REPLACE FUNCTION x.setfield(anyelement, text, text)
RETURNS anyelement
LANGUAGE plpgsql
AS $function$
begin
create temp table aux as select $1.*;
execute 'update aux set ' || quote_ident($2) || ' = ' || quote_literal($3);
select into $1 * from aux;
drop table aux;
return $1;
end;
$function$;
-- Pavel 2 (with patches)
CREATE OR REPLACE FUNCTION x.setfield2(anyelement, text, text)
RETURNS anyelement
LANGUAGE plpgsql
AS $function$
DECLARE
_name text;
_values text[];
_value text;
_attnum int;
BEGIN
FOR _name, _attnum
IN SELECT a.attname, a.attnum
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = (SELECT typrelid
FROM pg_type
WHERE oid = pg_typeof($1)::oid)
LOOP
IF _name = $2 THEN
_value := $3;
ELSE
EXECUTE 'SELECT (($1).' || quote_ident(_name) || ')::text' INTO _value USING $1;
END IF;
_values[_attnum] := COALESCE('"' || replace(replace(_value, '"', '""'), '''', '''''') || '"', '');
END LOOP;
EXECUTE 'SELECT (' || pg_typeof($1)::text || '''(' || array_to_string(_values,',') || ')'').*' INTO $1;
RETURN $1;
END;
$function$;
-- Erwin 1
CREATE OR REPLACE FUNCTION x.setfield3(anyelement, text, text)
RETURNS anyelement
AS $body$
DECLARE
_list text;
BEGIN
_list := (
SELECT string_agg(x.fld, ',')
FROM (
SELECT CASE WHEN a.attname = $2
THEN quote_literal($3)
ELSE quote_ident(a.attname)
END AS fld
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = (SELECT typrelid
FROM pg_type
WHERE oid = pg_typeof($1)::oid)
ORDER BY a.attnum
) x
);
EXECUTE '
SELECT ' || _list || '
FROM (SELECT $1.*) x'
USING $1
INTO $1;
RETURN $1;
END;
$body$ LANGUAGE plpgsql;
-- composite type for tests.
CREATE TYPE x.t_f as (
id int
,company text
,sort text
,log_up timestamp
,log_upby smallint
);
-- temp table with real life test data
DROP TABLE IF EXISTS tmp_f;
CREATE TEMP table tmp_f AS
SELECT ROW(firma_id,firma,sort,log_up,log_upby)::x.t_f AS f
FROM ef.firma
WHERE firma !~~ '"%';
-- SELECT count(*) FROM tmp_f; -- 5183
-- Quick test: results are identical?
SELECT *,
x.setfield (f, 'company','test')
,x.setfield2(f, 'company','test')
,x.setfield3(f, 'company','test')
FROM tmp_f
LIMIT 10;
Benchmarks
I ran the queries a couple of times to populate the cache. The presented results are the best of five total runtimes with EXPLAIN ANALYZE
.
Rirst round with 1000 rows
Pavel's first prototype maxes out shared memory with more rows.
Pavel 1: 2445.112 ms
SELECT x.setfield (f, 'company','test') FROM tmp_f limit 1000;
Pavel 2: 263.753 ms
SELECT x.setfield2(f, 'company','test') FROM tmp_f limit 1000;
Erwin 1: 120.671 ms
SELECT x.setfield3(f, 'company','test') FROM tmp_f limit 1000;
Another test with 5183 rows.
Pavel 2: 1327.429 ms
SELECT x.setfield2(f, 'company','test') FROM tmp_f;
Erwin1: 588.691 ms
SELECT x.setfield3(f, 'company','test') FROM tmp_f;
精彩评论