PostgreSQL string character replacement
I'm trying to write a lexical database for storing words comprised of roots and patterns, and I was wondering how I could create a column that will combine the root and pattern for me, while ignor开发者_Python百科ing rows which don't have both columns of the SELECT
query populated.
Basically, I have this output from a PostgreSQL DB:
SELECT root, root_i FROM tbl_roots NATURAL JOIN tbl_patterns NATURAL JOIN tbl_patterns_triliteral;
root | root_i
---------+--------
{s,ş,m} | 1u2u3a
{p,l,t} | 1u2u3a
{t,m,s} | 1u2u3a
{n,t,l} | 1u2u3a
{s,ş,m} | 1a2oi3
{p,l,t} | 1a2oi3
{t,m,s} | 1a2oi3
{n,t,l} | 1a2oi3
{s,ş,m} | 1o2i3
{p,l,t} | 1o2i3
{t,m,s} | 1o2i3
{n,t,l} | 1o2i3
{s,ş,m} | a12e3
{p,l,t} | a12e3
{t,m,s} | a12e3
{n,t,l} | a12e3
{s,ş,m} | 1u2á3
{p,l,t} | 1u2á3
{t,m,s} | 1u2á3
{n,t,l} | 1u2á3
{s,ş,m} |
{p,l,t} |
{t,m,s} |
{n,t,l} |
{s,ş,m} | 1e2é3
{p,l,t} | 1e2é3
{t,m,s} | 1e2é3
{n,t,l} | 1e2é3
{s,ş,m} |
{p,l,t} |
{t,m,s} |
{n,t,l} |
{s,ş,m} |
{p,l,t} |
{t,m,s} |
{n,t,l} |
{s,ş,m} |
{p,l,t} |
{t,m,s} |
{n,t,l} |
And I want to convert it on-the-fly into something resembling this:
root | root_i | word_i
---------+--------+--------
{s,ş,m} | 1u2u3a | suşuma
{p,l,t} | 1u2u3a | puluta
{t,m,s} | 1u2u3a | tumusa
{n,t,l} | 1u2u3a | nutula
{s,ş,m} | 1a2oi3 | saşoim
{p,l,t} | 1a2oi3 | paloit
{t,m,s} | 1a2oi3 | tamois
{n,t,l} | 1a2oi3 | natoil
{s,ş,m} | 1o2i3 | soşim
{p,l,t} | 1o2i3 | polit
{t,m,s} | 1o2i3 | tomis
{n,t,l} | 1o2i3 | notil
{s,ş,m} | a12e3 | asşem
{p,l,t} | a12e3 | aplet
{t,m,s} | a12e3 | atmes
{n,t,l} | a12e3 | antel
{s,ş,m} | 1u2á3 | suşám
{p,l,t} | 1u2á3 | pulát
{t,m,s} | 1u2á3 | tumás
{n,t,l} | 1u2á3 | nutál
{s,ş,m} | 1e2é3 | seşém
{p,l,t} | 1e2é3 | pelét
{t,m,s} | 1e2é3 | temés
{n,t,l} | 1e2é3 | neşél
Where the word
column is dynamically generated by replacing the digits in the root_i
column with the character in that digit's index in the root
column. I also need to drop queried rows that don't have an entry in both columns to reduce clutter in my output.
Can anyone help me devise a postgres function that will do the merging of character[] and text strings? The little bit of regex I need shouldn't be complicated, but I have no idea how to mix this with a query, or better yet, turn it into a function.
select
root,
root_i,
translate(root_i, "123", array_to_string(root,'')) as word_i
NATURAL JOIN tbl_patterns
NATURAL JOIN tbl_patterns_triliteral
where root is not null and root_i is not null;
I must admit to not liking to do much string manipulation in sql/plpgsql functions. Perl has an operator for replacing regexp matches with generated replacements, which works fairly nicely:
create or replace function splice_to_word(root text, root_i text)
returns text strict immutable language plperl as $$
my $roots = shift;
my $template = shift;
$template =~ s{(\d+)}{substr($roots,$1-1,1)}ge;
return $template;
$$;
There is some nastiness in that postgresql arrays don't seem to be translated into Perl lists, so I've assumed the roots are passed in as a string, e.g.:
select root, root_i, splice_to_word(array_to_string(root, ''), root_i) from data
精彩评论