Postgres regular expressions and regexp_split_to_array

2023-02-17 20:16

In PostgreSQL, I need to extract the first two words of the value in a given column. So if the value is "hello world moon and stars" or "hello world moon" or even just "hello world", I need "hello world".

I was hoping to use regexp_split_to_array, but it doesn't seem that I can call it and access the returned elements in the same query?

Do I need to create a function for what I'm trying to do?

I can't believe that in 5 years no one noticed that you can access elements of the array returned by regexp_split_to_array if you wrap the function call in parentheses.

I saw many people try to access the elements of the array like this:

select regexp_split_to_array(my_field, E'my_pattern')[1] from my_table

The query above raises a syntax error, but the following does not:

select (regexp_split_to_array(my_field, E'my_pattern'))[1] from my_table
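
A concrete illustration of the parenthesized form, using a literal string instead of a table column (the sample value and the column aliases are assumptions made only for demonstration). PostgreSQL arrays are 1-based by default, so [1] is the first element:

```sql
-- Wrap the function call in parentheses before applying a subscript.
select (regexp_split_to_array('hello world moon', E'\\s+'))[1] as first_word,
       (regexp_split_to_array('hello world moon', E'\\s+'))[2] as second_word;
-- first_word = 'hello', second_word = 'world'
```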

You can use POSIX regular expressions with PostgreSQL's substring():

select substring('hello world moon' from E'^\\w+\\s+\\w+');

Or with a very liberal interpretation of what a word is:

select substring('it''s a nice day' from E'^\\S+\\s+\\S+');

Note the \S (non-whitespace) instead of \w ("word" character, essentially alphanumeric plus underscore).

Don't forget all the extra quoting nonsense though:

The E'' prefix tells PostgreSQL that you're using escape string syntax.
And then double backslashes get single backslashes past the string parser and into the regular expression parser.
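
As a rough comparison of the quoting options, the three spellings below should hand the same regular expression to substring(). This sketch assumes standard_conforming_strings is on (the default since PostgreSQL 9.1); the sample string is only for illustration:

```sql
-- 1. Escape string syntax: every backslash must be doubled.
select substring('hello world moon' from E'^\\S+\\s+\\S+');

-- 2. Plain literal: backslashes are taken literally when
--    standard_conforming_strings is on.
select substring('hello world moon' from '^\S+\s+\S+');

-- 3. Dollar quoting: no escaping of backslashes or single quotes.
select substring('hello world moon' from $$^\S+\s+\S+$$);

-- Each query returns 'hello world'.
```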

If you really want to use regexp_split_to_array, you can, but the quoting issues above still apply, and I think you'd want to slice off just the first two elements of the array:

select (regexp_split_to_array('hello world moon', E'\\s+'))[1:2];
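
The slice is still an array ({hello,world}), not a string. One way to get the "hello world" text the question asks for is to join the slice with array_to_string; the alias first_two_words is just a label chosen for this sketch:

```sql
-- Take the first two elements and join them with a single space.
select array_to_string(
         (regexp_split_to_array('hello world moon', E'\\s+'))[1:2],
         ' '
       ) as first_two_words;
-- first_two_words = 'hello world'
```

If the input has only one word, the slice simply contains that one element and the result is the single word.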

I'd guess that the escaping was causing some confusion; I usually end up adding backslashes until it works and then I pick it apart until I understand why I needed the number of backslashes that I ended up using. Or maybe the extra parentheses and array slicing syntax were the issue (they were for me, but a bit of experimentation sorted it out).

Found one answer:

select split_part('hello world moon', ' ', 1) || ' ' || split_part('hello world moon', ' ', 2);
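
One caveat worth checking against your data: split_part splits on a literal delimiter, not a regular expression, so consecutive spaces produce empty fields. A small sketch, with a made-up input containing two spaces, and one possible workaround:

```sql
-- Two spaces between the words: the second field is an empty string.
select split_part('hello  world moon', ' ', 2) = '' as second_field_is_empty;
-- second_field_is_empty = true

-- Possible workaround: collapse whitespace runs before splitting.
select split_part(regexp_replace('hello  world moon', E'\\s+', ' ', 'g'), ' ', 2);
-- returns 'world'
```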

select substring(my_text from $$^\S+\s+\S+$$) from v;

  substring
-------------
 hello world
 hello world
 hello world
(3 rows)

where for the purpose of demonstration, v is:

create view v as select 'hello world moon and stars' as my_text union all 
                 select 'hello world mood' union all 
                 select 'hello world';

If you want to tolerate whitespace at the beginning (note that the match, and therefore the result, still includes that leading whitespace):

select substring(my_text from $$^\s*\S+\s+\S+$$) from v;
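
To drop the leading whitespace from the result as well, one option is a capturing group: when the pattern contains parentheses, substring() returns the text matched by the first parenthesized subexpression. A sketch with a literal input chosen for illustration:

```sql
-- The group captures only the two words, not the leading spaces.
select substring('   hello world moon' from $$^\s*(\S+\s+\S+)$$);
-- returns 'hello world'
```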
