Nested regular expression replacement in python
I'm migrating SQL from one platform to another. The SQL contains DECODE statements, which are not supported by my target platform.
I'm using regular expressions to translate the decode statements to case statements but I'm failing at nested decodes:
import re
sql_frag = "select decode(dim1,'','-',dim1) as myfield1, decode(dim2,'','-',dim2') as myfield2"
reg=re.compile("(decode\((.*?),(.*?),(.*?),(.*?)\))",re.IGNORECASE)
matches = reg.findall(sql_frag)
for match in matches:
sql_frag = sql_frag.replace(match[0],'case when %s = %s then %s else %s end' % (match[1],match[2],match[3],match[4]))
will match all occurrences of decode in
sele开发者_开发百科ct decode(dim1,'','-',dim1) as myfield1, decode(dim2,'','-',dim2') as myfield2
and will replace with case statements:
select case when dim1 = '' then '-' else dim1 end as myfield1, case when dim2 = '' then '-' else dim2' end as myfield2
But the code trips up on nested decode statements:
sql_frag="select decode(f1,3,4,decode(f2,5,4,f2)) as myfield, decode(foo,bar,baz,foo) as myfield2"
>>> reg.findall(sql_frag)
[('decode(f1,3,4,decode(f2,5,4,f2)', 'f1', '3', '4', 'decode(f2,5,4,f2'), ('decode(foo,bar,baz,foo)', 'foo', 'bar', 'baz', 'foo')
and returns
select case when f1 = 3 then 4 else decode(f2,5,4,f2 end) as myfield, case when foo = bar then baz else foo end as myfield2
Is there a way to process the innermost decode before the others so that I can cleanly replace all of the decode statements with case statements?
Is there a way to process the innermost decode before the others so that I can cleanly replace all of the decode statements with case statements?
Yes.
Use ((?![^)]*decode).*?)
in place of any (.*?)
where a nested DECODE
can legally occur. Run the regex in a loop.
Be aware that this can fail if there are strings (i.e. "THEN
values") that contain the word "decode". If you know your data and can rule out this case, the regex approach above will be good-enough for your one-off use case.
精彩评论