Regex for non ANSI-style joins?
One of our rules is that all database joins have to be ANSI-style. As part of our build process I would like to scan all joins committed to source contro开发者_C百科l for violations.
I got a partial answer which I will post below but I'm sure its missing something and there must be a better one.
Here is a non-comprehensive list of examples
Should match:
SELECT *
FROM cats, owners, dogs
WHERE cats.owner = onwers.id and owners.id += dogs.owner;
SELECT * FROM cats c, owners o WHERE c.owner = o.id;
Should not match:
SELECT *
FROM owners left outer join cats on (owners.id = cats.owner);
SELECT *
FROM owners inner join cats on (cats.owners = GetOnersOfTabbies(param1, parm2))
Well, here I go:
FROM [\s\S]*?,[\s\S]*?WHERE
Will match a ,
between FROM
and WHERE
clauses even inside a multi line query.
[\s\S]*?
means: take smallest possible match for whitespace characters (including line-breaks) and non-whitespaces; this is also known as "non-greedy pattern".
Here's my regex:
from\s+\w+(\s+)?(\w+)?,(\s+)?\w+
Ruben's answer didn't work for me in my regex tester.
\s == space character
\s+ == 1+ space character(s)
(\s+)? == Within the brackets is optional, reqires a "?" immediately after the ")"
\w == word character, alphanumeric
To catch SELECT * FROM (cats c, owners o) WHERE c.owner = o.id;
, use:
from\s+(\()?\w+(\s+)?(\w+)?,(\s+)?\w+
You need to escape the bracket, using the \
character:
(\()? == '(' character
Here is my attempt:
FROM (\s*\S+\s*(\w\s*)*,)+
Should pick up the first comma in a from clause - these are necessary for old style joins.
精彩评论