Regex for a simplified HQL syntax
I intended to provide my view/business layer with the possibility to send HQL queries in strings to my data layer.
However, the data layer needs to analyze and manipulate these queries (in particular, add a criterion in the where clause).
The supported forms of HQL queries are any combination of the following:
from ...
where ...
order by ...
I think this kind of simplified HQL query should be regex-able, and this is the Regex I defined:
public const string HqlRegex = @"^(\bfrom\b\s*(?<FromPart>.+)\s*)?"
+ @"(\bwhere\b\s*(?<WherePart>.+)\s*)?"
+ @"(\border\b\s+\bby\b\s*(?<OrderByPart>.+))?$";
开发者_如何转开发
Regex re = new Regex(Novartis.MapAdmeBs.NHibernateDAO.DAOFactory.HqlRegex,
RegexOptions.Singleline);
Update: I've even tried with the non-greedy modifier:
public const string HqlRegex = @"^(\bfrom\b\s*(?<FromPart>.+?)\s*)?"
+ @"(\bwhere\b\s*(?<WherePart>.+?)\s*)?"
+ @"(\border\b\s+\bby\b\s*(?<OrderByPart>.+?))?$";
However, when I try to match a string containing the where and the order by clause, the "order by" keyword is regarded as part of the where clause:
Match m = re.Match("where (e.Name not like '%X') and e.StatusID not in (7, 8, 9)"
+ " order by e.DateCreated desc");
Console.WriteLine(m.Groups["WherePart"].Value);
gives
(e.Name not like '%X') and e.StatusID not in (7, 8, 9) order by e.DateCreated desc
Any of the following help is appreciated:
- How to fix the regexp?
- Is there a regexp for HQL? (Googling leads to regexp-features of the HQL language)
- Better idea which is still simple enough to implement in a day or less?
There isn't a regular expression for HQL because HQL isn't a regular language.
To make a quick fix you could use the non-greedy modifier ?
for each .+
:
string hqlRegex = @"^(\bfrom\b\s*(?<FromPart>.+?)\s*)?"
+ @"(\bwhere\b\s*(?<WherePart>.+?)\s*)?"
+ @"(\border\b\s+\bby\b\s*(?<OrderByPart>.+?))?$";
Regex re = new Regex(hqlRegex);
Match m = re.Match("where (e.Name not like '%X') and e.StatusID not in (7, 8, 9)"
+ " order by e.DateCreated desc");
string wherePart = m.Groups["WherePart"].Value;
Console.WriteLine(wherePart);
Result:
(e.Name not like '%X') and e.StatusID not in (7, 8, 9)
I'd also like to mention that it is pointless to include the a word-boundary in for example order\b\s+\bby
. There must always be a word boundary between r and a whitespace character so there is no need to specify it.
If you wish to do this with ANTLR then you could look at the article Working with ANTLR: HQL Grammar.
I'm not sure the regex are the best tool for that. I'd rather try ANTLR and define a small grammar to parse your HQL-lite queries.
I would not try to use Regex for anything as complex as HQL because HQL isn't a regular language.
See When not to use Regex in C# (or Java, C++, etc.)
As you can in control of both ends, I would consider representing your query as a tree of objects, that you then convert to/from xml or json. That way you will not have to write any string parsing code yourself.
Otherwise, someone somewhere must have written a parser for HQL queries, look at the source code for any open source projects that use HQL.
精彩评论