Best way to split a string by word (SQL Batch separator)
I have a class that splits a string of SQL commands on a batch separator (e.g. "GO") into a list of SQL commands that are then run in turn.
...
private static IEnumerable<string> SplitByBatchIndecator(string script, string batchIndicator)
{
    // Match the batch indicator only when it sits alone on a line (Multiline makes ^ and $ line anchors).
    string pattern = string.Concat("^\\s*", batchIndicator, "\\s*$");
    RegexOptions options = RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Multiline;
    foreach (string batch in Regex.Split(script, pattern, options))
    {
        yield return batch.Trim();
    }
}
My current implementation uses a Regex with yield, but I am not sure if it's the "best" way.
- It should be quick
- It should handle large strings (I have some scripts that are 10 MB in size, for example)
- The hardest part (that the above code currently does not do) is to take quoted text into account
Currently the following SQL will incorrectly get split:
var batch = QueryBatch.Parse(@"-- issue...
insert into table (name, desc)
values('foo', 'if the
go
is on a line by itself we have a problem...')");
Assert.That(batch.Queries.Count, Is.EqualTo(1), "This fails for now...");
I have thought about a token-based parser that tracks the state of open and closed quotes, but I am not sure whether a Regex can do it.
Any ideas!?
You can track the opening and closing quotes using a Balancing Group Definition.
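As a rough illustration of the regex route, here is a minimal sketch. Note that it does not use a balancing group: single-quoted SQL literals do not nest, so a plain alternation that consumes each quoted literal whole before it looks for a separator line may be enough. The class name RegexBatchSplitter, the hard-coded "GO" separator, and the choice to drop empty batches are assumptions of this example, not part of the original code. It also ignores comments and bracketed identifiers, so an apostrophe inside a "--" comment would confuse it.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexBatchSplitter
{
    // Alternation order matters: a quoted literal ('' is an escaped quote) is
    // consumed whole first, so a GO line inside it never reaches the "sep" branch.
    private static readonly Regex Token = new Regex(
        @"'(?:[^']|'')*'|(?<sep>^[ \t]*GO[ \t]*\r?$)",
        RegexOptions.IgnoreCase | RegexOptions.Multiline);

    public static List<string> Split(string script)
    {
        var batches = new List<string>();
        int start = 0;
        foreach (Match m in Token.Matches(script))
        {
            // A match without the "sep" group is a string literal: skip it.
            if (!m.Groups["sep"].Success) continue;
            AddBatch(batches, script.Substring(start, m.Index - start));
            start = m.Index + m.Length;
        }
        AddBatch(batches, script.Substring(start));
        return batches;
    }

    // Assumption of this sketch: whitespace-only batches are dropped.
    private static void AddBatch(List<string> batches, string text)
    {
        string trimmed = text.Trim();
        if (trimmed.Length > 0) batches.Add(trimmed);
    }
}
```

With this sketch, the failing script from the question should come back as a single batch, because the "go" line sits inside a quoted literal.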
Also, a similar question was asked last year about splitting on whitespace as long as the whitespace wasn't contained in quotes. You might be able to adjust those answers to get where you're going.
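For the token-based idea mentioned in the question, a line-by-line scanner that remembers whether it is inside a quoted literal might look like the sketch below. The class name LineBatchSplitter and the TextReader-based signature are hypothetical; reading from a TextReader is just one way to avoid running a regex over a whole 10 MB string. It assumes single quotes are the only quoting that matters, drops empty batches, and rejoins lines with "\n", so line endings inside multi-line literals are normalized.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class LineBatchSplitter
{
    public static IEnumerable<string> Split(TextReader reader, string batchIndicator = "GO")
    {
        var current = new StringBuilder();
        bool inQuote = false; // true while a '...' literal is still open
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // A separator only counts when we are not inside a quoted literal.
            if (!inQuote && string.Equals(line.Trim(), batchIndicator,
                                          StringComparison.OrdinalIgnoreCase))
            {
                string batch = current.ToString().Trim();
                current.Clear();
                if (batch.Length > 0) yield return batch;
                continue;
            }

            // Toggle on every apostrophe; an escaped quote ('') toggles twice,
            // which leaves the state unchanged.
            foreach (char c in line)
            {
                if (c == '\'') inQuote = !inQuote;
            }
            current.Append(line).Append('\n');
        }

        string last = current.ToString().Trim();
        if (last.Length > 0) yield return last;
    }
}
```

Usage would be along the lines of LineBatchSplitter.Split(new StringReader(script)) or passing a StreamReader over the script file. Comments containing apostrophes are not handled here and would need an extra state.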