Best way to split a string by word (SQL Batch separator)

I have a class that splits a string of SQL commands on a batch separator (e.g. "GO") into a list of SQL commands, which are then run in turn.

...
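private static IEnumerable<string> SplitByBatchIndecator(string script, string batchIndicator)
{
    // Match a line that holds only the batch indicator (optionally surrounded by whitespace).
    // Note: batchIndicator is inserted into the pattern unescaped.
    string pattern = string.Concat("^\\s*", batchIndicator, "\\s*$");
    RegexOptions options = RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Multiline;
    // Split on every separator line and yield each batch lazily.
    foreach (string batch in Regex.Split(script, pattern, options))
    {
        yield return batch.Trim();
    }
}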

My current implementation uses a Regex with yield but I am not sure if it's the "best" way.

  • It should be quick
  • It should handle large strings (I have some scripts that are 10 MB in size, for example)
  • The hardest part (that the above code currently does not do) is to take quoted text into account

Currently the following SQL will incorrectly get split:

var batch = QueryBatch.Parse(@"-- issue...
insert into table (name, desc)
values('foo', 'if the
go
is on a line by itself we have a problem...')");

Assert.That(batch.Queries.Count, Is.EqualTo(1), "This fails for now...");

I have thought about a token-based parser that tracks whether a quote is currently open or closed, but I am not sure whether a Regex can do that.

Any ideas!?


You can track the opening and closing quotes using a Balancing Group Definition.
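As a rough illustration of keeping quotes out of the split with a regex, here is a minimal sketch. Note that it does not use balancing groups. It uses a simpler alternation instead: the pattern first tries to consume a whole '...' literal, and only then a separator line, so a "go" inside a literal is swallowed by the first alternative. The class name RegexBatchSplitter and the group name sep are made up for this example, and it assumes T-SQL style single-quoted strings with '' as the escape. Comments and [bracketed] identifiers that contain an apostrophe are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original class.
public static class RegexBatchSplitter
{
    public static List<string> Split(string script, string separator = "GO")
    {
        // Alternative 1: a single-quoted literal (an escaped '' simply reads as two adjacent literals).
        // Alternative 2: a line holding only the separator, captured in the named group "sep".
        string pattern = @"'[^']*'|^[ \t]*(?<sep>" + Regex.Escape(separator) + @")[ \t]*\r?$";
        RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.Multiline;

        var batches = new List<string>();
        int start = 0;
        foreach (Match m in Regex.Matches(script, pattern, options))
        {
            // A match without "sep" is a string literal: skip it, it belongs to the batch.
            if (!m.Groups["sep"].Success) continue;

            Add(batches, script.Substring(start, m.Index - start));
            start = m.Index + m.Length;
        }
        Add(batches, script.Substring(start));
        return batches;
    }

    // Unlike Regex.Split in the question, empty batches are dropped here (an assumption).
    private static void Add(List<string> batches, string text)
    {
        text = text.Trim();
        if (text.Length > 0) batches.Add(text);
    }
}
```

With the failing script from the question, RegexBatchSplitter.Split(script) should return a single batch, because the "go" line sits inside the second quoted literal.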

Also, a similar question was asked last year about splitting on whitespace as long as the whitespace wasn't contained in quotes. You might be able to adjust those answers to get where you're going.
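If a regex turns out to be awkward, the quote-tracking idea from the question can also be written as a plain line scanner. The following is only a sketch under a few assumptions: the class name BatchSplitter is hypothetical, strings are delimited by single quotes with '' as the escape, and apostrophes inside SQL comments or [bracketed] identifiers are not handled (they would flip the quote state incorrectly). It also drops empty batches, which the Regex.Split version does not.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of the original class.
public static class BatchSplitter
{
    public static IEnumerable<string> Split(string script, string separator = "GO")
    {
        var batch = new StringBuilder();
        bool inQuote = false; // true while we are inside a '...' literal
        int pos = 0;

        while (pos < script.Length)
        {
            // Take the next line, including its trailing '\n' if there is one.
            int newline = script.IndexOf('\n', pos);
            int end = newline < 0 ? script.Length : newline + 1;
            string line = script.Substring(pos, end - pos);
            pos = end;

            // A separator only counts when the line starts outside a string literal.
            if (!inQuote && line.Trim().Equals(separator, StringComparison.OrdinalIgnoreCase))
            {
                string done = batch.ToString().Trim();
                batch.Clear();
                if (done.Length > 0) yield return done;
                continue;
            }

            batch.Append(line);

            // Each quote toggles the state; an escaped '' toggles twice and so cancels out.
            foreach (char c in line)
            {
                if (c == '\'') inQuote = !inQuote;
            }
        }

        string last = batch.ToString().Trim();
        if (last.Length > 0) yield return last;
    }
}
```

Because it yields batches as it goes and makes a single pass over the input, this shape should cope with multi-megabyte scripts, though I have not benchmarked it against the Regex version.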
