How can I stop this function from splitting prematurely?
I've written this function...
internal static IEnumerable<KeyValuePair<char?, string>> SplitUnescaped(this string input, char[] separators)
{
    int index = 0;
    var state = new Stack<char>();
    for (int i = 0; i < input.Length; ++i)
    {
        char c = input[i];
        char s = state.Count > 0 ? state.Peek() : default(char);
        if (state.Count > 0 && (s == '\\' || (s == '[' && c == ']') || ((s == '"' || s == '\'') && c == s)))
            state.Pop();
        else if (c == '\\' || c == '[' || c == '"' || c == '\'')
            state.Push(c);
        if (state.Count == 0 && separators.Contains(c))
        {
            yield return new KeyValuePair<char?, string>(c, input.Substring(index, i - index));
            index = i + 1;
        }
    }
    yield return new KeyValuePair<char?, string>(null, input.Substring(index));
}
Which splits a string on the given separators, as long as they aren't escaped, in quotes, or in brackets. Seems to work pretty well, but there's one problem with it.
The characters I want to split on include a space:
{ '>', '+', '~', ' ' };
So, given the string
a > b
I want it to split on > and ignore the spaces, but given
a b
I do want it to split on the space.
How can I fix the function?
You could continue to split on both the space and > and then remove the strings that are empty.
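As a rough sketch of that idea: simply filtering out the empty strings would also throw away the > separator that is attached to an empty piece, so the pass below merges separators while it drops the empties. DropEmptyPieces and SeparatorCleanup are hypothetical names, not part of the original function, and the hard-coded list is what the function from the question would yield for "a > b". It assumes that a space is the "weak" separator and that any other separator next to it should win.

```csharp
using System;
using System.Collections.Generic;

static class SeparatorCleanup
{
    // Hypothetical helper: drops empty pieces and lets a non-space separator
    // (or the end-of-input null) override a space on the preceding piece.
    internal static IEnumerable<KeyValuePair<char?, string>> DropEmptyPieces(
        this IEnumerable<KeyValuePair<char?, string>> pieces)
    {
        bool hasPending = false;
        string text = null;
        char? separator = null;

        foreach (var piece in pieces)
        {
            if (piece.Value.Length > 0)
            {
                // A real piece: emit the one we were holding, then hold this one.
                if (hasPending)
                    yield return new KeyValuePair<char?, string>(separator, text);
                text = piece.Value;
                separator = piece.Key;
                hasPending = true;
            }
            else if (hasPending && piece.Key != ' ')
            {
                // An empty piece: keep only its separator, unless it is a space.
                separator = piece.Key;
            }
        }

        if (hasPending)
            yield return new KeyValuePair<char?, string>(separator, text);
    }

    static void Main()
    {
        // Pieces the original SplitUnescaped produces for "a > b".
        var pieces = new List<KeyValuePair<char?, string>>
        {
            new KeyValuePair<char?, string>(' ', "a"),
            new KeyValuePair<char?, string>('>', ""),
            new KeyValuePair<char?, string>(' ', ""),
            new KeyValuePair<char?, string>(null, "b"),
        };

        foreach (var p in pieces.DropEmptyPieces())
            Console.WriteLine($"[{p.Value}] followed by [{p.Key}]");
    }
}
```

With the real function this would be called as input.SplitUnescaped(separators).DropEmptyPieces(). If two non-space separators appear back to back, the later one wins in this sketch, which may or may not be what you want.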
I think this does it...
// Needs System.Linq for separators.Contains(...). Assumes ' ' is one of the separators.
internal static IEnumerable<KeyValuePair<char?, string>> SplitUnescaped(this string input, char[] separators)
{
    int startIndex = 0;
    var state = new Stack<char>();
    // After trimming, the input can neither start nor end with a separator.
    input = input.Trim(separators);
    for (int i = 0; i < input.Length; ++i)
    {
        char c = input[i];
        char s = state.Count > 0 ? state.Peek() : default(char);
        if (state.Count > 0 && (s == '\\' || (s == '[' && c == ']') || ((s == '"' || s == '\'') && c == s)))
            state.Pop();
        else if (c == '\\' || c == '[' || c == '"' || c == '\'')
            state.Push(c);
        else if (state.Count == 0 && separators.Contains(c))
        {
            int endIndex = i;
            // Skip spaces that are followed by another separator, so "a > b" reports '>'.
            while (input[i] == ' ' && separators.Contains(input[i + 1])) { ++i; }
            yield return new KeyValuePair<char?, string>(input[i], input.Substring(startIndex, endIndex - startIndex));
            // Skip the spaces after the separator.
            while (input[++i] == ' ') { }
            startIndex = i;
            // Step back so the loop's ++i revisits startIndex; otherwise an opening
            // quote, bracket, or backslash at that position is never pushed.
            --i;
        }
    }
    yield return new KeyValuePair<char?, string>(null, input.Substring(startIndex));
}
Before this I was trying to push the space onto the stack as well and then check against that, but I think this is easier.