Tokenize a string using strings as delimiters

If I have a string like

"This is a string that will be split by this and that"

I would like to get the split results as

  1. "is a string that will be split by"
  2. "and that"
  3. "this is a string"
  4. "will be split by this and"

1 and 2 are split by "this"; 3 and 4 are split by "that".

My current solution uses a map of string to string and stores the results in another map of the same type (string to string). However, for longer and more complex text, the stored results overlap: in the example above, results 1 and 3 both contain the substring "is a string", and this redundancy produces incorrect statistical results.

Could you suggest a neater way to tokenize a long string when the delimiters are themselves different long strings?


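string myString = "This is a string that will be split by this and that";
// Upper-case the input so the match is case-insensitive; the tokens come back upper-cased too.
string foo = myString.ToUpper();

// Split on each delimiter separately.
string[] byThis = foo.Split(new string[] { "THIS" }, StringSplitOptions.RemoveEmptyEntries);
string[] byThat = foo.Split(new string[] { "THAT" }, StringSplitOptions.RemoveEmptyEntries);

// Split on both delimiters in one pass, so the resulting tokens do not overlap.
string[] all = foo.Split(new string[] { "THAT", "THIS" }, StringSplitOptions.RemoveEmptyEntries);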

Or you can use Regex for that:

string[] all = System.Text.RegularExpressions.Regex.Split(myString, "your pattern", System.Text.RegularExpressions.RegexOptions.IgnoreCase);
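As a rough sketch of what "your pattern" could look like for this example, the delimiters can be joined into a single alternation so the text is split once and no substring is repeated across results. The class name PhraseTokenizer and the method SplitByPhrases are hypothetical names used only for illustration. The sketch also assumes every delimiter begins and ends with a word character, since it uses \b to avoid matching "this" inside a longer word.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class PhraseTokenizer
{
    // Splits text on any of the given delimiter phrases, ignoring case.
    public static string[] SplitByPhrases(string text, params string[] phrases)
    {
        // Escape each phrase so regex metacharacters are matched literally.
        // Assumption: phrases start and end with word characters, so \b applies.
        string alternation = string.Join("|", phrases.Select(p => Regex.Escape(p)));
        string pattern = @"\b(?:" + alternation + @")\b";

        return Regex.Split(text, pattern, RegexOptions.IgnoreCase)
                    .Select(piece => piece.Trim())      // drop surrounding spaces
                    .Where(piece => piece.Length > 0)   // drop empty pieces
                    .ToArray();
    }

    public static void Main()
    {
        string myString = "This is a string that will be split by this and that";
        foreach (string token in SplitByPhrases(myString, "this", "that"))
        {
            Console.WriteLine(token);
        }
        // Prints:
        // is a string
        // will be split by
        // and
    }
}
```

Unlike the ToUpper approach, this keeps the original casing of the tokens. Because every delimiter is handled in one pass, each piece of text appears in exactly one token, which should avoid the double counting described in the question.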