Splitting sentences using regex
I would like to split a text (using a regex) at a dot followed by a whitespace or a dot followed by a newline (\n).
I'm working with C# .NET.
Appreciate your answers!
using System.Text.RegularExpressions;
// The verbatim string (@"...") keeps the backslashes intact for the regex engine.
string[] parts = Regex.Split(mytext, @"\.\n|\. ");
// or @"\.\s" if you're not picky about it matching tabs, etc.
The regular expression
/\.\s/
will match a literal . followed by whitespace.
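As a rough illustration of how that pattern behaves in C#, here is a small sketch. The slashes around the pattern are delimiter notation from other languages and are not part of a .NET pattern string. The class name `SentenceSplitDemo`, the method names, and the sample text are made up for this example, and both the `\s+` quantifier and the lookbehind variant are additions that the answer does not mention.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SentenceSplitDemo
{
    // Splits on a literal dot followed by one or more whitespace characters.
    // The dot and the whitespace are consumed, so they do not appear in the results.
    public static string[] SplitDroppingDot(string text)
    {
        return Regex.Split(text, @"\.\s+");
    }

    // Splits on whitespace that is preceded by a dot. The lookbehind (?<=\.)
    // matches without consuming, so each piece keeps its trailing dot.
    public static string[] SplitKeepingDot(string text)
    {
        return Regex.Split(text, @"(?<=\.)\s+");
    }

    public static void Main()
    {
        // Hypothetical sample mixing a space, "\n" and "\r\n" after the dots.
        string sample = "First one. Second one.\nThird one.\r\nFourth one";
        foreach (string part in SplitKeepingDot(sample))
        {
            Console.WriteLine(part);
        }
    }
}
```

Using `\s+` rather than a single `\s` means a Windows line ending ("\r\n") is consumed as a whole instead of leaving a stray "\n" at the start of the next piece. Neither variant knows about abbreviations, so text such as "e.g. this" is also split after the dot.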
You don't need a regular expression for that. Just use the overload of string.Split that takes an array of strings:
string[] splitters = new string[] { ". ", ".\t", "." + Environment.NewLine };
string[] sentences = aText.Split(splitters, StringSplitOptions.None);
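One thing to watch with this approach: `Environment.NewLine` is "\r\n" on Windows and "\n" on Unix-like systems, so text whose line endings differ from the platform's may not split at line breaks. A minimal sketch that lists both line-ending forms explicitly might look like the following. The class name `SplitWithoutRegex`, the method name, and the sample text are hypothetical, and the use of `StringSplitOptions.RemoveEmptyEntries` is an assumption about what is wanted rather than part of the answer.

```csharp
using System;

public static class SplitWithoutRegex
{
    // Separators are listed explicitly so the result does not depend on
    // the platform value of Environment.NewLine.
    private static readonly string[] Splitters = { ". ", ".\t", ".\r\n", ".\n" };

    public static string[] SplitSentences(string text)
    {
        // RemoveEmptyEntries drops the empty strings produced when the text
        // ends with a separator or when two separators are adjacent.
        return text.Split(Splitters, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        // Hypothetical sample mixing a space, "\n" and "\r\n" after the dots.
        string sample = "First one. Second one.\nThird one.\r\nFourth one.";
        foreach (string sentence in SplitSentences(sample))
        {
            Console.WriteLine(sentence);
        }
    }
}
```

The separators, including the dot, are removed from the results, while a final sentence with no whitespace after it keeps its dot.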