开发者

C# RegEx on a StreamReader will not return matches

I'm writing myself a simple screen scraping applicati开发者_JS百科on to play around with the HTMLAgilityPack library, and after getting it to work on several different types of HtmlNodes, I figured I'd get fancy and throw in a Regex for Email addresses as well. The only problem is that the application never finds any matches, or maybe it is but not returning properly. This takes place even on sites known to contain email addresses. Can anyone spot what I'm doing wrong here?

      string url = String.Format("http://{0}", mainForm.Target);
      string reg = "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+.[A-Z]{2,4}\b";
      try
            {
                WebClient wClient = new WebClient();
                Stream data = wClient.OpenRead(url);
                StreamReader read = new StreamReader(data);
                MatchCollection matches = Regex.Matches(read.ReadToEnd(), reg, RegexOptions.IgnoreCase|RegexOptions.Multiline);
                foreach (Match match in matches)
                {
                    textBox1.AppendText(match.ToString() + Environment.NewLine);
                }


Use raw strings:

string reg = @"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b";

Without that, \b becomes backspace. Also, your last period should be \., so it only matches a literal period.


Check the string that is returned by read.ReadToEnd() and see if you can find email addresses in this string with your regex. I guess that your problem doesn't have anything to do with StreamReader.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜