Get URL from string [duplicate]
Possible Duplicate:
Get a URL from a String
Hi, I'm trying to extract a URL from a string using a regex. The string is something like "lorem ipsum baby www.test.com lorem", "lorem ipsum http://www.test.com foo bar", or "lorem www.test.com" with no trailing whitespace.
Using
MatchCollection ms = Regex.Matches(adress, @"(www.+|http.+)([\s]|$)");
returns the entire string. Could any regex guru help me out on this one?
Edit:
Solved it this way:
MatchCollection mc = Regex.Matches(adress, @"(www[^ \s]+|http[^ \s]+)([\s]|$)", RegexOptions.IgnoreCase);
adress = mc[0].Value;
WebBrowserTask task = new WebBrowserTask();
task.URL = adress;
task.Show();
Thank you all for your help! :)
I think we are missing the obvious here: there is actually nothing wrong with this code.
Perhaps the OP is not reading the match's Value correctly.
string adress = "hello www.google.ca";
// Run the original pattern and read the text of the first match
MatchCollection ms = Regex.Matches(adress, @"(www.+|http.+)([\s]|$)");
string testMatch = ms[0].Value;
testMatch only contains "www.google.ca"
Isn't this what you intended, newa?
Try something like this:
string txt = "lorem ipsum baby http://www.google.com/";
Regex regx = new Regex("http://([\\w+?\\.\\w+])+([a-zA-Z0-9\\~\\!\\@\\#\\$\\%\\^\\&\\*\\(\\)_\\-\\=\\+\\\\\\/\\?\\.\\:\\;\\'\\,]*)?",
RegexOptions.IgnoreCase);
MatchCollection ms = regx.Matches(txt);
I think the problem is that the "." metacharacter matches any character, including the spaces you want the capture to stop at, and ".+" is greedy, so it runs on to the end of the string. If you change ".+" to "[^ ]+", or make the quantifier lazy (non-greedy) by writing ".+?", you should get the answer you want. (A "?:" just inside the opening parenthesis would only make the group non-capturing; it does not affect greediness.)
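To illustrate the greedy-matching point, here is a minimal sketch that runs three variants of the question's pattern against one of the sample strings. The class name GreedyDemo and the helper FirstGroup are made up for this example, and using \S in place of [^ ] is my own substitution rather than something from the thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyDemo
{
    // Hypothetical helper: returns the text captured by group 1 of the
    // first match, or an empty string when the pattern does not match.
    public static string FirstGroup(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        string input = "lorem ipsum baby www.test.com lorem";

        // Greedy: .+ runs to the end of the input, so the capture includes " lorem".
        Console.WriteLine(FirstGroup(input, @"(www.+|http.+)(\s|$)"));

        // Negated class: \S+ cannot cross whitespace, so it stops after "com".
        Console.WriteLine(FirstGroup(input, @"(www\S+|http\S+)(\s|$)"));

        // Lazy: .+? stops at the first position where (\s|$) can match.
        Console.WriteLine(FirstGroup(input, @"(www.+?|http.+?)(\s|$)"));
    }
}
```

The first line printed should be "www.test.com lorem" and the other two "www.test.com". The sketch reads Groups[1].Value rather than Value because the whole match also includes any whitespace character consumed by the second group.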