RegExp help for converting hyperlinks
I am trying to write a regexp that converts non-hyperlinked addresses into hyperlinks. I have tried many combinations and searched for a solution, without success.
For example, given this input:
http://twitpic.com/abcdef http://www.smh.com.au askjhsd www.hotmail.com ks sd
<a href="http://www.aaaaaaaa.com">aaaaaaaa</a>
I want http://twitpic.com/abcdef, http://www.smh.com.au and www.hotmail.com to be picked up, but not http://www.aaaaaaaa.com, as it is already wrapped in an <a> tag.
I am currently using this regexp in C#
return Regex.Replace(input, @"(\b((http|https)://|www\.)[^ ]+\b)",
@" <a href=""$0"" target=""_blank"">$0</a>", RegexOptions.IgnoreCase);
I have no idea how to make it exclude URLs that are already wrapped in an <a> or <img> tag.
Help :)
EDIT
For those reading this later, this is the final solution I came up with
// Requires: using System.Text.RegularExpressions;

/// <summary>
/// Wraps bare URLs in the input string in anchor tags with target=_blank,
/// skipping URLs that already sit inside an a or img tag.
/// </summary>
public static string ConvertURLsToHyperlinks(string input)
{
    if (!string.IsNullOrEmpty(input))
    {
        var reg = new Regex(@"(?<!<\s*(?:a|img)\b[^<]*)(\b((http|https)://|www\.)[^ ]+\b)");
        return reg.Replace(input, new MatchEvaluator(ConvertUrlsMatchDelegate));
    }
    return input;
}

public static string ConvertUrlsMatchDelegate(Match m)
{
    // add in additional http:// in front of the www. for the hyperlinks
    var additional = "";
    if (m.Value.StartsWith("www."))
    {
        additional = "http://";
    }
    return "<a href=\"" + additional + m.Value + "\" target=\"_blank\">" + m.Value + "</a>";
}
You could use
@"(?<!<\s*(?:a|img)\b[^<]*)(\b((http|https)://|www\.)[^ ]+\b)"
as your regex. The negative lookbehind assertion at the start rejects any URL that is preceded by an opening <a or <img with no other < in between.
The lookbehind assertion explained:
(?<!            # Assert that it's impossible to match before the current position:...
    <           # a <
    \s*         # optional whitespace
    (?:a|img)   # a or img
    \b          # as an entire word
    [^<]*       # followed by any number of characters except <
)               # end of lookbehind
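To see the lookbehind at work, here is a minimal sketch that runs the pattern over the sample input from the question. The class name LinkifyDemo and the method name Linkify are made up for illustration, and adding RegexOptions.IgnoreCase (as in the question's original call) is an assumption rather than part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names are not from the original post.
public static class LinkifyDemo
{
    // Pattern from the answer. IgnoreCase is an assumption carried over
    // from the question's first attempt.
    private static readonly Regex UrlPattern = new Regex(
        @"(?<!<\s*(?:a|img)\b[^<]*)(\b((http|https)://|www\.)[^ ]+\b)",
        RegexOptions.IgnoreCase);

    public static string Linkify(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return input;
        }

        return UrlPattern.Replace(input, m =>
        {
            // Bare www. addresses need a scheme to work as an href.
            string prefix = m.Value.StartsWith("www.", StringComparison.OrdinalIgnoreCase)
                ? "http://"
                : "";
            return "<a href=\"" + prefix + m.Value + "\" target=\"_blank\">" + m.Value + "</a>";
        });
    }

    public static void Main()
    {
        string sample =
            "http://twitpic.com/abcdef http://www.smh.com.au askjhsd www.hotmail.com ks sd "
            + "<a href=\"http://www.aaaaaaaa.com\">aaaaaaaa</a>";

        // The three bare addresses get wrapped; the existing anchor is left alone.
        Console.WriteLine(Linkify(sample));
    }
}
```

One thing to keep in mind: because [^<]* also runs across a closing >, a bare URL that follows an <img ...> tag with no other < in between would be skipped as well, so this is probably best treated as a heuristic rather than a full HTML-aware solution.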