What's my best bet for replacing plain text links with anchor tags in a string? .NET
What is my best option for converting plain text links within a string into anchor tags?
Say, for example, I have "I went and searched on http://www.google.com/ today". I would want to change that to "I went and searched on <a href="http://www.google.com/">http://www.google.com/</a> today".
The method also needs to be safe from XSS attacks, since the strings are user generated. The strings will already be safe before parsing, so I just need to make sure that parsing the URLs does not introduce any new vulnerabilities.
A simple regular expression could get you what you want, since you say that the strings will be safe before parsing. Just use the following method.
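// Namespaces needed by the members below.
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Matches protocol://domain followed by an optional path and query string.
private static readonly Regex urlRegex = new Regex(@"(?<Protocol>\w+):\/\/(?<Domain>[\w@][\w.:@]+)\/?[\w\.?=%&=\-@/$,]*", RegexOptions.Compiled);
// Matches e-mail addresses such as user@example.com.
private static readonly Regex emailRegex = new Regex(@"([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})", RegexOptions.Compiled);
// Protocols that are never turned into links.
private static readonly IEnumerable<string> disallowedProtocols = new[] { "javascript", "ftp" };

private static string ConvertUrls(string s) {
    // Link e-mail addresses first. "mailto:" contains no "://", so urlRegex does not match it again.
    s = emailRegex.Replace(
        s,
        match => string.Format(CultureInfo.InvariantCulture, "<a href=\"mailto:{0}\" rel=\"nofollow\">{0}</a>", match.Value)
    );
    // Then link URLs, leaving any URL with a disallowed protocol as plain text.
    s = urlRegex.Replace(
        s,
        match => {
            var protocolGroup = match.Groups["Protocol"];
            if (protocolGroup.Success && !disallowedProtocols.Contains(protocolGroup.Value, StringComparer.OrdinalIgnoreCase)) {
                return string.Format(CultureInfo.InvariantCulture, "<a href=\"{0}\" rel=\"nofollow\">{0}</a>", match.Value);
            } else {
                return match.Value;
            }
        }
    );
    return s;
}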
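If you would rather not maintain a list of forbidden protocols, one possible variation is to allow only the protocols you explicitly trust. The following is a minimal sketch of that idea, not a drop-in replacement: the class name LinkifySketch, the method name Linkify, and the choice of http and https as the only allowed schemes are my own assumptions. The URL pattern is the same as above with redundant escapes removed. It leaves out e-mail handling and, like the method above, assumes the input is already safe before parsing.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkifySketch
{
    // Assumed allowlist: only these schemes are ever turned into links.
    private static readonly HashSet<string> AllowedSchemes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "http", "https" };

    // Same URL pattern as above. Its character classes contain no quotes or
    // angle brackets, so a match cannot break out of the href attribute.
    private static readonly Regex UrlRegex = new Regex(
        @"(?<Protocol>\w+)://(?<Domain>[\w@][\w.:@]+)/?[\w.?=%&\-@/$,]*",
        RegexOptions.Compiled);

    public static string Linkify(string s)
    {
        return UrlRegex.Replace(s, match =>
        {
            // Anything that is not explicitly allowed stays plain text.
            if (!AllowedSchemes.Contains(match.Groups["Protocol"].Value))
            {
                return match.Value;
            }
            return "<a href=\"" + match.Value + "\" rel=\"nofollow\">" + match.Value + "</a>";
        });
    }
}
```

Calling LinkifySketch.Linkify("I went and searched on http://www.google.com/ today") wraps the URL in an anchor tag, while something like "javascript://alert(1)" is returned unchanged. The trade-off is that any protocol you forget to allow is simply not linked, which is usually the safer failure mode for user-generated text.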