Regex: remove some text from a hyperlink
click <a href="javascript:validate('http://www.google.com');">here</a> to open google.com
I need to replace the above sentence with the following:
click <a href="http://www.google.com">here</a> to open google.com
Please help me with the regular expression to do this in C#.
Regex regex = new Regex("href=\".+?'(.+?)'", RegexOptions.IgnoreCase);
MatchCollection matches = regex.Matches(text);
Then you'll need to extract group 1 from the match:
matches[0].Groups[1].Value
and this is your new value to assign.
Here you go:
The Regex:
(?<=href\=")(javascript:validate\('(?<URL>[^"']*)'\);)
The Code:
string url = "click <a href=\"javascript:validate('http://www.google.com');\">here</a> to open google.com";
Regex regex = new Regex("(?<=href\\=\")javascript:validate\\('(?<URL>[^\"']*)'\\);");
string output = regex.Replace(url, "${URL}");
The Output:
click <a href="http://www.google.com">here</a> to open google.com
No Regex needed:
var s =
inputString.Replace(
"javascript:validate('http://www.google.com');",
"http://www.google.com" );
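The replacement above only works when the URL is known in advance. As a rough sketch of the same no-regex idea for arbitrary URLs, the wrapper can be located with IndexOf and cut out. The names LinkCleaner and Unwrap are made up for illustration, and the code assumes the wrapper is always written exactly as javascript:validate('...'); with no extra whitespace:

```csharp
using System;

static class LinkCleaner
{
    // Hypothetical helper: removes every javascript:validate('...'); wrapper
    // and keeps only the URL that was inside it.
    public static string Unwrap(string input)
    {
        const string prefix = "javascript:validate('";
        const string suffix = "');";

        int start = input.IndexOf(prefix, StringComparison.Ordinal);
        while (start >= 0)
        {
            int urlStart = start + prefix.Length;
            int end = input.IndexOf(suffix, urlStart, StringComparison.Ordinal);
            if (end < 0) break; // unterminated wrapper: leave the rest untouched

            string url = input.Substring(urlStart, end - urlStart);
            input = input.Substring(0, start) + url + input.Substring(end + suffix.Length);

            // Continue searching after the URL that was just inserted.
            start = input.IndexOf(prefix, start + url.Length, StringComparison.Ordinal);
        }
        return input;
    }
}

class Program
{
    static void Main()
    {
        string s = "click <a href=\"javascript:validate('http://www.google.com');\">here</a> to open google.com";
        Console.WriteLine(LinkCleaner.Unwrap(s));
    }
}
```

This keeps the simplicity of String.Replace but does not depend on the URL itself, at the cost of assuming a fixed prefix and suffix.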
HtmlAgilityPack: http://htmlagilitypack.codeplex.com
This is the preferred method for parsing HTML.
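For illustration, here is a minimal sketch of what the HtmlAgilityPack approach might look like. It assumes the HtmlAgilityPack NuGet package is referenced; the class and method names (HapExample, UnwrapLinks) are invented for this example, and the wrapper is assumed to be exactly javascript:validate('...');. The serialized output may differ slightly from the input in quoting or whitespace, depending on the library version.

```csharp
using System;
using HtmlAgilityPack; // third-party NuGet package, not part of .NET itself

static class HapExample
{
    // Hypothetical helper: rewrites href attributes of the form
    // javascript:validate('URL'); to just URL.
    public static string UnwrapLinks(string html)
    {
        const string prefix = "javascript:validate('";
        const string suffix = "');";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null) return html;

        foreach (HtmlNode a in anchors)
        {
            string href = a.GetAttributeValue("href", string.Empty);
            if (href.Length >= prefix.Length + suffix.Length
                && href.StartsWith(prefix, StringComparison.Ordinal)
                && href.EndsWith(suffix, StringComparison.Ordinal))
            {
                string url = href.Substring(prefix.Length,
                                            href.Length - prefix.Length - suffix.Length);
                a.SetAttributeValue("href", url);
            }
        }
        return doc.DocumentNode.OuterHtml;
    }
}
```

Because the parser works on attributes rather than raw text, this does not care about attribute order or extra attributes on the anchor tag.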
Parsing the HTML as Austin suggested is a much more reliable way of doing this, but if you absolutely must use a regex, try something like this (adapted from the MSDN documentation for the System.Text.RegularExpressions namespace):
using System;
using System.Text.RegularExpressions;
class MyClass
{
    static void Main(string[] args)
    {
        // In a verbatim string, a double quote is escaped by doubling it ("").
        // Group 1 captures the URL between the single quotes.
        string pattern = @"<a href=""[^(]*\('([^']+)'\);"">";
        Regex r = new Regex(pattern, RegexOptions.IgnoreCase);
        string sInput = "click <a href=\"javascript:validate('http://www.google.com');\">here</a> to open google.com";
        MyClass c = new MyClass();
        // Assign the replace method to the MatchEvaluator delegate.
        MatchEvaluator myEvaluator = new MatchEvaluator(c.ReplaceCC);
        // Write out the original string.
        Console.WriteLine(sInput);
        // Replace each matched opening tag using the delegate method.
        sInput = r.Replace(sInput, myEvaluator);
        // Write out the modified string.
        Console.WriteLine(sInput);
    }
    // Rebuild the opening <a> tag using only the captured URL (group 1).
    public string ReplaceCC(Match m)
    {
        return "<a href=\"" + m.Groups[1].Value + "\">";
    }
}