Can I use variables in a Regex pattern (C#)?
I have some HTML text in which I need to replace certain words with links to them. For example, the text contains the word "PHP", and I want to replace it with <a href="glossary.html#php">PHP</a>. There are many words that I need to replace.
My code:
public struct GlossaryReplace
{
public string word; // here the words, e.g. PHP
public string link; // here the links to replace, e.g. glossary.html#php
}
public static GlossaryReplace[] Replaces = null;
IHTMLDocument2 html_doc = webBrowser1.Document.DomDocument as IHTMLDocument2;
string html_content = html_doc.body.outerHTML;
for (int i = 0; i < Replaces.Length; i++)
{
String substitution = "<a class=\"glossary\" href=\"" + Replaces[i].link + "\">" + Replaces[i].word + "</a>";
html_content = Regex.Replace(html_content, @"\b" + Replaces[i].word + "\b", substitution);
}
html_doc.body.innerHTML = html_content;
The trouble is that this does not work :( But,
html_content = Regex.Replace(html_content, @"\bPHP\b", "some replacement");
this code works well! I can't understand my error!
The @ prefix only applies to the string literal that immediately follows it, so when you concatenate strings you may have to use it on each literal.
Change this:
html_content = Regex.Replace(html_content, @"\b" + Replaces[i].word + "\b", substitution);
to:
html_content = Regex.Replace(html_content, @"\b" + Replaces[i].word + @"\b", substitution);
In a regular expression, \b means a word boundary, but in a regular string literal it means a backspace character (ASCII 8). You get a compiler error if you use an escape code that doesn't exist in a string (e.g. \s), but not in this case, as the code exists both in strings and in regular expressions.
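To make the difference concrete, here is a minimal sketch that compares the two ways of building the pattern. The class name BoundaryDemo, the helper methods, and the sample text are made up for illustration and are not part of the question's code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    // Built the way the question does it: only the first literal is verbatim,
    // so the trailing "\b" is a backspace character, not a word boundary.
    public static string BrokenPattern(string word) => @"\b" + word + "\b";

    // Both literals are verbatim, so both ends are regex word boundaries.
    public static string FixedPattern(string word) => @"\b" + word + @"\b";

    public static void Main()
    {
        string text = "I like PHP a lot"; // sample text, assumed for the demo

        Console.WriteLine("\b".Length);   // 1: a single backspace character (U+0008)
        Console.WriteLine(@"\b".Length);  // 2: a backslash followed by 'b'

        // The broken pattern only matches "PHP" followed by a literal backspace.
        Console.WriteLine(Regex.IsMatch(text, BrokenPattern("PHP"))); // False
        Console.WriteLine(Regex.IsMatch(text, FixedPattern("PHP")));  // True
    }
}
```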
On a side note, a method that is useful when creating regular expression patterns dynamically is the Regex.Escape method. It escapes characters in a string to be used in a pattern, so @"\b" + Regex.Escape(Replaces[i].word) + @"\b" would make the pattern work even if the word contains characters that have a special meaning in a regular expression.
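As a rough sketch of how that could look for one glossary entry: the class GlossaryLinker, the method Linkify, and the "ASP.NET" sample are hypothetical names chosen for the example, not the asker's code. The replacement is passed as a delegate, which is an extra precaution assumed here so that a "$" in the word or link is not read as a substitution token.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GlossaryLinker
{
    // Hypothetical helper: wraps whole-word occurrences of `word` in a glossary link.
    public static string Linkify(string html, string word, string link)
    {
        // Regex.Escape turns e.g. "ASP.NET" into "ASP\.NET", so '.' matches only a dot.
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        string substitution =
            "<a class=\"glossary\" href=\"" + link + "\">" + word + "</a>";

        // A MatchEvaluator inserts the text literally, without interpreting '$'.
        return Regex.Replace(html, pattern, m => substitution);
    }

    public static void Main()
    {
        // Without escaping, '.' matches any character, so "ASPxNET" matches too.
        Console.WriteLine(Regex.IsMatch("ASPxNET", @"\bASP.NET\b")); // True

        Console.WriteLine(
            Linkify("ASP.NET and ASPxNET", "ASP.NET", "glossary.html#aspnet"));
    }
}
```

Note that \b only matches between a word character and a non-word character, so a word that begins or ends with punctuation may still need a different boundary check.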
You forgot a @ here:
@"\b" + Replaces[i].word + "\b"
Should be:
@"\b" + Replaces[i].word + @"\b"
I'd also recommend that you use an HTML parser if you are modifying HTML. HTML Agility Pack is a useful library for this purpose.