
Using a "does not end with" regex for replacement purposes: how to avoid replacing the last character?

I'm using the following regular expression

<a href="[^/]

to find all links which do not start with a slash. I want to use the result of this regex to replace all <a href="somelink.html"> tags with something like <a href="http://mysite.com/somelink.html">.

But the problem with my regular expression is that (in the above example) the string <a href="s gets replaced instead of <a href=".

How can I fix this regular expression to avoid including the last character in my match?

I'm using the .NET Regex library for this, currently with the following code:

content = Regex.Replace(content, "(<a href=\")[^/]", "<a href=\"http://mysite.com/");

Maybe I should change something there? But I'd rather have a good regular expression if possible instead of starting to play around with SubString etc.


Don't use regex to parse HTML. Use HTML Agility Pack. It will make your life easier.

If you insist on using regex, try a negative lookahead:

<a href="(?!/)
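For what it's worth, here is a minimal sketch of how that negative lookahead could be dropped into the Regex.Replace call from the question. The class and method names (LinkRewriter, MakeAbsolute) are made up for illustration, and the base URL is just the one from the question's example. Since (?!/) is zero-width, the first character of the link is never consumed, so the replacement only has to re-emit <a href=" plus the prefix.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LinkRewriter
{
    // Assumption: base URL copied from the question's example.
    private const string BaseUrl = "http://mysite.com/";

    // (?!/) is a zero-width negative lookahead: it asserts that the next
    // character is not a slash, but does not include that character in the match.
    private static readonly Regex RelativeHref = new Regex("<a href=\"(?!/)");

    // Hypothetical helper: prefixes every href that does not start with a slash.
    public static string MakeAbsolute(string content)
    {
        return RelativeHref.Replace(content, "<a href=\"" + BaseUrl);
    }

    public static void Main()
    {
        // The relative link gets the prefix; the rooted link is left unchanged.
        Console.WriteLine(MakeAbsolute("<a href=\"somelink.html\">"));
        Console.WriteLine(MakeAbsolute("<a href=\"/rooted.html\">"));
    }
}
```

With the two sample inputs this should print <a href="http://mysite.com/somelink.html"> followed by the untouched <a href="/rooted.html">.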


If you have to use a regex, look up look-ahead assertions (or the equivalent) in the manual. In Perl the syntax is (?=pattern), so your pattern becomes

  <a href="(?=[^/])

It matches only when <a href=" is followed by a character matching [^/], and that character is not included in the match.
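.NET accepts the same (?=pattern) syntax, so this should carry over unchanged. As a rough sketch (the class and method names are hypothetical, and the base URL is the one from the question), the first method below uses the positive lookahead, and the second shows the other common fix: keep the [^/] in the match, but capture it and put it back with $1 in the replacement string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Assumption: base URL copied from the question's example.
    private const string BaseUrl = "http://mysite.com/";

    // Positive lookahead: (?=[^/]) requires a non-slash character to follow,
    // but leaves it out of the match, so nothing needs to be restored.
    public static string WithLookahead(string content)
    {
        return Regex.Replace(content, "<a href=\"(?=[^/])", "<a href=\"" + BaseUrl);
    }

    // Capture group: ([^/]) is consumed by the match, so the replacement
    // re-inserts it via the $1 substitution.
    public static string WithCapture(string content)
    {
        return Regex.Replace(content, "<a href=\"([^/])", "<a href=\"" + BaseUrl + "$1");
    }

    public static void Main()
    {
        string input = "<a href=\"somelink.html\"> <a href=\"/rooted.html\">";
        Console.WriteLine(WithLookahead(input));
        Console.WriteLine(WithCapture(input));
    }
}
```

Both calls should give the same result for this input: the first link gains the http://mysite.com/ prefix and the rooted link is left alone.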
