Regex to match anchor with # in href for .NET

I'm trying to match and replace anchor tags using a regex. What I have so far is this:

"(<a href=['\"]?([\\w_\\.]*)['\"]?)"

The problem with this approach is that it fails to capture hrefs that also have # in their value. I've tried

"(<a href=['\"]?([\\w_\\.#]*)['\"]?)"

and

"(<a href=['\"]?([\\w_\\.\\#]*)['\"]?)"

with no success.

What am I doing wrong?

Thank you


I don't think the problem is with # (it works fine for me), but with other URL characters that are missing from the character class, such as -, / and :.

How about a regex like this:

<a href=("[^"]+"|'[^']+'|[^ >]+)

Note: If the HTML is valid, prefer a DOM parser over a regex where possible.
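
To see the difference between the two patterns, here is a minimal sketch that runs the question's second pattern and the one suggested above against the same tags. The class name HrefDemo, the method names and the example.com URL are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HrefDemo
{
    // Pattern from the question: the class only allows word characters, '.' and '#'.
    public const string Narrow = "(<a href=['\"]?([\\w_\\.#]*)['\"]?)";

    // Pattern suggested above: a double-quoted, single-quoted or unquoted value.
    public const string Broad = "<a href=(\"[^\"]+\"|'[^']+'|[^ >]+)";

    // Returns what the question's pattern captured as the href value,
    // or an empty string when the tag does not match at all.
    public static string NarrowHref(string html)
    {
        Match m = Regex.Match(html, Narrow);
        return m.Success ? m.Groups[2].Value : string.Empty;
    }

    // Returns the raw href value (surrounding quotes included, if any),
    // or an empty string when the tag does not match at all.
    public static string BroadHref(string html)
    {
        Match m = Regex.Match(html, Broad);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

With a tag like <a href="page.html#top"> both patterns capture the whole value, so # itself is not the issue. With <a href="http://example.com/a-b#top"> the question's pattern stops capturing at the colon, while the broader one keeps the full quoted value.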


If you just want to replace the anchor part (the fragment after #), use string operations. They are simpler and faster.

var parts = "http://someurl.com#hashpart".Split('#');
// yields "http://someurl.com" and "hashpart" as an array.
// you may want to check that the result has a length of two
// if it does:
var newUrl = string.Format("{0}#{1}", parts[0], "some replacement for hashpart");

If your URL contains multiple hashes, use string.IndexOf and string.Substring to split at the first one.

var url = "http://someurl.com#hash#hashhash";
var hashPos = url.IndexOf('#');
// everything before the first '#': "http://someurl.com"
var urlPart = url.Substring(0, hashPos);
// everything after the first '#': "hash#hashhash"
var hashPart = url.Substring(hashPos + 1);

This should work, but I wrote it without testing it, so you may have to adjust some positions by +/- 1.
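
Putting the IndexOf/Substring idea into one helper might look like the sketch below. FragmentDemo and ReplaceFragment are hypothetical names, and appending the fragment when the URL has no # at all is my assumption about the desired behaviour.

```csharp
using System;

public static class FragmentDemo
{
    // Replaces everything after the first '#' with newFragment.
    // Assumption: if the URL contains no '#', the fragment is appended.
    public static string ReplaceFragment(string url, string newFragment)
    {
        int hashPos = url.IndexOf('#');
        string basePart = hashPos >= 0 ? url.Substring(0, hashPos) : url;
        return basePart + "#" + newFragment;
    }
}
```

Checking hashPos before calling Substring matters because IndexOf returns -1 when there is no #, and Substring(0, -1) would throw.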


<a href=(('|").+?\2|[^>]+)
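
Since the question is about replacing as well as matching, here is a rough sketch of a replace built on a backreference pattern of this kind. It uses .+? between the quotes, because a backreference such as \2 is not interpreted as one inside a character class. ReplaceDemo and SetHref are hypothetical names, and always writing the new value in double quotes is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaceDemo
{
    // Group 2 captures the opening quote; \2 requires the same quote to close the value.
    // The second alternative covers unquoted values.
    public const string Pattern = "<a href=(('|\").+?\\2|[^>]+)";

    // Rewrites the href of every matched anchor to newHref, always double-quoted.
    // A MatchEvaluator is used so that '$' in newHref is not treated as a substitution.
    public static string SetHref(string html, string newHref)
    {
        return Regex.Replace(html, Pattern, m => "<a href=\"" + newHref + "\"");
    }
}
```

Note that for unquoted values the second alternative consumes everything up to the closing >, so any other attributes in that tag would be dropped by this replace.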