Regex: adding links to HTML text

I have a puzzle that requires your help: I need to replace certain words with links in HTML text.

For example, I have to replace "word" with "<a href="...">word</a>".

There are two difficulties:

  • 1. not adding links inside tag attributes
  • 2. not adding links inside other links (nested links).

I found a solution for case (1), but I cannot handle case (2).

Here is my simplified code:

String text="sample text <a>sample text</a> sample <a href='http://www.sample.com'>a good sample</a>";
String wordToReplace="sample";
String pattern="\\b"+wordToReplace+"\\b(?![^<>]*+>)"; //the last part is here to solve problem (1)
String link="["+wordToReplace+"]"; //for more clarity, the generated link is replaced by [...]

System.out.println(text.replaceAll(pattern,link));

The result is:

[sample] text <a>[sample] text</a> [sample] <a href='http://www.sample.com'>a good [sample]</a>

Problem: the word is turned into a link even when it is already inside another link.

Do you have any idea how to solve this problem?

Thank you in advance


Parsing HTML with regex is always a bad idea, precisely because of odd cases such as this. It would be better to use an HTML parser. Java has a built-in HTML parser in Swing (the javax.swing.text.html package) that you might want to look into.
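
If a full parser is more than you need, a possible middle ground is to split the input into tags and text runs, keep a counter of open <a> elements, and only replace inside text runs when that counter is zero. The sketch below is not from the answer above and is not a real HTML parser: it assumes reasonably well-formed markup with no ">" inside attribute values and no comments, scripts or CDATA sections. The class name LinkInserter and the method linkify are made up for illustration, and the "[...]" placeholder stands in for the generated link, as in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkInserter {
    // A token is a complete tag, a run of text, or a stray '<' (kept as-is).
    private static final Pattern TOKEN = Pattern.compile("<[^>]*>|[^<]+|<");
    // Opening <a> tag with optional attributes (does not match <abbr>, <area>, ...).
    private static final Pattern OPEN_A = Pattern.compile("(?i)<a(\\s[^>]*)?>");
    private static final Pattern CLOSE_A = Pattern.compile("(?i)</a\\s*>");

    static String linkify(String html, String word) {
        Pattern wordPattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        // Placeholder for the real link, escaped so '$' and '\' are taken literally.
        String replacement = Matcher.quoteReplacement("[" + word + "]");

        StringBuilder out = new StringBuilder();
        int anchorDepth = 0; // number of <a> elements currently open
        Matcher m = TOKEN.matcher(html);
        while (m.find()) {
            String token = m.group();
            if (token.startsWith("<")) {
                // Tags are copied unchanged, so attributes are never touched (case 1).
                if (OPEN_A.matcher(token).matches()) {
                    anchorDepth++;
                } else if (CLOSE_A.matcher(token).matches() && anchorDepth > 0) {
                    anchorDepth--;
                }
                out.append(token);
            } else if (anchorDepth == 0) {
                // Text outside any link: safe to replace (case 2).
                out.append(wordPattern.matcher(token).replaceAll(replacement));
            } else {
                out.append(token);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "sample text <a>sample text</a> sample "
                + "<a href='http://www.sample.com'>a good sample</a>";
        System.out.println(linkify(text, "sample"));
        // prints: [sample] text <a>sample text</a> [sample] <a href='http://www.sample.com'>a good sample</a>
    }
}
```

Because tags are copied through verbatim, the lookahead from the question is no longer needed. For anything beyond simple, trusted markup, a real parser is still the safer choice.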
