Java: replacing all URLs that aren't already in anchor tags with anchor tags

I'm trying to wrap every URL in a document in an anchor tag, unless it is already enclosed in one. So given the string:

I have two urls for google: <a href="http://www.google.com/">google</a> and http://www.google.com/

I would like to replace it with this:

I have two urls for google: <a href="http://www.google.com/">google</a> and <a href="http://www.google.com/">http://www.google.com/</a>

Does anyone know a clean way to do this in Java?


This might get you started (it works for the given example):

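public class Test {
    public static void main(String[] args) {
        final String input = "I have two urls for google: <a href=\"http://www.google.com/\">google</a> and http://www.google.com/";
        // The negative lookbehind skips any URL that directly follows <a href="
        // $0 is the whole match: it is used as both the href value and the link text
        System.out.println(input.replaceAll("(?<!<a href=\")http://[^ ]*",
                                            "<a href=\"$0\">$0</a>"));
    }
}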

There are some problems with it:

  • It doesn't allow for extra whitespace inside the "a" tag: the lookbehind only recognizes exactly one space between the opening "<a" and "href"
  • It assumes a URL is "http://" followed by zero or more characters other than a space (" "), so https URLs are missed and trailing punctuation is swept into the link

This will work for simple examples, but I'm not sure how you'd write a complete solution.
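
One possible way around the lookbehind limits is to match existing anchor elements and bare URLs in a single pass, copying the former through untouched and wrapping only the latter. The following is a rough sketch rather than a complete solution: the class name Linkifier and the method linkify are made up for illustration, and the URL pattern (http or https, running up to the next whitespace or "<") is an assumption, not a full URL grammar.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named for illustration only.
public class Linkifier {
    // Alternative 1: a whole existing anchor element, copied through unchanged.
    // Alternative 2 (group 1): a bare http/https URL, up to the next whitespace or '<'.
    private static final Pattern TOKEN = Pattern.compile(
            "<a\\b[^>]*>.*?</a>|(https?://[^\\s<]+)",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    public static String linkify(String text) {
        Matcher m = TOKEN.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String url = m.group(1);
            // group(1) is null when the match was an existing anchor element
            String replacement = (url == null)
                    ? m.group()
                    : "<a href=\"" + url + "\">" + url + "</a>";
            // quoteReplacement keeps '$' and '\' in the URL from being read as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "I have two urls for google: <a href=\"http://www.google.com/\">google</a> and http://www.google.com/";
        System.out.println(linkify(input));
    }
}
```

Because the anchor alternative consumes the whole element, extra whitespace inside the tag and a URL used as link text are both left alone. It still has gaps: a URL inside some other attribute (an img src, say) would get wrapped, and trailing punctuation such as a full stop ends up inside the link. For arbitrary HTML, a real HTML parser is probably the safer route.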
