
Regular Expression to match <a> tags without http://

How do I match HTML "a" tags, only the ones without http, using a regular expression?

i.e. match:

blahblah... < a href=\"something\" > ...blahblah

but not

blahblah... < a href=\"http://something\" > ...blahblah


It's easier to use a DOMParser and XPath than a regex.

See my answer on jsFiddle.

HTML

<body>
    <div>
        <a href='index.php'>1. index</a>
        <a href='http://www.bar.com'>2. bar</a>
        <a href='http://www.foo.com'>3. foo</a>        
        <a href='hello.php'>4. hello</a>        
    </div>
</body>

JS

$(document).ready(function() {
    var type = XPathResult.ANY_TYPE;
    // Serialize the body and re-parse it as XML so XPath can be run on it.
    var page = $("body").html();
    var doc = new DOMParser().parseFromString(page, "text/xml");
    // Select every <a> whose href does not start with "http://".
    var xpath = "//a[not(starts-with(@href,'http://'))]";
    var result = doc.evaluate(xpath, doc, null, type, null);

    var node = result.iterateNext();
    while (node) {
        console.log(node); // logs links 1 and 4
        node = result.iterateNext();
    }
});

NOTES

  1. I'm using jQuery to keep the code short, but you can do it without jQuery.
  2. This code must be adapted to work in IE (I've only tested it in Firefox).


You should use an XML parser instead of regexes.


On the same topic:

  • RegEx match open tags except XHTML self-contained tags


With jQuery, you can do something very simple:

var links_that_doesnt_start_with_http = $("a:not([href^='http://'])");

edit: Added the ://
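
The same selection should also work without jQuery, because the selector is plain CSS. A minimal sketch, assuming a browser-like `root` that exposes `querySelectorAll`; the helper name `linksWithoutHttp` is made up for illustration:

```javascript
// Hypothetical helper: returns the <a> elements under `root` whose href
// attribute does not start with "http://". `root` is assumed to be a
// Document or Element (anything exposing querySelectorAll).
function linksWithoutHttp(root) {
    // The attribute value is quoted because it contains ':' and '/'.
    var nodes = root.querySelectorAll('a:not([href^="http://"])');
    // Copy the NodeList into a real array so array methods are available.
    return Array.prototype.slice.call(nodes);
}

// In a browser: linksWithoutHttp(document)
```

Note that this selector also matches <a> elements that have no href attribute at all, and that it compares against the attribute as written in the markup, not the resolved URL.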


I'm interpreting your question to mean any (mostly) absolute URI with a protocol, not just HTTP. To add to everyone else's incorrect solutions: you should be doing this check on the href:

if (href.slice(0, 2) !== "//" && !/^[\w-]+:\/\//.test(href)) {
    // href is a relative URI without http://
}
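
As a rough illustration, the check above can be wrapped in a function and applied to a list of href values. The name `isRelativeHref` and the sample URLs are assumptions made for this sketch:

```javascript
// Hypothetical helper wrapping the check above.
function isRelativeHref(href) {
    // "//host/path" is protocol-relative, so it still points at another host.
    if (href.slice(0, 2) === "//") {
        return false;
    }
    // Reject anything starting with "scheme://", e.g. http://, https://, ftp://
    return !/^[\w-]+:\/\//.test(href);
}

var hrefs = [
    "index.php",
    "http://www.bar.com",
    "//cdn.example.com/x.js",
    "ftp://host/file",
    "#anchor"
];
var relative = hrefs.filter(isRelativeHref);
console.log(relative); // logs index.php and #anchor
```

Schemes written without "//", such as mailto:, are not caught by this test and would be treated as relative.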


var html = 'Some text with a <a href="http://example.com/">link</a> and an <a href="#anchor">anchor</a>.';
var re = /<a href="(?!http:\/\/)[^"]*">/i;
var match = html.match(re);
// match[0] is '<a href="#anchor">'

Note: this won't work if the tag has additional attributes, because the pattern expects href to be the only attribute.
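
The pattern can be loosened to tolerate other attributes before and after href. The following is only a sketch, assuming well-formed tags with quoted attribute values; the sample markup is made up, and a parser remains the more robust option:

```javascript
var html = '<a class="x" href="http://example.com/">link</a>, ' +
           '<a id="top" href="#anchor" title="t">anchor</a>, ' +
           "<a href='hello.php'>hello</a>";

// <a, optional attributes, then href="..." or href='...' whose value does
// not start with http://, then any remaining attributes up to the closing >.
var re = /<a\s+(?:[^>]*?\s)?href\s*=\s*(["'])(?!http:\/\/)(.*?)\1[^>]*>/gi;

var hrefs = [];
var m;
while ((m = re.exec(html)) !== null) {
    hrefs.push(m[2]); // group 2 holds the href value
}
console.log(hrefs); // logs #anchor and hello.php
```

This still fails on unquoted attribute values, on a ">" inside an attribute value, and on markup inside comments or scripts.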
