JavaScript regex to find a base URL
I'm going mad with this regex in JS:
var patt1=/^http(s)?:\/\/[a-z0-9-]+(.[a-z0-9-]+)*?(:[0-9]+)?(\/)?$/i;
If I give it an input string like "http://www.eitb.com/servicios/concursos/516522/", this regex is supposed to return NULL, because there is a "folder" after the base URL. It works in PHP, but not in JavaScript, as in this script:
<script type="text/javascript">
var str="http://www.eitb.com/servicios/concursos/516522/";
var patt1=/^http(s)?:\/\/[a-z0-9-]+(.[a-z0-9-]+)*?(:[0-9]+)?(\/)?$/i;
document.write(str.match(patt1));
</script>
It returns
http://www.eitb.com/servicios/concursos/516522/,,/516522,,/
The question is: why isn't it working, and how can I make it work?
The idea is to implement this regex in another function to get NULL when the URL passed is not in the correct format:
http://www.eitb.com/ -> Correct
http://www.eitb.com/something -> Incorrect
Thanks
I'm no JavaScript pro, but I'm accustomed to Perl regexps, so I'll give it a try: the .
in the middle of the regexp needs to be escaped. Unescaped, it matches any character, including a /,
which lets the path slip through and jinxes the whole regexp.
Try this way:
var patt1=/^http(s)?:\/\/[a-z0-9-]+(\.[a-z0-9-]+)*?(:[0-9]+)?(\/)?$/i;
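To see the difference, here is a small sketch that runs both patterns against the URLs from the question. The names loose, strict and baseUrlOrNull are made up for this illustration; baseUrlOrNull is just one possible shape for the "return NULL when the format is wrong" function the question mentions.

```javascript
// Original pattern: the unescaped "." matches any character, including "/".
var loose = /^http(s)?:\/\/[a-z0-9-]+(.[a-z0-9-]+)*?(:[0-9]+)?(\/)?$/i;
// Fixed pattern: "\." matches only a literal dot.
var strict = /^http(s)?:\/\/[a-z0-9-]+(\.[a-z0-9-]+)*?(:[0-9]+)?(\/)?$/i;

var withPath = "http://www.eitb.com/servicios/concursos/516522/";
var baseOnly = "http://www.eitb.com/";

// With the loose pattern, "/servicios", "/concursos" and "/516522" are each
// consumed by the (.[a-z0-9-]+) group, so the whole URL matches.
var looseMatch = loose.exec(withPath);
var lastGroup = looseMatch[2]; // "/516522", the last repetition of the group

// Hypothetical helper: returns the URL if it is a bare base URL, else null.
function baseUrlOrNull(url) {
  return strict.test(url) ? url : null;
}

console.log(baseUrlOrNull(withPath));                        // null
console.log(baseUrlOrNull(baseOnly));                        // "http://www.eitb.com/"
console.log(baseUrlOrNull("http://www.eitb.com/something")); // null
```

The captured "/516522" is also why the output in the question contains ",,/516522,,/": the array returned by match is joined with commas, and the groups that did not participate show up as empty strings.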
Assuming you have a properly formatted URL, this simpler RegExp should extract the base URL every time. Note that it matches the base part of any URL rather than rejecting URLs that have a path:
var patt1=/^https?:\/\/[^\/]+/i;
Here's the breakdown...
Starting with the first position (denoted by ^)
Look for http
http can be followed by s (denoted by the ? which means 0 or 1 of the character or set before it)
Then look for :// after the http or https (denoted by :\/\/)
Next match one or more characters other than / (denoted by [^\/]+, where the + means 1 or more)
Case insensitive (denoted by i)
NOTE: this will also pick up ports, e.g. http://example.com:80. To get rid of the :80 (or a colon followed by any port number), simply add a : to the negated character class, for example [^\/:]+.
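As a rough sketch of how this pattern could be used, the snippet below extracts the base URL with and without the port. The helper names getBaseUrl and isBaseUrl are invented for this example. Because the pattern extracts rather than validates, isBaseUrl compares the extracted part with the input (allowing one trailing slash) to get the accept/reject behaviour the question asks for.

```javascript
var basePattern = /^https?:\/\/[^\/]+/i;    // keeps an optional port
var noPortPattern = /^https?:\/\/[^\/:]+/i; // stops at the first ":" or "/"

// Hypothetical helper: returns the matched base URL, or null if nothing matches.
function getBaseUrl(url, pattern) {
  var m = url.match(pattern);
  return m ? m[0] : null;
}

// Hypothetical helper: true only when the URL is nothing but its base part,
// optionally followed by a single trailing slash.
function isBaseUrl(url) {
  var base = getBaseUrl(url, basePattern);
  return base !== null && (url === base || url === base + "/");
}

console.log(getBaseUrl("http://www.eitb.com/servicios/concursos/516522/", basePattern));
// "http://www.eitb.com"
console.log(getBaseUrl("http://example.com:80/path", basePattern));   // "http://example.com:80"
console.log(getBaseUrl("http://example.com:80/path", noPortPattern)); // "http://example.com"
console.log(isBaseUrl("http://www.eitb.com/"));          // true
console.log(isBaseUrl("http://www.eitb.com/something")); // false
```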