JavaScript rejects all my RegExs, how come?
I have this at the moment (I found the code on here):
var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
someText.replace(exp, "<a href='$1'>$1</a>");
It will replace any http:// URL in someText with a proper <a href>.
But I also need it to match www. addresses without the http. I found this RegEx on RegEx Lib:
((http\://|https\://|ftp\://)|(www.))+(([a-zA-Z0-9\.-]+\.[a-zA-Z]{2,4})|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(/[a-zA-Z0-9%:/-_\?\.'~]*)?
And I tested it on the RegEx checker site, http://www.nvcc.edu/home/drodgers/ceu/resources/test_regexp.asp
It matches the strings I want. But when I put it into my exp var, JavaScript blows up with an error.
I even tried newing it up as a new RegExp, like so:
var exp = new RegExp(((http\://|https\://|ftp\://)|(www.))+(([a-zA-Z0-9\.-]+\.[a-zA-Z]{2,4})|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(/[a-zA-Z0-9%:/-_\?\.'~]*)?);
But the same thing happens.
Any ideas what I am doing wrong?
Thanks, Kohan
I believe the RegExp constructor takes a string as argument, see here: https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference/Global_Objects/RegExp
So put quotes around your regexp, doubling each backslash so that it survives string parsing, and it should work fine:
var exp = new RegExp("((http\\://|https\\://|ftp\\://)|(www.))+(([a-zA-Z0-9\\.-]+\\.[a-zA-Z]{2,4})|([0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}))(/[a-zA-Z0-9%:/-_\\?\\.'~]*)?");
someText.replace(exp, "<a href='$1'>$1</a>");
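To see why the backslashes are doubled in the string form, here is a minimal sketch. The shortened pattern and the variable names are made up for illustration and are not the expression from the question.

```javascript
// Hypothetical, shortened pattern used only for illustration.
var fromLiteral = /www\.[a-z0-9-]+\.[a-z]{2,4}/i;

// In a string, each backslash meant for the regex engine must be doubled.
var fromString = new RegExp("www\\.[a-z0-9-]+\\.[a-z]{2,4}", "i");

// With single backslashes the string parser drops them, so "\." becomes "."
// and each dot silently turns into "any character".
var lossy = new RegExp("www\.[a-z0-9-]+\.[a-z]{2,4}", "i");

// The literal and the doubled-backslash string compile to the same pattern.
var sameSource = fromLiteral.source === fromString.source;

// The lossy version accepts text that contains no dots at all.
var lossyMatchesJunk = lossy.test("wwwXexampleXcom");
var strictMatchesJunk = fromString.test("wwwXexampleXcom");

console.log(sameSource, lossyMatchesJunk, strictMatchesJunk); // true true false
```

So quoting alone is not enough: a pattern pasted into a string with single backslashes still compiles, but it no longer means the same thing.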
Okay, you've got the JavaScript syntax straightened out; now let's talk about regex syntax. The colon (:) has no special meaning, so there's no need to escape it. The dot (.) and question mark (?) normally do have special meanings, but not when they appear in a character class (i.e., inside the square brackets).
The hyphen (-) does have special meaning in a character class: it forms ranges, like [a-z] and [0-9]. If you want to include a literal hyphen in a character class, you either escape it with a backslash or place it at the beginning or end of the list. For example, in [a-zA-Z0-9\.-] the final hyphen matches a hyphen, while the other three are used to form ranges. (The backslash in front of the dot is unnecessary, but it doesn't harm anything.)
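A quick way to check these character-class rules is to run a few variants side by side. This is only a sketch; the sample strings and variable names are invented for the demonstration.

```javascript
// Hyphen last in the class: literal. The backslash before the dot is optional.
var withEscapedDot = /^[a-zA-Z0-9\.-]+$/;
var withoutEscape = /^[a-zA-Z0-9.-]+$/;

// Hyphen in the middle of the class: it must be escaped to stay literal.
var hyphenEscaped = /^[a-zA-Z\-0-9.]+$/;

// Hypothetical sample inputs.
var samples = ["my-site.example", "my_site.example", "a.b-c"];

var results = samples.map(function (s) {
  return [withEscapedDot.test(s), withoutEscape.test(s), hyphenEscaped.test(s)];
});

// Inside a class the dot is a plain dot, not "any character".
var dotIsLiteral = /^[.]$/.test(".") && !/^[.]$/.test("x");

console.log(results, dotIsLiteral);
```

All three classes agree on every sample: the strings with hyphens and dots pass, and the one with an underscore fails because the underscore is not in the list.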
Now look at [a-zA-Z0-9%:/-_\?\.'~]. The backslashes in front of ? and . are just clutter, but that fourth hyphen is a real problem. It forms a range starting with / and ending with _; if you look at an ASCII character map, you'll see that it includes the digits 0-9 and uppercase letters A-Z, plus /, :, ;, <, =, >, ?, @, [, \, ], ^, and _ ...obviously not what the author intended. There's also a lot of unnecessary grouping and duplicate code in that regex, and do you really need to match IP addresses, too? The moral is: don't trust anything you find on RegExLib.com.
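The accidental range can be confirmed without an ASCII chart. The sketch below walks the character codes from / to _ and then compares the original class with one possible repair; the repaired class (hyphen moved to the end, clutter removed) is an assumption about what the author meant, not something taken from RegExLib.

```javascript
// Collect every character in the range "/" to "_" that is not a letter or digit.
var extras = "";
for (var code = "/".charCodeAt(0); code <= "_".charCodeAt(0); code++) {
  var ch = String.fromCharCode(code);
  if (!/[A-Za-z0-9]/.test(ch)) {
    extras += ch;
  }
}

// The class as written on RegExLib (built from a string, so backslashes are doubled).
var buggy = new RegExp("^[a-zA-Z0-9%:/-_\\?\\.'~]$");

// Assumed intent: a literal hyphen, so it is moved to the end of the list.
var fixed = /^[a-zA-Z0-9%:\/_?.'~-]$/;

console.log(extras);                             // /:;<=>?@[\]^_
console.log(buggy.test("<"), buggy.test("-"));   // true false
console.log(fixed.test("<"), fixed.test("-"));   // false true
```

The buggy class accepts < and rejects the hyphen it was apparently meant to allow; the repaired class does the opposite.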
A regular expression literal in JavaScript must be surrounded by slashes '/', so it looks like
var expr = /pattern/flags;
For you the correct form is below. Note that every forward slash inside the pattern has to be escaped as \/, otherwise it ends the literal early:
var exp = /((http\:\/\/|https\:\/\/|ftp\:\/\/)|(www.))+(([a-zA-Z0-9\.-]+\.[a-zA-Z]{2,4})|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(\/[a-zA-Z0-9%:\/-_\?\.'~]*)?/;
If you use the constructor new RegExp(), call it in the form
var expr = new RegExp(pattern [, flags]);
Here pattern and flags are string parameters, so each backslash in the pattern has to be doubled:
var exp = new RegExp("((http\\://|https\\://|ftp\\://)|(www.))+(([a-zA-Z0-9\\.-]+\\.[a-zA-Z]{2,4})|([0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}))(/[a-zA-Z0-9%:/-_\\?\\.'~]*)?");
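As an alternative to the RegExLib pattern, one could extend the expression that already worked in the question so that it also accepts a leading www. The sketch below does that in literal form with the slashes escaped. The function name linkify and the choice to prepend http:// to bare www. matches are assumptions made for this example, not part of the original code.

```javascript
// Hypothetical helper: wraps http/https/ftp/file URLs and bare www. hosts in links.
function linkify(text) {
  // Same character classes as the working expression in the question,
  // with "www\." added as an alternative to the scheme.
  var exp = /\b((?:https?|ftp|file):\/\/|www\.)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]/ig;

  // replace() returns a new string; the input is left unchanged.
  return text.replace(exp, function (match) {
    // Assumption: a bare "www." address should link over http.
    var href = /^www\./i.test(match) ? "http://" + match : match;
    return "<a href='" + href + "'>" + match + "</a>";
  });
}

console.log(linkify("see www.example.com now"));
// see <a href='http://www.example.com'>www.example.com</a> now
console.log(linkify("go to http://a.com/x?y=1."));
// go to <a href='http://a.com/x?y=1'>http://a.com/x?y=1</a>.
```

Because the final character class leaves out the dot and other punctuation, the sentence-ending period in the second example stays outside the link.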