
JavaScript rejects all my RegExs, how come?

I have this at the moment (I found the code on here):

     var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
     someText.replace(exp, "<a href='$1'>$1</a>");  

It will replace any http://URL in someText with a proper <a href>.

But I also need it to match www. without the http. I found this RegEx on RegEx Lib:

((http\://|https\://|ftp\://)|(www.))+(([a-zA-Z0-9\.-]+\.[a-zA-Z]{2,4})|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(/[a-zA-Z0-9%:/-_\?\.'~]*)?

And I tested it on the RegEx checker site, http://www.nvcc.edu/home/drodgers/ceu/resources/test_regexp.asp

It matches the strings I want. But when I put it into my exp var, JavaScript blows up with an error.

I even tried newing it up as a new RegExp, like so:

var exp = new RegExp(((http\://|https\://|ftp\://)|(www.))+(([a-zA-Z0-9\.-]+\.[a-zA-Z]{2,4})|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(/[a-zA-Z0-9%:/-_\?\.'~]*)?);

But the same thing happens.

Any ideas what I am doing wrong?

Thanks, Kohan


I believe the RegExp constructor takes a string as argument, see here: https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference/Global_Objects/RegExp

So just put quotes around your regexp, doubling each backslash so that it survives string parsing, and it should work fine.

var exp = new RegExp("((http\\://|https\\://|ftp\\://)|(www.))+(([a-zA-Z0-9\\.-]+\\.[a-zA-Z]{2,4})|([0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}))(/[a-zA-Z0-9%:/-_\\?\\.'~]*)?");
someText.replace(exp, "<a href='$1'>$1</a>");
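To see why the backslashes are doubled in the string form, here is a minimal sketch using a much smaller, made-up pattern (a decimal number, not the URL regex above). The variable names are illustrative only.

```javascript
// A regex literal and what should be the same pattern built from strings.
const literal = /\d+\.\d+/;
const fromString = new RegExp("\\d+\\.\\d+"); // backslashes doubled for the string parser
const broken = new RegExp("\d+\.\d+");        // "\d" collapses to "d" and "\." to "."

console.log(literal.source);          // \d+\.\d+
console.log(fromString.source);       // \d+\.\d+
console.log(broken.source);           // d+.d+
console.log(fromString.test("3.14")); // true
console.log(broken.test("3.14"));     // false
```

The string parser consumes one level of backslashes before the regex engine ever sees the pattern, so a single backslash in front of an ordinary character is silently dropped.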


Okay, you've got the JavaScript syntax straightened out, now let's talk about regex syntax. The colon (:) has no special meaning, so there's no need to escape it. The dot (.) and question mark (?) normally do have special meanings, but not when they appear in a character class (i.e., inside the square brackets).

The hyphen (-) does have special meaning in a character class: it forms ranges, like [a-z] and [0-9]. If you want to include a literal hyphen in a character class, you either escape it with a backslash or place it at the beginning or end of the list. For example, in [a-zA-Z0-9\.-] the final hyphen matches a hyphen, while the other three are used to form ranges. (The backslash in front of the dot is unnecessary, but it doesn't harm anything.)
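A small sketch of those character-class rules, assuming a made-up host name as test input (the variable names are hypothetical):

```javascript
// Hyphen placed last in the class: it is a literal hyphen.
const hyphenLast = /^[a-z0-9.-]+$/i;
// Hyphen escaped in the middle of the class: also a literal hyphen.
const hyphenEscaped = /^[a-z\-0-9.]+$/i;
// No hyphen in the class at all.
const noHyphen = /^[a-z0-9.]+$/i;

console.log(hyphenLast.test("my-site.example"));    // true
console.log(hyphenEscaped.test("my-site.example")); // true
console.log(noHyphen.test("my-site.example"));      // false

// Inside a class the dot is literal, so [.] does not match an arbitrary character.
console.log(/^[.]$/.test("a")); // false
console.log(/^[.]$/.test(".")); // true
```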

Now look at [a-zA-Z0-9%:/-_\?\.'~]. The backslashes in front of ? and . are just clutter, but that fourth hyphen is a real problem. It forms a range starting with / and ending with _; if you look at an ASCII character map, you'll see that it includes the digits 0-9 and uppercase letters A-Z, plus

/, :, ;, <, =, >, ?, @, [, \, ], ^, _

...obviously not what the author intended. There's also a lot of unnecessary grouping and duplicate code in that regex, and do you really need to match IP addresses, too? The moral is: don't trust anything you find on RegExLib.com.
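The accidental range is easy to confirm with a quick sketch. The "intended" class below is only a guess at what the author meant (hyphen moved to the end so it is literal), and the variable names are hypothetical.

```javascript
// The class as written: "/-_" is a range from "/" (0x2F) to "_" (0x5F).
const accidental = /^[a-zA-Z0-9%:\/-_\?\.'~]$/;
// A guess at the intent: same characters, hyphen last so it is a literal.
const intended = /^[a-zA-Z0-9%:\/_?.'~-]$/;

// Characters that fall inside the accidental range.
const suspects = ["<", ">", "[", "]", "^", ";", "@", "=", "\\"];
const extras = suspects.filter((c) => accidental.test(c));
console.log(extras.length); // 9, every suspect is accepted

console.log(intended.test("<"));   // false
console.log(accidental.test("-")); // false, the hyphen itself is not matched
console.log(intended.test("-"));   // true
```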


A regular expression literal in JavaScript must be surrounded by slashes (/), so it looks like this:

var expr = /pattern/flags;

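For you, the correct way is the following (note that every forward slash inside the pattern has to be escaped as \/, otherwise it ends the literal early):

var exp = /((http\:\/\/|https\:\/\/|ftp\:\/\/)|(www.))+(([a-zA-Z0-9\.-]+\.[a-zA-Z]{2,4})|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(\/[a-zA-Z0-9%:\/-_\?\.'~]*)?/;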

If you use the constructor new RegExp(), call it in the form

var expr = new RegExp(pattern [, flags]);

Here pattern and flags are string parameters, so every backslash in the pattern must be doubled:

var exp = new RegExp("((http\\://|https\\://|ftp\\://)|(www.))+(([a-zA-Z0-9\\.-]+\\.[a-zA-Z]{2,4})|([0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}))(/[a-zA-Z0-9%:/-_\\?\\.'~]*)?");
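As a rough illustration of the two forms side by side, here is a sketch with a deliberately simplified, hypothetical pattern (it is not the RegExLib expression) and made-up sample text. It also passes the flags as the second argument and keeps the return value of replace(), since strings are immutable and replace() returns a new string.

```javascript
// Hypothetical simplified pattern: scheme followed by non-whitespace characters.
const pattern = "(https?://\\S+)";                 // string form: backslash doubled, slashes plain
const viaConstructor = new RegExp(pattern, "gi");  // flags passed as a string
const viaLiteral = /(https?:\/\/\S+)/gi;           // literal form: slashes escaped

const someText = "See HTTP://example.com and https://example.org now";

// replace() does not modify someText; it returns the new string.
const linked = someText.replace(viaConstructor, "<a href='$1'>$1</a>");
const linkedLiteral = someText.replace(viaLiteral, "<a href='$1'>$1</a>");

console.log(linked);
console.log(linked === linkedLiteral); // true
```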