Regex in GWT to match URLs
I implemented the Pattern class as shown here: http://www.java2s.com/Code/Java/GWT/ImplementjavautilregexPatternwithJavascriptRegExpobject.htm
And I would like to use the following regex to match urls in my开发者_运维问答 String:
(http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?
Unfortunately, the Java compiler of course fails on parsing that string because it doesn't use valid escape sequences (since the above is technically a url pattern for JavaScript, not Java)
At the end of the day, I'm looking for a regex pattern that will both compile in Java and execute in JavaScript correctly.
You will have to use JSNI to do the regex evaluation part in Javascript. If you do write the regex with the escaped backslashes, that will get converted to Javascript as it is and will obviously be invalid. Thought it will work in the Hosted or Dev mode as thats still running Java bytecode, but not on the compiled application.
A simple JSNI example to test if a given string is a valid URL:
// Java method
public native boolean isValidUrl(String url) /*-{
var pattern = /(http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/;
return pattern.test(url);
}-*/;
There may be other irregularities between the Java and Javascript regex engines, so it's better to offload it completely to Javascript at least for moderately complex regexes.
The pattern itself looks fine, but I guess, its because of Backslash escaping.
Please take a look this http://www.regular-expressions.info/java.html
In literal Java strings the backslash is an escape character. The literal string "\\" is a single backslash. In regular expressions, the backslash is also an escape character. The regular expression \\ matches a single backslash. This regular expression as a Java string, becomes "\\\\". That's right: 4 backslashes to match a single one.
So, if you reuse your Javascript regex in java, you need to replace \
to \\
, and vice versa.
I don't know exactly how this would help but here is the exact function you requested in Javascript. I guess using JSNI like Anurag said will help.
var urlPattern = "(https?|ftp)://(www\\.)?(((([a-zA-Z0-9.-]+\\.){1,}[a-zA-Z]{2,4}|localhost))|((\\d{1,3}\\.){3}(\\d{1,3})))(:(\\d+))?(/([a-zA-Z0-9-._~!$&'()*+,;=:@/]|%[0-9A-F]{2})*)?(\\?([a-zA-Z0-9-._~!$&'()*+,;=:/?@]|%[0-9A-F]{2})*)?(#([a-zA-Z0-9._-]|%[0-9A-F]{2})*)?";
function isValidURL(url) {
urlPattern = "^" + urlPattern + "$";
var regex = new RegExp(urlPattern);
return regex.test(url);
}
Like what @S.Mark said, I basically took the "java" way of doing Regular Expression in Javascript.
In Java, you would just done it the following way (see how the expression is the same).
String urlPattern = "(https?|ftp)://(www\\.)?(((([a-zA-Z0-9.-]+\\.){1,}[a-zA-Z]{2,4}|localhost))|((\\d{1,3}\\.){3}(\\d{1,3})))(:(\\d+))?(/([a-zA-Z0-9-._~!$&'()*+,;=:@/]|%[0-9A-F]{2})*)?(\\?([a-zA-Z0-9-._~!$&'()*+,;=:/?@]|%[0-9A-F]{2})*)?(#([a-zA-Z0-9._-]|%[0-9A-F]{2})*)?";
Hope this helps. PS, this Regular expression works and even validates sites pointing to localhost:port) where port is any digit port number.
精彩评论