Invalid characters for HTTP URL
Can anyone tell me which characters are invalid in an HTTP URL, and the best way to validate them in Java? What I am looking for is validation of URLString in the URL format: http(s)://ip:port/URLString
Thanks in advance.
You can use any Unicode character you want, as long as it is percent-encoded. The explicitly reserved characters are defined in section 2.2 of RFC 3986: https://www.rfc-editor.org/rfc/rfc3986#section-2
From the document:
reserved = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="
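To connect the grammar above to the question, here is a minimal Java sketch that checks the path part (the URLString after ip:port) against the RFC 3986 path rule: every character must be unreserved, a sub-delim, ":", "@", "/", or a complete %XX escape. The class and method names are hypothetical, and the sketch assumes URLString has no query or fragment, so "?" and "#" are rejected.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any library.
final class Rfc3986PathCheck {
    // RFC 3986: pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
    // A path is treated here as any sequence of "/" and pchar.
    private static final Pattern PATH = Pattern.compile(
            "(?:[A-Za-z0-9\\-._~!$&'()*+,;=:@/]|%[0-9A-Fa-f]{2})*");

    // Returns true only if every character is allowed in a path
    // and every "%" starts a two-digit hex escape.
    static boolean isValidPath(String path) {
        return path != null && PATH.matcher(path).matches();
    }
}
```

With this sketch, isValidPath("/docs/a%20b;v=1") is accepted, while "/docs/a b" (raw space), "/docs/100%" (bare percent sign) and "/a{b}" are rejected.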
According to RFC 1738 the following are deemed unsafe:
- < and > - delimiters around URLs in free text
- " (double quote) - delimits URLs in some systems
- # - delimits a URL from a fragment/anchor identifier that might follow it
- % - used to indicate character encodings
General unsafe characters: { } | \ ^ ~ [ ] `
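If the goal is to make a string safe rather than reject it, the unsafe characters listed above can be percent-encoded. The following is a rough sketch, not a definitive implementation: it encodes every UTF-8 byte of a single path segment except the RFC 3986 unreserved set (letters, digits, "-", ".", "_", "~"). Note that "~" is listed as unsafe in RFC 1738 but is unreserved in the later RFC 3986, which this sketch follows. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any library.
final class PercentEncoder {
    // RFC 3986 unreserved characters, which never need encoding.
    private static final String UNRESERVED =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    // Percent-encodes one path segment: each UTF-8 byte that is not
    // an unreserved ASCII character becomes %XX (uppercase hex).
    static String encodeSegment(String segment) {
        StringBuilder out = new StringBuilder();
        for (byte b : segment.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF; // treat the byte as unsigned
            if (v < 0x80 && UNRESERVED.indexOf(v) >= 0) {
                out.append((char) v);
            } else {
                out.append('%').append(String.format("%02X", v));
            }
        }
        return out.toString();
    }
}
```

For example, encodeSegment("a b") gives "a%20b" and encodeSegment("{x}|y") gives "%7Bx%7D%7Cy". The sketch deliberately avoids java.net.URLEncoder, which targets form encoding and turns a space into "+" rather than "%20".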
Edit:
Not a duplicate, but the question "Validate URL in java" includes some thoughts on validation in Java.
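As one possible approach using only the JDK, java.net.URI can do a first-pass syntax check: its constructor throws URISyntaxException for illegal characters such as a raw space or "{". This is a hedged sketch with hypothetical names. Be aware that java.net.URI is documented against the older RFC 2396 rather than RFC 3986, and that it tolerates some unescaped non-ASCII characters, so it is not a strict validator.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for illustration; not part of any library.
final class HttpUrlCheck {
    // Returns true if the string parses as a URI, uses the http or
    // https scheme, and has a host component.
    static boolean isValidHttpUrl(String url) {
        if (url == null) {
            return false;
        }
        try {
            URI uri = new URI(url); // throws on illegal characters
            String scheme = uri.getScheme();
            boolean httpScheme = "http".equalsIgnoreCase(scheme)
                    || "https".equalsIgnoreCase(scheme);
            return httpScheme && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false;
        }
    }
}
```

Under these assumptions, "http://192.168.0.1:8080/some/path" and "https://192.168.0.1:8080/a%20b" pass, while "http://192.168.0.1:8080/a b", "http://192.168.0.1:8080/a{b}" and "ftp://192.168.0.1/x" fail.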
See pages 2 and 3 of RFC 1738 for details.
How about using UrlValidator from Apache Commons Validator? Its isValidPath method is probably useful. :)