
Scala or Java Library for fixing malformed URIs

Does anyone know of a good Scala or Java library that can fix common problems in malformed URIs, such as containing characters that should be escaped but aren't?


I've tested a few libraries, including the now-legacy URIUtil from HttpClient, without finding a viable solution. I have usually had enough success with this kind of java.net.URI construct, though:

import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

/**
 * Tries to construct a URI by breaking the URL up into its smallest elements
 * and encoding each component individually using the full URI constructor:
 *
 *    foo://example.com:8042/over/there?name=ferret#nose
 *    \_/   \______________/\_________/ \_________/ \__/
 *     |           |            |            |        |
 *  scheme     authority       path        query   fragment
 */
public URI parseUrl(String s) throws MalformedURLException, URISyntaxException {
   URL u = new URL(s);
   return new URI(
        u.getProtocol(),
        u.getAuthority(),
        u.getPath(),
        u.getQuery(),
        u.getRef());
}

This may be used in combination with the following routine, which repeatedly decodes a URL until the decoded string no longer changes. That can be useful against, e.g., double encoding. Note that, to keep it simple, this sample has no failsafe.

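import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Recurses until a decoding pass leaves the string unchanged.
public String urlDecode(String url, String encoding) throws UnsupportedEncodingException, IllegalArgumentException {
    String result = URLDecoder.decode(url, encoding);
    return result.equals(url) ? result : urlDecode(result, encoding);
}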
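Since the routine above has no failsafe, here is a minimal sketch of a bounded variant, combined with the component-wise constructor. It assumes UTF-8 input, and the class name `UriRepairSketch`, the methods `decodeFully` and `repair`, and the cap `MAX_PASSES` are all hypothetical names chosen for illustration. Decoding first matters because the multi-argument URI constructors always quote the percent character, so an already encoded `%20` would otherwise come out as `%2520`.

```java
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLDecoder;

class UriRepairSketch {
    // Hypothetical cap on decoding passes; choose a value that suits your data.
    static final int MAX_PASSES = 5;

    /** Decodes repeatedly until the string stops changing or the cap is reached. */
    static String decodeFully(String s, String encoding) throws UnsupportedEncodingException {
        String current = s;
        for (int i = 0; i < MAX_PASSES; i++) {
            String decoded = URLDecoder.decode(current, encoding);
            if (decoded.equals(current)) {
                return decoded;
            }
            current = decoded;
        }
        return current;
    }

    /** Decodes first, then re-encodes each component via the five-argument URI constructor. */
    @SuppressWarnings("deprecation") // new URL(String) is deprecated since Java 20
    static URI repair(String s)
            throws UnsupportedEncodingException, MalformedURLException, URISyntaxException {
        URL u = new URL(decodeFully(s, "UTF-8"));
        return new URI(u.getProtocol(), u.getAuthority(), u.getPath(), u.getQuery(), u.getRef());
    }

    public static void main(String[] args) throws Exception {
        // Unescaped spaces get percent-encoded.
        System.out.println(repair("http://example.com/a b?q=x y"));
        // A double-encoded path is normalized to the same result.
        System.out.println(repair("http://example.com/a%2520b?q=x%20y"));
    }
}
```

Both calls should print http://example.com/a%20b?q=x%20y. Two caveats apply to this sketch: URLDecoder turns a literal + into a space and throws IllegalArgumentException on a stray %, and decoding the whole string before splitting it can change its structure (an encoded %3F or %23 becomes a real ? or #), so it is only suitable when that is acceptable.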


I would advise against using java.net.URLEncoder for percent-encoding URIs. Despite the name, it is not well suited to encoding URLs: it does not follow RFC 3986 and instead encodes to the application/x-www-form-urlencoded MIME format, in which, for example, a space becomes + rather than %20.
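To make the difference concrete, the following sketch encodes the same path with URLEncoder and with a multi-argument java.net.URI constructor. The class `EncoderComparison` and its two helper methods are hypothetical names used only for this illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

class EncoderComparison {
    /** Form encoding: intended for application/x-www-form-urlencoded data. */
    static String formEncode(String s) throws UnsupportedEncodingException {
        return URLEncoder.encode(s, "UTF-8");
    }

    /** Path encoding via the (scheme, host, path, fragment) URI constructor. */
    static String pathEncode(String path) throws URISyntaxException {
        return new URI(null, null, path, null).getRawPath();
    }

    public static void main(String[] args) throws Exception {
        String path = "/my docs/a+b.txt";
        System.out.println(formEncode(path)); // %2Fmy+docs%2Fa%2Bb.txt
        System.out.println(pathEncode(path)); // /my%20docs/a+b.txt
    }
}
```

URLEncoder escapes the path separators and turns the space into +, which a server reading the path will typically not interpret as a space, whereas the URI constructor leaves / and + alone and escapes only the space.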

For encoding URIs in Scala I would recommend the Uri class from spray-http. scala-uri is an alternative (disclaimer: I'm the author).
