Shorten a String

Is there a better way to shorten (use fewer characters) a String in Java besides converting the chars to ints and running them through base 36?

For example, say if I wanted to shorten a URL.


Short URL services (like 'tinyurl') work by storing a big database table that maps from short URLs to their full form.

When you request a tinyurl, the service allocates a random-looking short url (that is not currently in use) and creates an entry in its table that maps from the short url to your supplied longer one.

When you try to load the short url in a browser, the request first goes to the tinyURL service, which looks up the full URL and then sends an HTTP redirect response to the browser telling it to go to the real URL.

You can implement your own URL shortening service by doing the same thing, though if you are shortening your own URLs you can maybe do the redirection internally to your web server; e.g. using a servlet request filter.
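As a rough illustration of the lookup-table idea described above, here is a minimal in-memory sketch. The class name `UrlShortener`, the 6-character code length, and the base-36 alphabet are all assumptions made for the example; a real service would keep the table in a database rather than a map.

```java
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for the shortener's database table (hypothetical names). */
class UrlShortener {
    private static final String ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz";
    private static final int CODE_LENGTH = 6; // assumed length, not from the original answer

    private final Map<String, String> codeToUrl = new ConcurrentHashMap<>();
    private final SecureRandom random = new SecureRandom();

    /** Allocates a random-looking code that is not currently in use and stores the mapping. */
    String shorten(String longUrl) {
        while (true) {
            String code = randomCode();
            // putIfAbsent returns null only when the code was previously unused
            if (codeToUrl.putIfAbsent(code, longUrl) == null) {
                return code;
            }
        }
    }

    /** Returns the full URL for a code, or null if the code is unknown. */
    String resolve(String code) {
        return codeToUrl.get(code);
    }

    private String randomCode() {
        StringBuilder sb = new StringBuilder(CODE_LENGTH);
        for (int i = 0; i < CODE_LENGTH; i++) {
            sb.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }
}
```

The web layer (for example a servlet filter) would then call `resolve(code)` and answer with an HTTP redirect whose `Location` header is the returned URL.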


I described the above in the context of shortening URLs in a way that still allows the URLs to be resolved [1]. But this approach can also be used more generally, i.e. by creating a pair of Map<String,String> objects and populating them with bidirectional mappings between sequentially generated short strings and the original (probably longer) strings. It is possible to prove that this will give a smaller average size of short string than any algorithmic compression or encoding scheme over the same set of long strings.
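The pair-of-maps idea could look something like the sketch below. The class name `StringShortener` and the choice to render the sequential counter in base 36 via `Long.toString(id, 36)` are assumptions for illustration; any compact sequential naming scheme would do.

```java
import java.util.HashMap;
import java.util.Map;

/** Bidirectional mapping between original strings and sequentially generated short strings. */
class StringShortener {
    private final Map<String, String> longToShort = new HashMap<>();
    private final Map<String, String> shortToLong = new HashMap<>();
    private long nextId = 0; // sequential counter; not thread-safe in this sketch

    /** Returns the short form for a string, allocating a new one on first sight. */
    String shorten(String original) {
        String existing = longToShort.get(original);
        if (existing != null) {
            return existing;
        }
        // Radix 36 uses the digits 0-9 and a-z, so short forms stay compact
        String shortForm = Long.toString(nextId++, 36);
        longToShort.put(original, shortForm);
        shortToLong.put(shortForm, original);
        return shortForm;
    }

    /** Returns the original string, or null if the short form is unknown. */
    String expand(String shortForm) {
        return shortToLong.get(shortForm);
    }
}
```

With this scheme the first 36 strings get one-character short forms and the first 1296 get at most two, regardless of how long the originals are.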

The downside is the space needed to store the mappings, and the fact that you need the mappings any place (e.g. on any computer) where you need to do the short-to-long or long-to-short conversions.

1 - When you think about it, that is essential. If you shorten a URL string and the result is no longer resolvable, it is not a useful URL for most purposes.


Since URLs are UTF-8, and since the characters are therefore base 256, encoding the same characters as integer code points in base 32 can only make them longer. Or are you not asking what it sounds like you are asking?

Further, Java Strings are UTF-16 (base 65536), so encoding their code points in base 32 will make Java strings even longer.

Just as encoding binary data in base 64 makes it longer by a factor of 4/3: every 3 bytes require 4 base-64 characters to encode.
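The expansion can be checked with a small experiment. The helper class `EncodingLengths` below is hypothetical; it uses the standard `java.util.Base64` encoder and, as one possible way of re-encoding bytes in a smaller base, `BigInteger.toString(36)`.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Compares the size of raw bytes with their base-64 and base-36 text encodings. */
class EncodingLengths {
    /** Standard base-64 encoding (with padding): 4 output characters per 3 input bytes. */
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    /** Treats the bytes as one large non-negative integer and writes it in base 36. */
    static String toBase36(byte[] data) {
        return new BigInteger(1, data).toString(36);
    }

    /** Number of bytes the string occupies when encoded as UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

For any URL-sized input, both `toBase64(bytes).length()` and `toBase36(bytes).length()` come out larger than `bytes.length`, because each output character carries fewer than 8 bits of information.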


Put the full URLs in a database and give out the id as the redirect URL.
