Library/Algorithm to minify URL
I want to display URLs within a limited area: 2 lines and width of ~120px. Obviously most URLs don't fit.
So I'm looking for an approach to 'minify' a URL in order to make it shorter yet still recognizable and distinguishable from others.
For example:
https://stackoverflow.com/questions/ask
http://www.cnn.com/2011/US/03/04/obama.miami.school/index.html
http://techcrunch.com/2011/03/04/founder-stories-foursquare-crowley-invent-future/
http://cran.r-project.org/web/packages/bcp/index.html
become
stackoverflow | ask
cnn | obama.miami.school
techcrunch | founder-stories-foursquare
cran.r-project.org | packages/bcp
As you can see, this is kind of a creative question. The computation could be done either on the server (Java) or on the client (JavaScript).
Any feedback is very welcome!
You can:
- strip common parts ("http://", "www", ".com", ".html", ...)
- strip numbers
- strip runs of consecutive special characters (non-letters)
- define abbreviations for common long parts (foursquare -> 4sq)
- check the remaining pieces against a database of how common they are. Keep the uncommon ones and drop the common ones until the result is short enough.
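The first four steps could be sketched roughly as follows in JavaScript. This is only a minimal sketch under several assumptions: the function name `minifyUrl`, the abbreviation table, the list of stripped extensions, and the 30-character limit are all made up for illustration. It keeps the top-level domain and only the last remaining path segment, collapses runs of special characters into a single "-" rather than dropping them, and does not attempt the database lookup.

```javascript
// Hypothetical abbreviation table; extend it with whatever long names you encounter.
const ABBREVIATIONS = new Map([["foursquare", "4sq"]]);

// Assumed helper name and default limit; neither comes from an existing library.
function minifyUrl(rawUrl, maxPathLength = 30) {
  const url = new URL(rawUrl);

  // Drop the scheme and a leading "www." but keep the rest of the host.
  const host = url.hostname.replace(/^www\./, "");

  const segments = url.pathname
    .split("/")
    // Strip common file extensions.
    .map((s) => s.replace(/\.(html?|php|aspx?)$/i, ""))
    // Drop empty segments, purely numeric segments, and the generic "index".
    .filter((s) => s !== "" && !/^\d+$/.test(s) && s !== "index")
    // Collapse runs of two or more special characters into one "-".
    .map((s) => s.replace(/[^A-Za-z0-9]{2,}/g, "-"))
    // Abbreviate known long words inside a segment.
    .map((s) =>
      s
        .split("-")
        .map((word) => ABBREVIATIONS.get(word) || word)
        .join("-")
    );

  // Assumption: the last remaining segment is the most specific one.
  let path = segments.length > 0 ? segments[segments.length - 1] : "";
  if (path.length > maxPathLength) {
    path = path.slice(0, maxPathLength - 3) + "...";
  }

  return path ? host + " | " + path : host;
}

console.log(minifyUrl("https://stackoverflow.com/questions/ask"));
console.log(minifyUrl("http://www.cnn.com/2011/US/03/04/obama.miami.school/index.html"));
```

With these rules the first example comes out as "stackoverflow.com | ask" and the second as "cnn.com | obama.miami.school". Keeping more than one path segment, or dropping the top-level domain, is a matter of taste and of how much room you have.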
I would be careful not to remove too much information or to create too many abbreviations.
You don't want
yourbank.com/login
yourbank.hackersite/login.php
both to look like:
yourbank | login
Otherwise you make it very easy for malicious people to abuse your system.
And even when you don't omit the top-level domain, users can easily be confused, which malicious attackers might abuse. Perhaps highlighting the most important parts of the URL would be an improvement.
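One way to approach the highlighting idea is to split the URL into the part of the host that actually identifies the site and everything else, and then emphasize only the former. The sketch below is a rough illustration, not a robust implementation: the function names are hypothetical, and it naively assumes that the last two host labels form the registrable domain, which is wrong for suffixes such as "co.uk" (a real implementation would need something like the Public Suffix List).

```javascript
// Hypothetical helper: splits a URL into subdomain prefix, domain, and the rest.
// Assumption: the registrable domain is simply the last two labels of the host.
function splitForDisplay(rawUrl) {
  const url = new URL(rawUrl);
  const labels = url.hostname.split(".");
  const domain = labels.slice(-2).join(".");
  const prefixLabels = labels.slice(0, -2);
  const prefix = prefixLabels.length > 0 ? prefixLabels.join(".") + "." : "";
  const rest = url.pathname + url.search;
  return { prefix, domain, rest };
}

// Plain-text rendering that marks the important part with brackets;
// in a page you would wrap `domain` in a styled element instead.
function renderHighlighted(rawUrl) {
  const parts = splitForDisplay(rawUrl);
  return parts.prefix + "[" + parts.domain + "]" + parts.rest;
}

console.log(renderHighlighted("https://yourbank.com/login"));
console.log(renderHighlighted("http://login.yourbank.com.hackersite.net/login.php"));
```

The second call renders as "login.yourbank.com.[hackersite.net]/login.php", so the misleading "yourbank.com" prefix is visibly outside the emphasized domain.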