Java URI.resolve
I'm trying to resolve two URIs, but it's not as straightforward as I'd like it to be.
URI a = new URI("http://www.foo.com");
URI b = new URI("bar.html");
The trouble is that a.resolve(b).toString() is now "http://www.foo.combar.html". How can I get around that?
Sounds like you probably want to use URL rather than URI (which is more general and needs to deal with a less strict syntax.)
URI a = new URI("http://www.foo.com");
URI b = new URI("bar.html");
URI c = a.resolve(b);
c.toString() -> "http://www.foo.combar.html"
c.getAuthority() -> "www.foo.com"
c.getPath() -> "bar.html"
URI's toString() doesn't behave as you might expect, but given its general nature it may be that it should be forgiven.
Sadly URI's toURL() method doesn't behave quite as I would have hoped to give you what you want.
URL u = c.toURL();
u.toString() -> "http://www.foo.combar.html"
u.getAuthority() -> "www.foo.combar.html" --- Oh dear :(
So best just to start straight out with a URL to get what you want:
URL x = new URL("http://www.foo.com");
URL y = new URL(x, "bar.html");
y.toString() -> "http://www.foo.com/bar.html"
The base URI should also contain the final separator ('/') to resolve the way you want:
URI a = new URI("http://www.foo.com/");
URI.resolve behaves as if you were on an HTML page such as http://example.org/path/to/menu.html and clicked a link with href="page1.html": it cuts off the last segment (here menu.html) and puts page1.html in its place.
(http://example.org/path/to/menu.html, page1.html) → http://example.org/path/to/page1.html
This also works if the object you call resolve on is a directory, denoted by a trailing slash:
(http://example.org/path/to/, page1.html) → http://example.org/path/to/page1.html
If it does not end in a slash, the outcome is not what you might expect:
(http://example.org/path/to, page1.html) → http://example.org/path/page1.html (missing "to")
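The three cases above can be reproduced with a small sketch. The class name LastSegmentDemo and the resolve helper are made up for illustration; only java.net.URI from the standard library is used.

```java
import java.net.URI;

class LastSegmentDemo {
    // Illustrative helper: resolves a relative reference against a base URI string.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        String[] bases = {
            "http://example.org/path/to/menu.html", // file: last segment is replaced
            "http://example.org/path/to/",          // directory: reference is appended
            "http://example.org/path/to"            // no slash: "to" is treated like a file
        };
        for (String base : bases) {
            System.out.println(base + " + page1.html -> " + resolve(base, "page1.html"));
        }
    }
}
```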
If you know that the base URI you are resolving against is a directory, but you don't know whether you will get it with or without a trailing slash, this might help you:
static URI asDirectory(URI uri) {
    String uriString = uri.toString();
    return !uriString.endsWith("/") ? URI.create(uriString.concat("/")) : uri;
}
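A possible way to use the helper, as a rough sketch: the wrapping class AsDirectoryDemo is only there to make the example runnable, and the example URIs are the ones from this answer.

```java
import java.net.URI;

class AsDirectoryDemo {
    // Same helper as above: appends "/" unless the URI string already ends with one.
    static URI asDirectory(URI uri) {
        String uriString = uri.toString();
        return !uriString.endsWith("/") ? URI.create(uriString.concat("/")) : uri;
    }

    public static void main(String[] args) {
        URI withSlash = URI.create("http://example.org/path/to/");
        URI withoutSlash = URI.create("http://example.org/path/to");
        // Both spellings of the directory now resolve to the same result.
        System.out.println(asDirectory(withSlash).resolve("page1.html"));
        System.out.println(asDirectory(withoutSlash).resolve("page1.html"));
    }
}
```

Note that the helper works on the string form, so it assumes a plain directory URI. If the URI carried a query or fragment, the slash would be appended after it rather than to the path.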
OK, it appears from the URL definition scheme://domain:port/path?query_string#fragment_id
that there should be 3 slashes before the path (two after the scheme and one directly before the path).
Two situations can occur:
- there are 3 slashes in your URI => everything is OK
- there are fewer than 3 slashes in your URI => you need to add a slash at the end of the URI
Here is my snippet of code:
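// Needs java.net.URL and java.net.MalformedURLException.
String url = "http://www.foo.com";
String endSlash = "";
int indexOfSlash = 0;
// Look for the three slashes: two after the scheme, one before the path.
for (int i = 0; i < 3; i++) {
    int nextIndex = url.indexOf('/', indexOfSlash);
    if (!(nextIndex > 0)) {
        if (i > 1) {
            // Only the third slash is missing: append it.
            endSlash = "/";
        } else {
            throw new MalformedURLException("Bad given url format, missing :// after scheme");
        }
    } else {
        // Continue searching just after the slash that was found.
        indexOfSlash = ++nextIndex;
    }
}
URL rightUrl = new URL(url + endSlash);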
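A shorter variant of the same idea, offered only as a sketch and not as part of the answer above: instead of counting slashes, let java.net.URI parse the string and check whether the path is empty. The names EmptyPathFix and withRootPath are hypothetical.

```java
import java.net.URI;

class EmptyPathFix {
    // Hypothetical helper: gives a URI with an authority but an empty path the root path "/".
    // Assumption: the URI has no query or fragment worth keeping, since resolve("/") drops them.
    static URI withRootPath(URI uri) {
        String path = uri.getPath();
        if (uri.getAuthority() != null && path != null && path.isEmpty()) {
            return uri.resolve("/");
        }
        return uri;
    }

    public static void main(String[] args) {
        URI fixed = withRootPath(URI.create("http://www.foo.com"));
        System.out.println(fixed);
        System.out.println(fixed.resolve("bar.html"));
    }
}
```

Unlike the slash-counting loop, this leaves URIs that already have a path untouched and does not throw for inputs without "://"; whether that is acceptable depends on how strict the validation needs to be.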
Example with the different possibilities:
URI uri = new URI( "http://www.example.org");
System.out.println( "*** "+uri+" ***" );
System.out.println( uri.resolve( "bar") );
System.out.println( uri.resolve( "/bar") );
System.out.println( uri.resolve( "bar/") );
System.out.println( uri.resolve( "/bar/") );
System.out.println();
uri = new URI( "http://www.example.org/");
System.out.println( "*** "+uri+" ***" );
System.out.println( uri.resolve( "bar") );
System.out.println( uri.resolve( "/bar") );
System.out.println( uri.resolve( "bar/") );
System.out.println( uri.resolve( "/bar/") );
System.out.println();
uri = new URI( "http://www.example.org/foo1/foo2");
System.out.println( "*** "+uri+" ***" );
System.out.println( uri.resolve( "bar") );
System.out.println( uri.resolve( "/bar") );
System.out.println( uri.resolve( "bar/") );
System.out.println( uri.resolve( "/bar/") );
System.out.println();
uri = new URI( "http://www.example.org/foo1/foo2/");
System.out.println( "*** "+uri+" ***" );
System.out.println( uri.resolve( "bar") );
System.out.println( uri.resolve( "/bar") );
System.out.println( uri.resolve( "bar/") );
System.out.println( uri.resolve( "/bar/") );
produces as output:
*** http://www.example.org ***
http://www.example.orgbar
http://www.example.org/bar
http://www.example.orgbar/
http://www.example.org/bar/
*** http://www.example.org/ ***
http://www.example.org/bar
http://www.example.org/bar
http://www.example.org/bar/
http://www.example.org/bar/
*** http://www.example.org/foo1/foo2 ***
http://www.example.org/foo1/bar
http://www.example.org/bar
http://www.example.org/foo1/bar/
http://www.example.org/bar/
*** http://www.example.org/foo1/foo2/ ***
http://www.example.org/foo1/foo2/bar
http://www.example.org/bar
http://www.example.org/foo1/foo2/bar/
http://www.example.org/bar/
In conclusion:
- cases 1-4: If the original URI has no path, resolve uses its argument as the new path. That can produce a wrong domain if the argument doesn't start with a slash.
- cases 5-8: If the original URI is a domain plus a slash, resolve uses its argument as the new path. There is no problem with double slashes.
- cases 10, 12, 14 and 16: If the original URI has a path and the argument starts with a slash, resolve replaces the URI's path with its argument. The initial path is discarded entirely.
- cases 9, 11: If the original URI has a path that doesn't end with a slash and the argument doesn't start with a slash, resolve discards the last segment of the original path and appends its argument.
- cases 13, 15: If the original URI has a path that ends with a slash and the argument doesn't start with a slash, resolve appends its argument to that path.