
Why is 'http://ddd' a valid URL?

I'm writing a .NET 3.5 app and using Uri.IsWellFormedUriString(string uriString, UriKind uriKind) with UriKind.Absolute to verify user-entered URIs. I was just playing with the application, and I'm a bit worried and confused as to why something like:

http://ddd

is a valid URI? What gives? I know it's because it's part of the RFC, but why is it valid in the first place?

The only time I've ever seen URIs like that is on internal corporate intranets, like

http://companyinet

or

http://localhost (which is very popular, but also a special case)

I do not want to have to use my own regular expression, as there are so many varying URI regexes. However, I do not really want users entering URIs like that, which aren't publicly accessible.

Any idea or thoughts? Thanks.


http://ddd is valid because it does point to a unique resource. In this case, it points to the web server (hopefully) on the computer 'ddd' on the local network.

URI stands for Uniform Resource Identifier, not "world wide web resource identifier". file:///blah.txt is also a valid URI.


That's because it IS a perfectly valid URI, as you mention.

I'd alter your strategy slightly... If you want URIs that are not only valid (as in well-formed), but also valid, in the sense that they actually point to a site, you'll have to add one more step.

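After the string validation, issue a HEAD request to ping the URL. If it returns a 2xx status code, you're probably good to go. This will work in most situations, but is not without caveats and exceptions (some servers reject HEAD, and a machine on your own network can reach intranet hosts that the public cannot).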
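For illustration, here is a rough sketch of that HEAD check. It is written in Python rather than the asker's .NET, and the function name `responds_with_success` and the 5-second timeout are my own assumptions, not anything from the question. Treat it as a starting point, not a complete reachability test.

```python
import http.client
import urllib.error
import urllib.request


def responds_with_success(url, timeout=5.0):
    """Return True if a HEAD request to url ends in a 2xx status.

    Hypothetical helper: the name and default timeout are illustrative.
    """
    try:
        # Building the Request raises ValueError for strings with no scheme.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # 4xx/5xx (HTTPError), DNS failures, refused connections, timeouts,
        # and malformed input all count as "not reachable" here.
        return False
```

Note that this runs from wherever your application runs, so a name like 'ddd' that resolves on your own network would still pass. It tells you the server answered you, not that it is publicly accessible.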


Because it conforms to RFC 1738 (as well as the URI specification of RFC 2396).

The RFC makes specific allowances for resource paths that consist only of a scheme and a scheme-specific element, in this case a hostname. As long as it identifies a unique resource and conforms to the syntax of URIs, it is valid.


You answered the question yourself. It's a "valid" (well-formed) URI by the RFC spec's definition ipso facto.

To help solve your actual task, do some additional checks in your regex for one or more dots in the host name (don't forget to escape them!), or possibly try to hit the resource itself to see if it actually responds.
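As a rough sketch of the "require a dot" idea, here is a version that parses the host out of the URI instead of using a regex. It is in Python rather than .NET, and `has_dotted_host` is a made-up name for illustration. The same test could be applied to the host portion of a parsed URI in any language.

```python
from urllib.parse import urlsplit


def has_dotted_host(url):
    """Return True if the URL's host has at least two non-empty labels.

    Hypothetical helper: rejects single-label hosts such as 'ddd' or
    'localhost', which are well-formed but usually only meaningful locally.
    """
    try:
        host = urlsplit(url).hostname
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. an unclosed '['.
        return False
    if not host:
        return False
    # Ignore a single trailing dot ('example.com.') before counting labels.
    labels = host.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)
```

This is only a heuristic. It accepts private addresses such as 10.0.0.1 and internal names like intranet.corp, and it rejects IPv6 literals, so it narrows the input rather than proving that a URI is public.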


It is a valid URI because it follows the syntax of URIs: it has a scheme and a scheme-specific part ('http' being the scheme, ':' separating the two, and '//ddd' being the scheme-specific part).

In the case of an HTTP URI, it also follows the syntax for those, with 'ddd' being a valid host name.

The syntax of URIs is defined in http://www.ietf.org/rfc/rfc2396.txt
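To make that breakdown concrete, here is a small Python sketch that splits a URI the same way. The helper name `describe_uri` is purely illustrative, and the split on the first ':' is a simplification of the RFC grammar rather than a full parser.

```python
from urllib.parse import urlsplit


def describe_uri(uri):
    """Split a URI into its scheme, scheme-specific part, and host (if any)."""
    # Everything before the first ':' is the scheme; the rest is scheme-specific.
    scheme, _, scheme_specific = uri.partition(":")
    # For hierarchical schemes like http, the host follows the leading '//'.
    host = urlsplit(uri).hostname
    return {"scheme": scheme, "scheme_specific_part": scheme_specific, "host": host}
```

For 'http://ddd' this gives the scheme 'http', the scheme-specific part '//ddd', and the host 'ddd'. For 'file:///blah.txt' the host comes back as None, since that URI has an empty authority.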


Here is a simple experiment to see why that URL is valid:

0) use the dig or ping utility to get the IP address of google.com. I got: 74.125.53.100

1) Edit your /etc/hosts file (on Windows it is something like C:\Windows\system32\drivers\etc\hosts, and you might need to create it). In your hosts file, add a line like this:

74.125.53.100 ddd

Don't forget to save your edits.

2) In a web browser, go to this URL: http://ddd

3) You just accessed Google using the URL http://ddd. That's why it's a valid URL: what a host name points to depends on the resolver of the machine that looks it up.
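If you want to see what your own machine's resolver says about a name (hosts file included), a minimal Python sketch along these lines should do. `resolve_host` is a hypothetical helper name, and restricting the lookup to IPv4 is just an assumption to keep the output simple.

```python
import socket


def resolve_host(name):
    """Return the sorted IPv4 addresses the local resolver gives for name, or []."""
    try:
        infos = socket.getaddrinfo(
            name, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except (socket.gaierror, UnicodeError):
        # The name does not resolve, or is not a legal host name at all.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_host("ddd")` before and after the hosts-file edit above shows the difference: the same string goes from unresolvable to pointing at whatever address you put in the file.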
