Verify random URLs on a network in Java
This question may be a bit too low-level, but I couldn't find an answer already.
I'm typing this next paragraph so that you can correct me or explain the things I'm referring to unwittingly. You know how in a web browser you can type directory paths from your own computer, and it will bring them up? Apparently, it also works with pages within a local network: if there's another page on the same subnet, you can access it with "http://pagename/".
On the network I'm a part of, there are a lot of these pages, and they all (or mostly) have common, single-word names, such as "http://word/". I want to test, using Java, a dictionary of common words to see which exist as locations on the network. Of course, there's probably an easier way if I know the range of IP addresses on the network, which I do. However, I get the "page not found" page if I type the IP address of, say, "http://word/" (which I get from ping) into the address bar. This is true even when "http://word/" itself works.
So say I loop through my word bank. How can I test whether a URL is real? I've worked out how to load my word bank. Here's what I have right now:
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.Scanner;

URL article = new URL("http://word"); // sample URL
URLConnection myConn = article.openConnection();
Scanner myScan = new Scanner(new InputStreamReader(myConn.getInputStream()));
System.out.println(myScan.hasNext()); // diagnostic output
This works when the URL is constructed with a valid URL. When it gets passed a bad URL, the program just skips the System.out.println, not even printing a new line. I know that different browsers show different "page not found" screens, and that these have their own HTML source code. Maybe that's related to my problem?
How can I test whether a URL is real using this method? Is there a way to test it with IP addresses, given my problem? Or, why am I having a problem typing in the IP address but not the URL?
You should check the HTTP response code. If the URL is "real" (in your terms), the response code should be 200; otherwise I believe you will get a different response code.

Do it using HttpURLConnection.getResponseCode().
HttpURLConnection is a subclass of URLConnection. When you are connecting over HTTP, that is actually what you get back from openConnection(), so you can cast:

URL article = new URL("http://word"); // sample URL
HttpURLConnection myConn = (HttpURLConnection) article.openConnection();
If you are testing only HTTP URLs, you can cast the URLConnection to an HttpURLConnection and check the HTTP response code for 200 (HTTP_OK):
URL article=new URL("http://word"); //sample URL
HttpURLConnection myConn= (HttpURLConnection)article.openConnection();
if (myConn.getResponseCode() == HttpURLConnection.HTTP_OK) {
// Site exists and has valid content
}
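Putting this together with the word-bank loop from the question, a minimal sketch might look like the following. Note that isReachable is a helper name I made up, the word list is a placeholder for your dictionary, and the timeout values are arbitrary choices so a dead host doesn't hang the loop:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.UnknownHostException;

public class UrlChecker {
    // Returns true only when the name resolves and the server answers 200 OK.
    static boolean isReachable(String urlString) {
        try {
            URL url = new URL(urlString);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setConnectTimeout(2000); // don't hang on hosts that never answer
            conn.setReadTimeout(2000);
            conn.setRequestMethod("HEAD"); // headers only; we don't need the body
            return conn.getResponseCode() == HttpURLConnection.HTTP_OK;
        } catch (MalformedURLException | UnknownHostException e) {
            return false; // not a URL at all, or the name doesn't resolve
        } catch (IOException e) {
            return false; // connection refused, timeout, etc.
        }
    }

    public static void main(String[] args) {
        String[] words = {"word", "printer", "wiki"}; // placeholder dictionary
        for (String word : words) {
            if (isReachable("http://" + word + "/")) {
                System.out.println(word + " exists on the network");
            }
        }
    }
}
```

Using HEAD instead of GET avoids downloading each page's body just to learn whether it exists; servers that don't support HEAD will typically answer 405, which this sketch treats as "not reachable", so switch back to GET if that matters on your network.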
Additionally, if you want to test IP addresses you can simply use one as the URL: http://10.0.0.1
I think I've figured it out.
This code wouldn't compile unless I caught IOException (because of URL, URLConnection, and Scanner), so I had wrapped everything in try { /*code*/ } catch (IOException oops) {} and did nothing with the exception. I didn't think the try/catch was important enough to put in my question. UnknownHostException and MalformedURLException both extend IOException, so when the URL was bad, Scanner.hasNext() or HttpURLConnection.getResponseCode() was throwing one of them; my empty catch swallowed it and execution left the try block. Thus, I never got a response code when I had a bad URL. So I need to write:
try
{
    URL article = new URL("http://word");
    HttpURLConnection myConn = (HttpURLConnection) article.openConnection();
    myConn.getResponseCode(); // actually connects; throws UnknownHostException for a bad name
    // code to store "http://word" as a working URL
}
catch (UnknownHostException ex) { /* code if "http://word" is not a working URL */ }
catch (IOException oops) { oops.printStackTrace(); }
Thanks for everyone's help, I learned a lot. If you have a different or better answer, or if you can explain why using the IP addresses didn't work, I'm still wondering about that.
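On the IP question: the most likely explanation is name-based virtual hosting. An HTTP/1.1 request carries a Host header, and a server handling several sites on one IP uses that header to pick which site to serve; when you type the bare IP, the header contains the IP, no site matches, and you get the server's default "page not found" page. One way to test this is to connect to the IP but send the hostname in the Host header yourself. Below is a rough sketch using a raw socket (10.0.0.5 and "word" are placeholders for the IP ping reports and the page name; buildRequest and fetchStatusLine are helper names I made up):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.Socket;

public class HostHeaderDemo {
    // Build a minimal HTTP/1.1 request with an explicit Host header.
    static String buildRequest(String hostName) {
        return "GET / HTTP/1.1\r\n"
             + "Host: " + hostName + "\r\n"
             + "Connection: close\r\n\r\n";
    }

    // Connect to a raw IP but present the hostname the server expects,
    // and return the status line of the response (e.g. "HTTP/1.1 200 OK").
    static String fetchStatusLine(String ip, String hostName) throws IOException {
        try (Socket socket = new Socket(ip, 80)) {
            socket.getOutputStream().write(buildRequest(hostName).getBytes("US-ASCII"));
            BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), "US-ASCII"));
            return in.readLine();
        }
    }

    public static void main(String[] args) {
        try {
            System.out.println(fetchStatusLine("10.0.0.5", "word"));
        } catch (IOException e) {
            System.out.println("Request failed: " + e.getMessage());
        }
    }
}
```

If the IP with a correct Host header returns 200 while the bare IP returns 404, virtual hosting is confirmed. (A raw socket is used here because Java's built-in HttpURLConnection treats Host as a restricted header and silently ignores attempts to set it.)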