Weird problem accessing web page with Java

I am trying to write a program that reads the HTML source code of the website http://judgephilosophies.wikispaces.com. I wrote some simple Java code that reads and outputs the source code, but it just prints out "null." Here's the bizarre thing, though: if I replace "http://judgephilosophies.wikispaces.com" in the code with any other website, it works just fine. The program only seems to fail for websites in the wikispaces.com domain, and I am utterly befuddled as to why. The code is below. Help is much appreciated.

import java.io.*;
import java.net.*;

public class AccessWebExample 
{
    public static void main (String[] args) throws Exception
    {
        //Create reader to access html source code
        URL url = new URL ("http://judgephilosophies.wikispaces.com/");
        InputStreamReader isr = new InputStreamReader (url.openStream());
        BufferedReader reader = new BufferedReader (isr);

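        //Read and print the text
        //(note: readLine() is called twice per pass, once to print and once in
        //the loop test, so a null from the first call is printed as "null")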
        do
        { 
            System.out.println(reader.readLine());
        }
        while(reader.readLine() != null);
    }
}


Do an HTTP trace using Wireshark or somesuch and compare. It's probably a matter of cookies or headers, if the bare URLConnection is acting differently than a browser.


Using wget from the command line you'll find:

broach@broach-laptop:~$ wget http://judgephilosophies.wikispaces.com/
--2011-04-23 14:50:31--  http://judgephilosophies.wikispaces.com/
Resolving judgephilosophies.wikispaces.com... 208.43.192.33, 75.126.104.177
Connecting to judgephilosophies.wikispaces.com|208.43.192.33|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://session.wikispaces.com/1/auth/auth?authToken=e8ad55c0e2701a0e7da89807255609da [following]

It redirects (a couple more times, actually). Your bare URLConnection doesn't handle that. The 302 response puts the status code and the new location in the headers and has no page body to read, so readLine() returns null right away and your program prints "null".

You really should look at using HttpURLConnection, as it can handle redirects for you. Doing it with a plain URL would require you to look at the returned headers and act on the HTTP response codes yourself (which is what HttpURLConnection does).
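
For what it's worth, here is a rough sketch of following the redirects by hand with HttpURLConnection. The class name RedirectFollower, the helper isRedirect, the method fetch, and the hop limit are all made up for illustration. I'm following each hop manually because, as far as I know, HttpURLConnection's automatic redirect handling does not cross from http to https, which is what the Location header above asks for. I'm also assuming the session redirect may need cookies, hence the CookieManager line; I have not verified what this particular site requires.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class RedirectFollower
{
    // Hypothetical helper: true for the status codes that point to a new Location.
    public static boolean isRedirect (int code)
    {
        return code == 301 || code == 302 || code == 303 || code == 307 || code == 308;
    }

    // Returns the body at startUrl, following at most maxHops redirects by hand.
    public static String fetch (String startUrl, int maxHops) throws IOException
    {
        URL url = new URL (startUrl);
        for (int hop = 0; hop <= maxHops; hop++)
        {
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setInstanceFollowRedirects (false); // handle every hop ourselves
            int code = conn.getResponseCode();
            if (isRedirect (code))
            {
                String location = conn.getHeaderField ("Location");
                conn.disconnect();
                if (location == null)
                {
                    throw new IOException ("Redirect without Location header from " + url);
                }
                url = new URL (url, location); // also resolves relative Location values
                continue;
            }

            // Not a redirect: read the body (getInputStream throws on error statuses).
            StringBuilder body = new StringBuilder();
            try (BufferedReader reader = new BufferedReader (
                    new InputStreamReader (conn.getInputStream(), StandardCharsets.UTF_8)))
            {
                String line;
                while ((line = reader.readLine()) != null) // one readLine() per pass
                {
                    body.append (line).append ('\n');
                }
            }
            return body.toString();
        }
        throw new IOException ("Too many redirects, last URL: " + url);
    }

    public static void main (String[] args) throws Exception
    {
        // Assumption: the redirect chain may set cookies that later hops expect.
        CookieHandler.setDefault (new CookieManager());
        System.out.print (fetch ("http://judgephilosophies.wikispaces.com/", 10));
    }
}
```

Note the read loop stores each line in a variable before testing it, so no line is skipped and null is never printed. The hop limit is there so a redirect loop fails with an exception instead of spinning forever.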
