Reading JSON Content

I'm using jsoup to scrape some HTML data and it's working out great. Now I need to pull some JSON content (only JSON, not HTML). Can I do this easily with jsoup, or do I have to use another method? The parsing that jsoup performs encodes the JSON data, so it no longer parses properly with Gson.


Great as it is, Jsoup is an HTML parser, not a JSON parser, so it is of no use for parsing here. If you attempt it anyway, Jsoup will implicitly wrap the returned JSON in <html><head> and so on. You don't want that. Gson is a JSON parser, so that is the tool you need.

Your concrete problem is likely that you don't know how to feed a URL returning JSON to Gson. In that case, use URL#openStream() to get an InputStream of it, decorate it with an InputStreamReader to obtain a Reader, and pass that to Gson#fromJson(), which accepts a Reader.

// Data is your own class whose fields mirror the JSON structure.
try (InputStream input = new URL("http://example.com/foo.json").openStream();
     Reader reader = new InputStreamReader(input, StandardCharsets.UTF_8)) {
    Data data = new Gson().fromJson(reader, Data.class);
    // ...
}
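
When Gson rejects the input, it can help to look at the raw text first. Here is a minimal sketch of the same InputStream-to-Reader step using only the JDK. The names JsonTextFetcher, readAll and fetch are made up for illustration, and Reader#transferTo assumes Java 10 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of jsoup or Gson.
final class JsonTextFetcher {

    // Decodes the stream as UTF-8 and returns its full content as text.
    static String readAll(InputStream input) throws IOException {
        // Closing the Reader also closes the underlying InputStream.
        try (Reader reader = new InputStreamReader(input, StandardCharsets.UTF_8)) {
            StringWriter out = new StringWriter();
            reader.transferTo(out); // copies every character (Java 10+)
            return out.toString();
        }
    }

    // Opens the URL and returns the response body unchanged (no HTML wrapping).
    static String fetch(String url) throws IOException {
        return readAll(URI.create(url).toURL().openStream());
    }
}
```

The returned String can be printed for inspection and then handed to Gson#fromJson(String, Class) in the same way as the Reader above.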


Jsoup is not designed for parsing JSON. Use Gson (or any other Java JSON library) for that. To fetch remote content with Jsoup, use this:

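// Jsoup.connect is the public entry point; HttpConnection is an internal helper class
Connection con = Jsoup.connect(url);
// ignoreContentType(true) stops jsoup from rejecting non-HTML content types such as application/json
con.method(Connection.Method.POST).data(data.params).ignoreContentType(true);
Connection.Response resp = con.execute();
// body() returns the raw response text without parsing it as HTML
String body = resp.body();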
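
If jsoup is only being used here as an HTTP client, the JDK's built-in java.net.http.HttpClient (Java 11+) can probably do the same job without any HTML handling. This is a rough sketch that assumes a plain GET is enough (the snippet above uses POST). The class name RawJsonClient and its get method are hypothetical names for this example.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of jsoup or Gson.
final class RawJsonClient {

    // Sends a GET request and returns the response body as an unmodified String.
    static String get(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Accept", "application/json")
                .GET()
                .build();
        // ofString() decodes using the charset from Content-Type, falling back to UTF-8.
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        // This sketch treats anything other than 200 as a failure.
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

The result is the JSON text exactly as the server sent it, so it can be passed straight to Gson.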


Jsoup does not parse JSON, but it can be used to fetch JSON data easily.

package com.zetcode;

import com.google.gson.Gson;
import java.io.IOException;
import org.jsoup.Jsoup;

class TimeData {

    private String time;
    private Long milliseconds_since_epoch;
    private String date;

    @Override
    public String toString() {
        return "TimeData{" + "time=" + time + ", milliseconds_since_epoch=" 
                + milliseconds_since_epoch + ", date=" + date + '}';
    }
}


public class GsonReadWebPage {

    public static void main(String[] args) throws IOException {

        String webPage = "http://time.jsontest.com";

        String data = Jsoup.connect(webPage).ignoreContentType(true).execute().body();

        Gson gson = new Gson();
        TimeData td = gson.fromJson(data, TimeData.class);

        System.out.println(td);
    }
}

The example reads JSON data from http://time.jsontest.com with Jsoup and parses it with Gson. To run this example, you need the Jsoup and Gson dependencies.


This is an old question, but I struggled a bit to figure this out. Jsoup can fetch the JSON data if you set ignoreContentType to true. However, when the response is parsed into a Document, it still wraps the JSON content in HTML tags like this:

<html>
 <head></head>
 <body>
{ JSON DATA }
 </body>
</html>

To get rid of the wrapper, we can simply take the text of the body, as given below.

Connection connection = Jsoup.connect("URL").ignoreContentType(true);

// get() performs the request and parses the response, so a separate execute() call is not needed
String strJsonData = connection.get().body().text();


I don't know about jsoup, but if it's valid JSON, then Gson should be able to decode it (you may need some custom deserializers for your custom classes).

If it's not valid JSON and you are getting errors, then there is a bug in jsoup.


I have seen many answers where people write pages' worth of code. I have no idea why, because you can do this easily with Gson.

import com.google.gson.JsonObject;
import com.google.gson.JsonParser;

/**
 * Converts a JSON string to a JsonObject.
 */
private JsonObject getResAsJson(String response) {
    // JsonParser.parseString requires Gson 2.8.6+; on older versions use new JsonParser().parse(response)
    return JsonParser.parseString(response).getAsJsonObject();
}