Reading JSON Content
I'm using jsoup to scrape some HTML data and it's working out great. Now I need to pull some JSON content (only JSON, not HTML). Can I do this easily with jsoup or do I have to do it using another method? The parsing that jsoup performs is encoding the JSON data so it's not parsing properly with Gson.
While great, Jsoup is an HTML parser, not a JSON parser, so it is useless in this context. If you ever attempt it, Jsoup will implicitly wrap the returned JSON in <html><head> and so on. You don't want that. Gson is a JSON parser, so you definitely need it.
Your concrete problem is likely that you don't know how to feed a URL returning JSON to Gson. In that case, use URL#openStream() to get an InputStream of it, then use InputStreamReader to decorate it into a Reader, which can finally be fed to Gson#fromJson(), which accepts a Reader.
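// try-with-resources closes the stream and the reader once parsing is done
try (InputStream input = new URL("http://example.com/foo.json").openStream();
     Reader reader = new InputStreamReader(input, StandardCharsets.UTF_8)) {
    Data data = new Gson().fromJson(reader, Data.class);
    // ...
}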
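The decoration chain described above can be tried in isolation, without a network call or Gson on the classpath. The following is a minimal sketch using only the JDK. The class name ReaderDemo and the helper readAll are hypothetical names chosen for illustration, and the in-memory byte array stands in for the stream that URL#openStream() would return.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ReaderDemo {

    // Hypothetical helper: decodes the bytes as UTF-8 and returns the full text.
    static String readAll(InputStream input) {
        StringBuilder sb = new StringBuilder();
        // InputStreamReader decorates the byte stream into a character stream.
        try (Reader reader = new InputStreamReader(input, StandardCharsets.UTF_8)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed sample payload with a non-ASCII character (u-umlaut, written as an escape).
        byte[] bytes = "{\"city\":\"Z\u00fcrich\"}".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(bytes)));
    }
}
```

Passing the charset explicitly matters here: without it, InputStreamReader falls back to the platform default, which may not match the encoding the server used for the JSON.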
Jsoup is not designed for parsing JSON. Use Gson (or any other Java JSON library). To fetch remote content with Jsoup, use this:
// Jsoup.connect is the public entry point and delegates to HttpConnection.connect
Connection con = Jsoup.connect(url);
con.method(Method.POST).data(data.params).ignoreContentType(true);
Response resp = con.execute();
String body = resp.body();
Jsoup does not parse JSON, but it can be used to fetch JSON data easily.
package com.zetcode;

import com.google.gson.Gson;
import java.io.IOException;
import org.jsoup.Jsoup;

class TimeData {

    private String time;
    private Long milliseconds_since_epoch;
    private String date;

    @Override
    public String toString() {
        return "TimeData{" + "time=" + time + ", milliseconds_since_epoch="
                + milliseconds_since_epoch + ", date=" + date + '}';
    }
}

public class GsonReadWebPage {

    public static void main(String[] args) throws IOException {

        String webPage = "http://time.jsontest.com";

        // ignoreContentType(true) lets Jsoup accept a non-HTML response
        String data = Jsoup.connect(webPage).ignoreContentType(true).execute().body();

        Gson gson = new Gson();
        TimeData td = gson.fromJson(data, TimeData.class);

        System.out.println(td);
    }
}
The example reads JSON data from http://time.jsontest.com with Jsoup and parses the JSON with Gson. To execute this example, you need the Jsoup and Gson dependencies.
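If pulling in Jsoup only to download the body feels heavy, the JDK itself can do the fetch. The following is a rough sketch, assuming Java 11 or later for java.net.http.HttpClient. The class name JsonFetchSketch and the helpers buildJsonRequest and fetchBody are hypothetical names, not part of Jsoup or Gson.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class JsonFetchSketch {

    // Hypothetical helper: builds a GET request that asks for a JSON response.
    static HttpRequest buildJsonRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Hypothetical helper: sends the request and returns the response body as a string.
    static String fetchBody(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(buildJsonRequest(url), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // No request is sent unless a URL is passed on the command line.
        if (args.length > 0) {
            System.out.println(fetchBody(args[0]));
        }
    }
}
```

Running it with http://time.jsontest.com as the argument should print the same kind of string that the Jsoup call returns, which can then be handed to Gson#fromJson() in the same way.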
This is an old question, but I struggled a bit to figure this out. Jsoup can fetch the JSON data if you set ignoreContentType to true. However, when the response is parsed into a Document, it still wraps the JSON content in HTML tags like this:
<html>
<head></head>
<body>
{ JSON DATA }
</body>
</html>
In order to remove this, we can simply get the body content as given below.
Connection connection = Jsoup.connect("URL").ignoreContentType(true);
// get() already performs the request, so a separate execute() call is not needed
String strJsonData = connection.get().body().text();
I don't know about jsoup, but if it's valid JSON, then Gson should be able to decode it (you may need some custom deserializers for your own classes).
If it's not valid JSON and you are getting errors, then there is a bug in jsoup.
I have seen many answers where people write pages' worth of code. I have no idea why, because you can do this easily with Gson.
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;

/**
 * Convert a JSON string to a JsonObject.
 */
private JsonObject getResAsJson(String response) {
    // On Gson 2.8.6 and later, the static JsonParser.parseString(response) replaces this deprecated call.
    return new JsonParser().parse(response).getAsJsonObject();
}