How to parse content with <pre>?

Posted 2023-04-07 09:19

I am using jsoup to parse a number of things.

I am trying to parse this tag:

<pre>Hello World</pre>

But I just can't get it to work.

How would I parse this using jsoup?

    // Requires org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.select.Elements
    Document jsDoc = Jsoup.connect(url).get();
    Elements titleElements = jsDoc.getElementsByTag("pre");

This is what I have so far.

Works fine for me with the latest Jsoup:

String html = "<p>lorem ipsum</p><pre>Hello World</pre><p>dolor sit amet</p>";
Document document = Jsoup.parse(html);
Elements pres = document.select("pre");

for (Element pre : pres) {
    System.out.println(pre.text());
}

Result:

Hello World

If you get nothing, then the HTML you're parsing simply doesn't contain any <pre> element. Check it yourself by printing what Jsoup actually parsed:

System.out.println(document.html());
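The check above can be folded into a small helper. This is only a sketch: the class name `PreCheck` and the method `preTexts` are made up for illustration, and it parses an HTML string rather than fetching a URL so that it runs offline.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

public class PreCheck {
    // Hypothetical helper: returns the text of every <pre> element in the given HTML.
    static List<String> preTexts(String html) {
        Document document = Jsoup.parse(html);
        Elements pres = document.select("pre");
        if (pres.isEmpty()) {
            // Nothing matched: dump what Jsoup actually parsed so it can be inspected.
            System.out.println(document.html());
        }
        List<String> texts = new ArrayList<>();
        for (Element pre : pres) {
            texts.add(pre.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(preTexts("<p>lorem ipsum</p><pre>Hello World</pre>"));
        System.out.println(preTexts("<p>no pre element here</p>"));
    }
}
```

An empty list together with the dumped HTML means the element really is missing from what Jsoup received.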

Perhaps the URL is wrong. Perhaps some JavaScript alters the HTML DOM by adding new elements (Jsoup neither interprets nor executes JS). Perhaps the site expects a real browser instead of a bot (change the user agent then). Perhaps the site requires a login (you'd need to maintain cookies). Who knows. You can figure this all out with a real web browser like Firefox or Chrome.
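For the user agent and cookie cases, a rough sketch might look like the following. The class name `BrowserLikeFetch`, the user agent string, the cookie name `SESSIONID`, and the URL are all placeholders chosen for illustration. The real values depend on the site, and you can read them from the browser's developer tools.

```java
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

import java.io.IOException;

public class BrowserLikeFetch {
    // Builds (but does not send) a request that looks more like a browser.
    // The user agent string and cookie name are placeholders, not values any site requires.
    static Connection browserLike(String url, String sessionCookie) {
        return Jsoup.connect(url)
                .userAgent("Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/115.0")
                .cookie("SESSIONID", sessionCookie)
                .timeout(10_000); // milliseconds
    }

    public static void main(String[] args) throws IOException {
        // Sends the request; needs network access and a real URL.
        Document document = browserLike("https://example.com/", "placeholder").get();
        System.out.println(document.select("pre").text());
    }
}
```

If the `<pre>` elements are added by JavaScript, none of this helps, because the HTML Jsoup receives never contains them.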
