Problem reading the <TITLE> tag from a web page in Java

2023-03-06 08:21

I am using the JTidy parser to parse a web page. It works, sort of:

InputStream in=new URL("http://www.medicinenet.com/alopecia_areata/article.htm").openStream();
Document doc= new Tidy().parseDOM(in, null);
String titleText=doc.getElementsByTagName("title").item(0).getFirstChild().getNodeValue();

It works fine for <title>...</title>, but the page at the URL I passed writes its title tag in capital letters, <TITLE>...</TITLE>, so it returns null.

How can I read both <TITLE>...</TITLE> and <title>...</title> in one statement using Java code? Please help me.

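Check whether the lowercase lookup found anything, then fall back to uppercase. Note that item(0) returns null when nothing matches, so test the list length rather than the resulting string (NodeList is org.w3c.dom.NodeList):

// item(0) returns null when nothing matches, so test the list length before dereferencing.
NodeList titles = doc.getElementsByTagName("title");
if (titles.getLength() == 0) titles = doc.getElementsByTagName("TITLE");
String titleText = (titles.getLength() > 0 && titles.item(0).getFirstChild() != null) ? titles.item(0).getFirstChild().getNodeValue() : null;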

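getElementsByTagName can be case sensitive (the DOM specification requires it for XML, and for other markup it depends on the implementation), so this is the simplest option.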
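If mixed-case spellings such as <Title> are also possible, one alternative is to walk every element and compare names without regard to case. The following is only a sketch under a few assumptions: TitleFinder, findTitle, and parse are hypothetical names made up for this example, the demo builds its DOM with the JDK's own XML parser rather than JTidy, and it relies only on standard org.w3c.dom calls, so it has not been checked against JTidy's DOM specifically.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class TitleFinder {

    /** Returns the text of the first element named "title" in any letter case, or null. */
    public static String findTitle(Document doc) {
        // "*" matches every element, so the case comparison is done here instead.
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Node node = all.item(i);
            if ("title".equalsIgnoreCase(node.getNodeName())) {
                // An empty <title></title> has no child node, so guard against null.
                Node text = node.getFirstChild();
                return text == null ? null : text.getNodeValue();
            }
        }
        return null; // no title element in the document
    }

    /** Hypothetical demo helper: builds a DOM from a well-formed XML string. */
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(findTitle(parse("<HTML><HEAD><TITLE>Upper</TITLE></HEAD></HTML>")));
        System.out.println(findTitle(parse("<html><head><title>lower</title></head></html>")));
    }
}
```

With the JTidy code from the question, the call would be findTitle(doc) on the Document returned by parseDOM. Scanning all elements is slower than a direct tag lookup, but for a single page the difference should rarely matter.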
