How to retrieve the title and summary of a web page programmatically?

2022-12-14 18:42 Q&A author:

Like what Digg does: when you submit a news item, the title and summary are retrieved automatically. How is this done?

Retrieve the HTML and parse it.

The title comes from the <title> tag. The summary can come from either:

The first couple of hundred characters of visible text from inside the <body> tag.
The description <meta> tag.
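As a rough sketch of the approach above, the following uses only the Python standard library. The names `fetch_html`, `TitleSummaryParser`, `extract_title_and_summary`, the 200-character default, and the User-Agent string are all made up for illustration, and preferring the description <meta> tag over the body text is an assumption, not a rule. It also assumes the page has an explicit <body> tag.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch_html(url, timeout=10):
    """Download a page and decode it (falls back to UTF-8 if no charset is sent)."""
    request = Request(url, headers={"User-Agent": "summary-sketch/0.1"})  # hypothetical UA
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class TitleSummaryParser(HTMLParser):
    """Collects the <title>, the description <meta> tag, and visible <body> text."""

    SKIPPED = {"script", "style", "noscript"}  # their content is not visible text

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title_parts = []
        self.description = None
        self.body_parts = []
        self._in_title = False
        self._in_body = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title" and not self._in_body:
            self._in_title = True
        elif tag == "body":
            self._in_body = True
        elif tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            if self.description is None:  # keep the first description found
                self.description = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "body":
            self._in_body = False
        elif tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        elif self._in_body and not self._skip_depth:
            self.body_parts.append(data)


def extract_title_and_summary(html, max_chars=200):
    """Return (title, summary); the summary is the meta description if present,
    otherwise the first `max_chars` characters of visible body text."""
    parser = TitleSummaryParser()
    parser.feed(html)
    parser.close()
    title = " ".join("".join(parser.title_parts).split())
    body_text = " ".join(" ".join(parser.body_parts).split())
    summary = (parser.description or "").strip() or body_text[:max_chars]
    return title, summary
```

Usage would be something like `extract_title_and_summary(fetch_html(url))`. Real-world HTML is messy, so a dedicated HTML parsing library may cope better than this sketch.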

If the site provides an RSS feed (which you'll find in the <link rel="alternate" type="application/rss+xml"> tag) use the fielded information from that instead.
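One possible way to do the RSS lookup, again with only the Python standard library. `FeedLinkParser`, `find_rss_feeds`, and `feed_entry_for` are hypothetical names, and matching a feed item to the page by comparing its <link> to the page URL exactly is an assumption; feeds often use slightly different URLs, so real code would likely need a looser comparison.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urljoin


class FeedLinkParser(HTMLParser):
    """Collects href values of <link rel="alternate" type="application/rss+xml">."""

    def __init__(self):
        super().__init__()
        self.feed_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        link_type = (attrs.get("type") or "").lower()
        if "alternate" in rel and link_type == "application/rss+xml" and attrs.get("href"):
            self.feed_hrefs.append(attrs["href"])


def find_rss_feeds(html, page_url):
    """Return absolute URLs of the RSS feeds advertised by the page."""
    parser = FeedLinkParser()
    parser.feed(html)
    parser.close()
    return [urljoin(page_url, href) for href in parser.feed_hrefs]


def feed_entry_for(rss_xml, page_url):
    """Return (title, description) of the RSS 2.0 item whose <link> equals page_url,
    or None if the feed has no such item."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        if (item.findtext("link") or "").strip() == page_url:
            title = (item.findtext("title") or "").strip()
            description = (item.findtext("description") or "").strip()
            return title, description
    return None
```

If `feed_entry_for` returns None, falling back to the <title> and <meta> tags of the page itself is the obvious next step.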

There is no one right answer to this question. There are probably other strategies possible. But this should get you started.

The title is easy: it is just the <title> tag of the HTML. The summary is a bit harder. If you are retrieving the page as part of a search or in some other context, try to generate the summary based on the position of the search term, or on something else relevant to the context in which you are showing it. For example, if you are showing the page because I clicked an "AI" tag, show me a part of the page that is about AI.
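A context-based summary like this could be sketched as follows, assuming you already have the page's visible text as a plain string. The function name `snippet_around`, the 200-character default, and the "..." markers are illustrative choices only.

```python
import re


def snippet_around(text, term, width=200):
    """Return about `width` characters of `text` centred on the first
    case-insensitive match of `term`, or the start of the text if there is no match."""
    match = re.search(re.escape(term), text, re.IGNORECASE)
    if match is None:
        return text[:width]  # no match: fall back to the opening of the page
    margin = max(0, width - len(term)) // 2  # context to show before the match
    start = max(0, match.start() - margin)
    end = min(len(text), start + width)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end] + suffix
```

This cuts at arbitrary character positions; trimming to word or sentence boundaries would read better but is left out to keep the sketch short.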

In the case of Digg, the title and description can be edited by the poster before the submission is pushed out to everyone. But if the page has a description meta tag, its content pre-populates the field. They use the following meta tag: <meta name="description" content="blah blah blah"/>
