What other web page meta information can I use to classify a page's relevance to a theme? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.

Closed 9 years ago.

I'm writing an algorithm to classify the relevance of a page to some theme, such as 'movies', using as much meta information as possible but excluding the textual content of the body.

I want to know what I can use to determine whether a page has information about the theme.

At the moment, I give a weight of 40% to the title, 30% to the part of the link after the domain, 20% to the domain and 10% to the meta keywords, but I think I could use more signals to be more precise. I match a list of words, each with its own weighting, against these fields to calculate the relevance of the page.
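To make the weighting above concrete, here is a minimal sketch of how such a score could be computed. The field weights come from the question; the theme terms, their weights, and the function names (`tokenize`, `field_score`, `relevance`) are assumptions made up for illustration, and taking the strongest matching term per field is only one of several reasonable ways to combine matches.

```python
import re
from urllib.parse import urlparse

# Field weights as stated in the question.
FIELD_WEIGHTS = {"title": 0.40, "path": 0.30, "domain": 0.20, "keywords": 0.10}

# Hypothetical theme vocabulary with per-term weights (an assumption).
THEME_TERMS = {"movie": 1.0, "movies": 1.0, "film": 0.8, "cinema": 0.6}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def field_score(text):
    """Return the strongest theme-term weight found in the text (0.0 if none)."""
    return max((THEME_TERMS.get(tok, 0.0) for tok in tokenize(text)), default=0.0)


def relevance(url, title, meta_keywords):
    """Combine the per-field scores into one value between 0.0 and 1.0."""
    parsed = urlparse(url)
    fields = {
        "title": title,
        "path": parsed.path,        # the part of the link after the domain
        "domain": parsed.netloc,
        "keywords": meta_keywords,
    }
    return sum(FIELD_WEIGHTS[name] * field_score(text) for name, text in fields.items())
```

Calling `relevance("http://example.com/movies/top", "Best movies of the year", "film, cinema")` scores the title, path and keywords and leaves the domain at zero. Adding another signal then only means adding a field and rebalancing the weights so they still sum to 1.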

Any ideas on what else I can use to calculate the relevance? I only want to exclude the text content inside the HTML itself; the HTML structure can be used.


I think you should look at the main menu links and, where they exist, the submenu links; in short, the links. You should also take the metadata into account. I'm still not sure what you are trying to achieve, though.

From what I understood, you are trying to build a "relevancy" formula for a web page.
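As a rough illustration of the suggestion about links and metadata, the sketch below collects meta tags and link targets with Python's standard `html.parser` while ignoring all text nodes. The class name `SignalCollector` is hypothetical, and treating links inside a `<nav>` element as menu links is an assumption; many pages mark up their menus differently, so the detection rule would need adapting.

```python
from html.parser import HTMLParser


class SignalCollector(HTMLParser):
    """Collect meta tags and link targets; text nodes are never read."""

    def __init__(self):
        super().__init__()
        self.meta = {}          # meta name/property -> content
        self.nav_links = []     # hrefs found inside <nav> (assumed to be menus)
        self.other_links = []   # hrefs found anywhere else
        self._nav_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "nav":
            self._nav_depth += 1
        elif tag == "meta":
            key = attrs.get("name") or attrs.get("property")
            if key and attrs.get("content"):
                self.meta[key.lower()] = attrs["content"]
        elif tag == "a" and attrs.get("href"):
            target = self.nav_links if self._nav_depth else self.other_links
            target.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "nav" and self._nav_depth:
            self._nav_depth -= 1


collector = SignalCollector()
collector.feed(
    '<head><meta name="description" content="Film reviews"></head>'
    '<body><nav><a href="/movies/new">New</a></nav><a href="/about">About</a></body>'
)
```

After the call to `feed`, `collector.meta`, `collector.nav_links` and `collector.other_links` hold the collected values, and each could be matched against the theme words as one more weighted field.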
