
How do search engines find websites on the internet?

I'm going to write a web crawler (an application that crawls the web from one site to another).

How can I find a list of available domains/IPs on the internet (as complete as possible)?

How do search engines find websites? What do they use as a reliable list of registered IPs/domains for a starting point?

Thanks


As Michael P's comment indicates, it depends on what your objective is.

My company recently wanted to answer a question about third-party tools used on leading websites. I used Alexa as a starting point to find the top websites (by traffic), and wrote a parser that could answer the specific question my company asked. If you start from such a list, you can program your web crawler to follow the links it encounters, which gradually broadens your knowledge of the sites on the web.
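To make the "start from a list and follow links" idea concrete, here is a minimal sketch in Python using only the standard library. It is not the parser I wrote for my company; the names `extract_links`, `fetch_page`, and `crawl` are made up for illustration, and the sketch assumes you supply your own seed URLs. It also leaves out things a real crawler needs, such as robots.txt handling, rate limiting, and retries.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) URLs linked from the given HTML."""
    parser = LinkExtractor()
    parser.feed(html)
    absolute = (urljoin(base_url, href) for href in parser.links)
    return [url for url in absolute if urlparse(url).scheme in ("http", "https")]


def fetch_page(url):
    """Download a page as text (hypothetical default fetcher, no retries)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(seeds, fetch=fetch_page, max_pages=100):
    """Breadth-first crawl from the seed URLs.

    Returns the set of hostnames whose pages were fetched successfully.
    """
    queue = deque(seeds)
    visited = set()
    hosts = set()
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to download
        hosts.add(urlparse(url).hostname)
        for link in extract_links(html, url):
            if link not in visited:
                queue.append(link)
    return hosts
```

You would call it with something like `crawl(["https://example.com/"], max_pages=50)`, where the seed list comes from whatever top-sites ranking you choose. The `fetch` parameter is there so the download step can be swapped out, for example with a cached or throttled fetcher.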

Hopefully that helps you think about the problem.
