Search for copies of data from all over the internet
I need your help and advice from a developer's point of view: how do people run sites like copyscape.com? Basically, they search for copies of data across the whole internet. I want to know how they search and catalog every website on the internet, the same way Google builds its index of sites.

Please guide me: how do they search data from all over the internet? How is it possible to keep track of each and every website? How does Google know there is a new site on the internet, and how do its crawlers learn that a new website has launched? In short, I want to know how I can develop a site that searches for copies of data across the whole internet without depending on any third-party API. Please advise me; I hope you will help.
Thanks.
Google's crawlers don't know when a new site is launched. Usually developers must submit their sites to Google or get incoming links from sites that are already indexed.
And nobody has a copy of the entire Internet. There are websites that nothing links to, so they never get visited by any crawler; this part of the web is called the deep web and is generally inaccessible to crawlers.
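If you want to see the basic mechanics of discovery, here is a minimal sketch of a link-following crawler: it starts from a seed URL (a placeholder here) and can only find new pages through links on pages it has already fetched, which is exactly why unlinked sites stay invisible. It assumes the `requests` and `beautifulsoup4` packages and skips everything a real crawler needs (robots.txt, politeness delays, deduplication by content, etc.):

```python
# Minimal breadth-first crawler sketch. The seed URL is a placeholder;
# swap in real starting points. Not production code.
import urllib.parse
from collections import deque

import requests
from bs4 import BeautifulSoup

def crawl(seed, max_pages=50):
    seen = {seed}          # URLs we have already queued
    queue = deque([seed])  # URLs waiting to be fetched
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        try:
            resp = requests.get(url, timeout=10)
        except requests.RequestException:
            continue  # unreachable page; skip it
        fetched += 1
        soup = BeautifulSoup(resp.text, "html.parser")
        for a in soup.find_all("a", href=True):
            # Resolve relative links against the current page.
            link = urllib.parse.urljoin(url, a["href"])
            link = link.split("#")[0]  # drop fragments
            if link.startswith("http") and link not in seen:
                seen.add(link)
                queue.append(link)
        yield url, resp.text  # hand the page off to an indexer

if __name__ == "__main__":
    for page_url, _html in crawl("https://example.com"):
        print(page_url)
```

Notice that the crawler never "learns" about a site that no fetched page links to; that is the whole discovery model.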
How do they do it exactly? I don't know. Maybe they index popular sites where text is likely to be copied, like Blogger, ezinearticles, etc., and if they don't find the text on those sites, they simply say it's original. That's just a theory, and I am probably wrong.
Me? I would probably use Google. Take a good chunk of text from the website you are checking and search for it, then filter out the results that come from the original website. And voilà: the remaining results are the websites that contain that exact phrase, which is presumably copied.
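As a rough sketch of that idea, here is what the check could look like using Google's Custom Search JSON API (so it does depend on a third-party API; the API key and search-engine ID are placeholders you would create in the Google developer console, and this is just one possible approach, not Copyscape's actual method):

```python
# Sketch: search for an exact phrase, drop hits from the original site.
# API_KEY and ENGINE_ID are placeholders, not real credentials.
import urllib.parse

import requests

API_KEY = "YOUR_API_KEY"      # placeholder
ENGINE_ID = "YOUR_ENGINE_ID"  # placeholder (the "cx" parameter)

def find_copies(phrase, original_domain):
    params = {
        "key": API_KEY,
        "cx": ENGINE_ID,
        "q": f'"{phrase}"',  # quote the phrase for an exact match
    }
    resp = requests.get("https://www.googleapis.com/customsearch/v1",
                        params=params, timeout=10)
    resp.raise_for_status()
    copies = []
    for item in resp.json().get("items", []):
        host = urllib.parse.urlparse(item["link"]).netloc
        # Skip results on the site the text originally came from.
        if original_domain not in host:
            copies.append(item["link"])
    return copies

if __name__ == "__main__":
    for url in find_copies("a good chunk of distinctive text here",
                           "original-site.example"):
        print(url)
```

The key design point is picking a phrase long and distinctive enough that an exact-match hit almost certainly means copying, not coincidence.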