PHP Get Site's Google Ranking WITHOUT Crawling Google [closed]
I want to programmatically retrieve Google search results in order to find where a specific domain ranks in the search results pages.
However, I do NOT want to simply crawl the search result pages: I expect high volume and need to do this frequently, and as I understand it Google treats that as abuse.
Most scripts and classes I've been able to find parse the HTML pages. There has to be a better way.
Is there an API to get Google results? Any ideas?
Thanks!
First, you should understand something: there is no single "ranking". The SERP you see when googling your keywords is not the same SERP other people see when googling the same keywords. There are a sh*tload of "personalization" factors (location, cookies enabled, instant search, time of day, previous searches, web history, datacenter, ...) that affect where something ranks. For some popular keywords the top 3 are fairly static, positions 5 to 10 are in flux, after 10 it gets really fuzzy, and after 20 it's like rolling dice.
And that is just the "crawl the Google SERPs" approach.
It gets worse with the Web Search API (deprecated but working) or the Custom Search API (== crap == d*ckmove by Google).
So whatever you do, you will always get just a near-meaningless snapshot of the Google results.
And no, there is no other official API.
That was the bad news, now the good news: if you are worried about your own domain, just go to "Google Webmaster Tools" and click on "Search Queries". That's the best information you can get (it's still fuzzy, but it shows what you get found for and where you rank on average). Or you can apply a specialized Google Analytics filter to check the rank position of Google-referred traffic.
If you want to analyse your competition, there are a lot of search marketing companies that sell exactly that kind of service (most of them specialize in one market, e.g. in Germany it's Sistrix; there are a sh*tload of such companies in the US).
But as I said before: the data is a meaningless snapshot and most of the time just not actionable.
Google offered a free API until a few months ago, but it is now deprecated.
You can try their new Custom Search API.
Limitation: only 100 free queries per day.
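As a rough illustration of how a rank lookup against the Custom Search API could work, here is a minimal sketch in Python (the same logic ports directly to PHP). It assumes the JSON flavour of the API at `https://www.googleapis.com/customsearch/v1` with its `key`, `cx`, `q`, `start` and `num` parameters, and a response whose `items` entries carry a `link` field. The helper names `build_search_url` and `find_domain_rank` and the sample data are made up for this example, and no network request is made.

```python
from urllib.parse import urlencode, urlparse

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, engine_id, query, start=1):
    """Build a request URL for one page of 10 results (hypothetical helper)."""
    params = {"key": api_key, "cx": engine_id, "q": query, "start": start, "num": 10}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def find_domain_rank(items, domain, start=1):
    """Return the 1-based position of the first result on `domain`, or None.

    `items` is the "items" list of a decoded JSON response and `start` is
    the index of the first result on that page.
    """
    domain = domain.lower()
    for offset, item in enumerate(items):
        host = (urlparse(item["link"]).hostname or "").lower()
        # Match the domain itself or any of its subdomains.
        if host == domain or host.endswith("." + domain):
            return start + offset
    return None


# Hand-written sample standing in for a decoded response's "items" list.
sample_items = [
    {"link": "https://www.example.org/page"},
    {"link": "https://blog.example.com/post"},
    {"link": "https://example.net/"},
]
print(find_domain_rank(sample_items, "example.com"))  # prints 2
```

Each page costs one query, so with the free quota of 100 queries per day, checking the first 100 positions for a keyword (10 pages) leaves room for about 10 keywords per day.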
This can be done by crawling the Google SERPs through different proxies, with a random sleep time between requests, reading and sending cookies for localized results, and a proper set of user agents. I follow this approach: I use a proxy farm of 300 proxies and can crawl any website all day long without getting blocked.
There are a lot of tips you can follow to avoid getting blocked. Avoid accessing pages sequentially (/page/1, /page/2, etc.), and don't request a new page exactly every N seconds. Both of these mistakes can attract attention to your requests, because a real user browses more randomly. So make sure to crawl pages in an unordered manner and add a random offset to the delay between requests.
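The last two tips (unordered page access and a random offset on the delay) can be sketched roughly as follows, in Python. The function name `plan_requests` and the default values of 30 seconds base delay and 15 seconds jitter are assumptions chosen for illustration, not figures from the setup described above. The sketch only plans the order and the waits; it does not fetch anything.

```python
import random


def plan_requests(pages, base_delay=30.0, jitter=15.0, rng=None):
    """Return (page, delay_seconds) pairs in shuffled order.

    Each delay is base_delay plus a random offset in [0, jitter], so
    requests are neither sequential nor evenly spaced.
    """
    rng = rng or random.Random()
    order = list(pages)
    rng.shuffle(order)
    return [(page, base_delay + rng.uniform(0, jitter)) for page in order]


plan = plan_requests(range(1, 6), rng=random.Random(42))
for page, delay in plan:
    # A real crawler would call time.sleep(delay) before fetching `page`.
    print(f"page {page}: wait {delay:.1f}s")
```

Passing a seeded `random.Random` makes a plan reproducible for debugging; leaving `rng` unset gives a different order and different delays on every run.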
I don't like the selected answer.
First of all, it generalizes too much: there IS a SERP rank, and it mostly depends on language and country.
The other factors rarely matter and are very minor (for example, in Google you can favorite your websites, and they will be ranked on top).
I've personally done a lot of testing: when I scrape keywords for foreign countries and ask people there what they see, I get exactly the same results.
Now the central problem:
The Google Custom Search API is not an option for rank tracking. It's useful only for a small amount of data research.
The same goes for Bing, and both are really expensive for larger amounts.
If you want that ranking data, you can either scrape (crawl) the search engines yourself, which is definitely possible (I do it), or use a scraping service that does it for you and delivers raw data to your software (I also use one myself).
As you said you don't want to crawl yourself, take a look at scraping.services.
That's a scraping service mostly designed for developers, if I am not mistaken; you can build a full-featured rank tracking engine that way for high volumes of keywords.
They also have an API module for generating charts and reports (different from Sistrix but the same sort of sauce) if you don't want to do it yourself.
Personally I have not used their keyword tracker yet; I used their Google and Bing scraping API directly and it has worked without issues so far.
If you are interested in scraping search engines on your own I can help you out; it's not difficult (and also possible for large volumes).