
Crawling and parsing JavaScript elements

I'm trying to get information from a website that uses JavaScript to show the phone number of the items/companies on click.

Crawling it with PHP curl or XPath didn't lead me to a solution for triggering these events and then continuing the crawl.

Example:

<a onclick="show(2423,'../entries.php?eid=2423',1);

For reference, here is the function too:

function show(info_id,qpath,swimage){
    expandit(info_id,0,swimage);
    if(document.getElementById('load_'+info_id)) {
        ajax_loadContent('cont_td_'+info_id,qpath);
    }
}

Is this possible to do with PHP/XPath/DOM, or what would you recommend to achieve this? Is there any way to "debug" the code to see which URL to call?

Thanks for your concern, and have really great festivities!


It seems like all it's doing is making an AJAX call to this page: ../entries.php?eid=2423.

Try going to that URL directly and you'll probably get your phone number without any HTML/JavaScript parsing.
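The idea above can be sketched as follows. This is a Python sketch (the same request logic ports directly to PHP curl); the base URL is an assumption, since ../entries.php is relative and must be resolved against the page the link came from.

```python
# Sketch: hit the AJAX endpoint directly instead of executing the JavaScript.
# "../entries.php?eid=..." is relative, so resolve it against the URL of the
# listing page that contained the <a> tag. The host used here is a placeholder.
from urllib.parse import urljoin
from urllib.request import urlopen


def entry_url(page_url, eid):
    """Build the absolute URL that the show() handler loads for an entry id."""
    return urljoin(page_url, "../entries.php?eid=%d" % eid)


# Resolving against a hypothetical listing page:
url = entry_url("http://example.com/dir/list.php", 2423)
print(url)  # http://example.com/entries.php?eid=2423

# Uncomment to actually fetch the HTML fragment (requires network access):
# html = urlopen(url).read().decode("utf-8", errors="replace")
```

Whatever entries.php returns is the same fragment ajax_loadContent() injects into the page, so the phone number should be in that response.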


You could use the Net tab of Firebug to keep an eye on which URLs are loaded, or use Fiddler. Once you work out the pattern, you should be able to craft and call the same URLs yourself using curl.
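Once the pattern is known, you don't even need to watch the network traffic per entry: the relative URL is right there in each onclick attribute. A Python sketch (the HTML snippet and base URL are assumptions for illustration):

```python
# Sketch: pull every AJAX URL out of the listing page's onclick handlers,
# then resolve them to absolute URLs you can fetch one by one with curl
# or any HTTP client. The sample HTML mirrors the snippet in the question.
import re
from urllib.parse import urljoin

listing_html = """
<a onclick="show(2423,'../entries.php?eid=2423',1);">Company A</a>
<a onclick="show(2424,'../entries.php?eid=2424',1);">Company B</a>
"""

# Capture the relative URL passed as show()'s second argument.
paths = re.findall(r"show\(\d+,'([^']+)'", listing_html)
urls = [urljoin("http://example.com/dir/list.php", p) for p in paths]
print(urls)
```

Each resulting URL can then be fetched directly, with a short delay between requests to stay polite to the server.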

Or you could use one of the browser automation frameworks, like WebAii, Selenium, Watir, or WatiN, and crawl the links that way.


Try using Selenium RC to simulate clicking on the link, then scan the page for results: http://seleniumhq.org/projects/remote-control/
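For completeness, here is a Python sketch of the browser-automation route. It uses Selenium WebDriver rather than the original Selenium RC client (RC is long deprecated); the CSS locator is an assumption based on the onclick markup in the question, and the import is deferred so the sketch stays loadable without Selenium installed.

```python
# Sketch: drive a real browser so the onclick JavaScript runs for you.
# Assumes Selenium WebDriver plus a local Firefox/geckodriver install.
def fetch_entry_html(listing_url, eid):
    """Open the listing page, click the entry's link, return the page HTML."""
    from selenium import webdriver  # deferred: only needed when actually run
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    try:
        driver.get(listing_url)
        # Find the <a> whose onclick mentions this entry id, and click it.
        link = driver.find_element(
            By.CSS_SELECTOR, "a[onclick*='eid=%d']" % eid)
        link.click()
        driver.implicitly_wait(5)  # give ajax_loadContent() time to finish
        return driver.page_source
    finally:
        driver.quit()
```

This is heavier than calling entries.php directly, but it works even if the site later changes how the AJAX URL is built, since the page's own JavaScript does the work.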

