
Crawling and parsing JavaScript elements

I'm trying to get information from a website that uses JavaScript to show the phone number of the items/companies on click.

Crawling it with PHP curl or XPath didn't lead me to a solution for triggering these events and then continuing the crawl.

Example:

<a onclick="show(2423,'../entries.php?eid=2423',1);

For reference, here is the function too:

function show(info_id,qpath,swimage){
    expandit(info_id,0,swimage);
    if(document.getElementById('load_'+info_id)) {
        ajax_loadContent('cont_td_'+info_id,qpath);
    }
}

Is this possible to do with PHP/XPath/DOM, or what would you recommend to achieve this? Is there any way to "debug" the code to see which URL to call?

Thanks for your concern, and have really great festivities!


It seems like all it's doing is making an AJAX call to this page: ../entries.php?eid=2423.

Try going to that URL directly and you'll probably get your phone number without any HTML/JavaScript parsing.
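The idea above can be sketched as follows. This is a Python sketch (the same request logic ports directly to PHP curl); the base URL is an assumption, since ../entries.php is relative and must be resolved against the page the link came from.

```python
# Sketch: hit the AJAX endpoint directly instead of executing the JavaScript.
# "../entries.php?eid=..." is relative, so resolve it against the URL of the
# listing page that contained the <a> tag. The host used here is a placeholder.
from urllib.parse import urljoin
from urllib.request import urlopen


def entry_url(page_url, eid):
    """Build the absolute URL that the show() handler loads for an entry id."""
    return urljoin(page_url, "../entries.php?eid=%d" % eid)


# Resolving against a hypothetical listing page:
url = entry_url("http://example.com/dir/list.php", 2423)
print(url)  # http://example.com/entries.php?eid=2423

# Uncomment to actually fetch the HTML fragment (requires network access):
# html = urlopen(url).read().decode("utf-8", errors="replace")
```

Whatever entries.php returns is the same fragment ajax_loadContent() injects into the page, so the phone number should be in that response.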


You could use the Net tab of Firebug to keep an eye on which URLs are loaded, or use Fiddler. Once you work out the pattern, you should be able to craft and call the same URLs yourself using curl.
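Once the pattern is known, you don't even need to watch the network traffic per entry: the relative URL is right there in each onclick attribute. A Python sketch (the HTML snippet and base URL are assumptions for illustration):

```python
# Sketch: pull every AJAX URL out of the listing page's onclick handlers,
# then resolve them to absolute URLs you can fetch one by one with curl
# or any HTTP client. The sample HTML mirrors the snippet in the question.
import re
from urllib.parse import urljoin

listing_html = """
<a onclick="show(2423,'../entries.php?eid=2423',1);">Company A</a>
<a onclick="show(2424,'../entries.php?eid=2424',1);">Company B</a>
"""

# Capture the relative URL passed as show()'s second argument.
paths = re.findall(r"show\(\d+,'([^']+)'", listing_html)
urls = [urljoin("http://example.com/dir/list.php", p) for p in paths]
print(urls)
```

Each resulting URL can then be fetched directly, with a short delay between requests to stay polite to the server.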

Or you could use one of the browser automation frameworks, like WebAii, Selenium, Watir, or WatiN, and crawl the links that way.


Try using Selenium RC to simulate clicking on the link, then scan the page for results: http://seleniumhq.org/projects/remote-control/
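For completeness, here is a Python sketch of the browser-automation route. It uses Selenium WebDriver rather than the original Selenium RC client (RC is long deprecated); the CSS locator is an assumption based on the onclick markup in the question, and the import is deferred so the sketch stays loadable without Selenium installed.

```python
# Sketch: drive a real browser so the onclick JavaScript runs for you.
# Assumes Selenium WebDriver plus a local Firefox/geckodriver install.
def fetch_entry_html(listing_url, eid):
    """Open the listing page, click the entry's link, return the page HTML."""
    from selenium import webdriver  # deferred: only needed when actually run
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    try:
        driver.get(listing_url)
        # Find the <a> whose onclick mentions this entry id, and click it.
        link = driver.find_element(
            By.CSS_SELECTOR, "a[onclick*='eid=%d']" % eid)
        link.click()
        driver.implicitly_wait(5)  # give ajax_loadContent() time to finish
        return driver.page_source
    finally:
        driver.quit()
```

This is heavier than calling entries.php directly, but it works even if the site later changes how the AJAX URL is built, since the page's own JavaScript does the work.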

