Crawling a website with dynamic pages

I need to crawl websites and extract some information from dynamically created pages after a form submission.

The information I need to crawl would mostly come from databases on these sites.

Added:

Crawlers usually work by jumping from one hyperlink to another, so they mostly reach static pages. What about crawling pages that are not statically present but are created on the fly?


From a crawler's point of view there's no big difference. You're still getting generated HTML.
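As a rough illustration of that point, here is a minimal sketch using only Python's standard library. It assumes the form is submitted as a plain POST of URL-encoded fields and that the server answers with ordinary HTML. The helper names `build_form_request` and `fetch_form_result`, the URL, and the field names are all hypothetical; take the real ones from the form's `action` attribute and its input names.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_form_request(action_url, fields):
    """Build the POST request a browser would send for a simple HTML form."""
    body = urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    # Passing data makes urllib issue a POST instead of a GET.
    return Request(action_url, data=body, headers=headers)


def fetch_form_result(action_url, fields, timeout=10):
    """Submit the form and return the generated page as text."""
    request = build_form_request(action_url, fields)
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Hypothetical endpoint and field name, for illustration only.
    html = fetch_form_result("http://example.com/search", {"q": "books"})
    print(len(html))
```

The returned text is the same kind of HTML a static page would give you, so the rest of the crawler can parse it the same way.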

The only thing you need to be careful about is links that lead to an infinite number of pages, e.g. a dynamically generated calendar with links to the next/previous month or year.
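One common way to guard against such endless link chains is to cap both the crawl depth and the total number of pages, and to remember which URLs have already been queued. The sketch below is one possible approach, not the only one. The `crawl` function, the `fetch` callable (anything that takes a URL and returns HTML text), and the default limits are assumptions made for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_depth=3, max_pages=100):
    """Breadth-first crawl that stops at max_depth and max_pages."""
    seen = {start_url}               # URLs already queued
    queue = deque([(start_url, 0)])  # (url, depth) pairs
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        html = fetch(url)
        visited.append(url)
        if depth >= max_depth:
            continue                 # do not follow links any deeper
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments.
            link, _fragment = urldefrag(urljoin(url, href))
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

With a calendar whose every page links to the next month, the set of seen URLs alone never stops the crawl, because each month has a new URL. The depth and page limits are what bound it.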
