
PHP application design

I have to build a scraper that will scrape about 100 URLs, and it must run as a PHP CLI script called by a cron job. I'm totally lost on how to manage this... for each URL I'm thinking of creating a separate file, just to keep things clear when I have to update the code for a specific URL.

Could this be a good option? And is it possible to call all these files from a single cron job?


You would want those 100 URLs to be easy to manage, so store them in a database or a text file. Then simply load all the URLs, loop through them, and call your scrape function.
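The load-and-loop approach above could be sketched like this; `urls.txt` and `scrapeUrl()` are hypothetical names, and the fetch is the simplest possible one:

```php
<?php
// Minimal sketch of the cron-invoked CLI script (run as: php scrape.php).
// Assumes a plain-text file urls.txt (hypothetical name) with one URL per line.

function scrapeUrl(string $url): string|false
{
    // file_get_contents is the simplest fetch; use cURL instead if you
    // need timeouts, custom headers, or redirect control.
    return @file_get_contents($url);
}

$urls = is_file('urls.txt')
    ? file('urls.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)
    : [];

foreach ($urls as $url) {
    $html = scrapeUrl($url);
    if ($html === false) {
        fwrite(STDERR, "Failed to fetch $url\n");
        continue;
    }
    // ... parse $html here, or dispatch to per-URL code ...
}
```

A single cron entry such as `0 * * * * php /path/to/scrape.php` is then enough to drive all 100 URLs.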


What you can do is:

Maintain the list of all 100 URLs in a database, each with an alias name (which could be anything, say 'Google' for http://google.com).

Create a file for each URL using the naming convention 'AliasName.php', and write the code that parses that URL in that file.
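One such per-URL file might look like the following; the file name (Google.php), function name, and the title-tag parsing are all just illustrative assumptions:

```php
<?php
// Hypothetical alias file Google.php: everything specific to scraping
// http://google.com lives here, so updating one site touches one file.

function scrapeGoogle(string $html): array
{
    // Site-specific parsing; as a trivial example, pull out the <title> tag.
    if (preg_match('/<title>(.*?)<\/title>/is', $html, $m)) {
        return ['title' => trim($m[1])];
    }
    return [];
}
```

Giving each alias file its own uniquely named function (scrapeGoogle, scrapeExample, ...) avoids redeclaration errors when the cron script includes several of them in one run.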

Now you can invoke one cron job that retrieves all of your URLs from the database, loops through them, and executes the file with the corresponding alias name.

For example, if your URL is http://google.com and its alias is Google, you create a file named Google.php and write the scraping code there. In the cron job you will have code something like:

$urls = getAllURLs(); // fetch alias/url rows from the database
foreach ($urls as $url) {
    // Each alias maps to a file like Google.php containing its scraping code.
    include_once($url['alias'] . ".php");
}
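getAllURLs() is left undefined in the snippet above; one way it could be implemented is with PDO, assuming a table named `urls` with `alias` and `url` columns (both the table and column names are assumptions):

```php
<?php
// Hypothetical helper: fetch all alias/url pairs from a `urls` table.
// Adjust the table and column names to match your actual schema.

function getAllURLs(PDO $pdo): array
{
    $stmt = $pdo->query('SELECT alias, url FROM urls');
    // Returns rows like ['alias' => 'Google', 'url' => 'http://google.com'].
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

Passing the PDO connection in as a parameter keeps the helper testable; the cron script would create the connection once and reuse it.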

Hope this will help.

Thanks!

Hussain

