I need help making a website crawler using PHP [closed]
I really want to make a website crawler that goes to a website, scans it for links, puts the links in a database, and moves on to another website. I found one website, but the code was really buggy. If you have seen anything like this or have written one yourself, please let me know.
You probably won't find anything suitable for PHP, as it is generally used for short-running pages. Many servers, for example, are set to time out at 30 seconds. You can write PHP for command-line scripts, but I suspect that's not what you want.
Anyway, if you want a pre-packaged solution, why care about the language?
I would recommend something like wget to crawl the sites and save them to disk. Then you can iterate over the saved files and directories and pull out the links. The hard bit is crawling the sites (it's not simple); the code to pull out links can be written without too much difficulty.
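To give a rough idea of the "pull out links" half, here is a minimal sketch in Python (not PHP, since the language shouldn't matter much for this part). It assumes the pages have already been saved to a directory, for example by a recursive wget run (`wget -r`), and that a local SQLite file is an acceptable database. The names `extract_links`, `store_links`, and the `links` table are made up for illustration; treat this as a starting point rather than a finished crawler.

```python
import os
import sqlite3
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return the href values found in html_text, in document order."""
    parser = LinkExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.links


def store_links(root_dir, db_path):
    """Walk root_dir, extract links from each .html/.htm file, save them to SQLite.

    Returns the number of rows in the (hypothetical) links table afterwards.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS links ("
        "source_file TEXT, href TEXT, UNIQUE(source_file, href))"
    )
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            if not filename.lower().endswith((".html", ".htm")):
                continue
            path = os.path.join(dirpath, filename)
            # Saved pages may not be valid UTF-8; replace undecodable bytes.
            with open(path, encoding="utf-8", errors="replace") as fh:
                links = extract_links(fh.read())
            for href in links:
                # The UNIQUE constraint plus OR IGNORE skips duplicates.
                conn.execute(
                    "INSERT OR IGNORE INTO links VALUES (?, ?)", (path, href)
                )
    conn.commit()
    total = conn.execute("SELECT COUNT(*) FROM links").fetchone()[0]
    conn.close()
    return total
```

You would call something like `store_links("example.com", "links.db")` on the directory wget created. Note that relative links are stored exactly as written in the page; resolving them against the page's URL is left out here.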
I found one, so if anyone is looking, here is the link: php-crawler