PHPCrawl: exclude URLs ending with ?query=
I'm playing with PHPCrawl and I'd like to know whether it is possible to exclude from crawling all URLs with parameters (whether they are .html or .php), like
domain.com/article.html?showComment=1289420017718
Add a non-follow match pattern for any URL containing a question mark:
$crawler->addNonFollowMatch(".*\?.*");
I just found myself that this works better:
$crawler->addNonFollowMatch("/\?/");
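A likely reason the second form behaves better is that PHP's PCRE functions expect a delimited pattern such as "/\?/", whereas ".*\?.*" has no proper delimiters. The sketch below checks the delimited pattern with plain preg_match, independent of PHPCrawl. It assumes PHPCrawl hands the expression to a PCRE function unchanged. The helper shouldSkip and the sample URLs are hypothetical and only serve the illustration.

```php
<?php
// Hypothetical sample URLs, used only for illustration.
$urls = [
    'http://domain.com/article.html',
    'http://domain.com/article.html?showComment=1289420017718',
    'http://domain.com/index.php?page=2',
];

// Delimited PCRE pattern: matches any URL containing a literal question mark.
$pattern = '/\?/';

// Hypothetical helper (not part of PHPCrawl): true if the URL should not be followed.
function shouldSkip($url, $pattern)
{
    return preg_match($pattern, $url) === 1;
}

// Keep only the URLs that do not match the non-follow pattern.
$followed = array_values(array_filter($urls, function ($url) use ($pattern) {
    return !shouldSkip($url, $pattern);
}));

print_r($followed);
```

Only the URL without a query string survives the filter, which is the behaviour wanted from the non-follow rule.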