How to scrape the same URL in a loop with Scrapy
The content I need is located on a single page with a static URL.
I created a spider that scrapes this page and stores the items in a CSV file. But it does so only once and then finishes the crawling process. I need to repeat the operation continuously. How can I do this?
Scrapy 0.12
Python 2.5
Giving you a specific example is tough because I don't know what spider you're using or how it works internally, but something like this could work:
from scrapy.http import Request
from scrapy.spider import BaseSpider

class YourSpider(BaseSpider):
    # ...spider init details...

    def parse(self, response):
        # ...process item...
        yield item
        # Schedule the same URL again so the spider keeps running
        yield Request(response.url, callback=self.parse)
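You are missing dont_filter=True. Without it, Scrapy's duplicate filter drops repeated requests to a URL it has already seen. Here is an example:
import scrapy

class MySpider(scrapy.Spider):
    name = 'myspider'  # any unique spider name
    start_urls = ('http://www.test.com',)

    def parse(self, response):
        # Do your processing here
        # dont_filter=True lets the same URL pass the duplicate filter
        yield scrapy.Request(response.url, callback=self.parse, dont_filter=True)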
I do it this way:
def start_requests(self):
    while True:
        yield scrapy.Request(url, callback=self.parse, dont_filter=True)
I have tried the approach below, but there is a problem: when the Internet connection is unstable, a request fails, the loop breaks, and the spider stops.
from scrapy.http import Request
from scrapy.spider import BaseSpider

class YourSpider(BaseSpider):
    # ...spider init details...

    def parse(self, response):
        # ...process item...
        yield item
        yield Request(response.url, callback=self.parse)