Scrapy is using HTTP 1.0 by default
It looks like Scrapy is using HTTP 1.0 by default. Is there a setting to make it use HTTP 1.1 to send requests?
Thanks.
From http://dev.scrapy.org/wiki/ScrapyRecipes:
How to spoof requests to be HTTP 1.1 compliant
You can do this by overriding the Scrapy HTTP client factory with the following (undocumented) setting:
DOWNLOADER_HTTPCLIENTFACTORY = 'myproject.downloader.HTTPClientFactory'
Here's a possible implementation of myproject.downloader module:
from scrapy.core.downloader.webclient import ScrapyHTTPClientFactory, ScrapyHTTPPageGetter

class PageGetter(ScrapyHTTPPageGetter):
    def sendCommand(self, command, path):
        # Write the request line with HTTP/1.1 instead of the default HTTP/1.0
        self.transport.write('%s %s HTTP/1.1\r\n' % (command, path))

class HTTPClientFactory(ScrapyHTTPClientFactory):
    # Use the page getter above for every connection this factory creates
    protocol = PageGetter
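To see what the override changes on the wire, here is a minimal stand-alone sketch that needs neither Scrapy nor Twisted. The helper name request_line and its version parameter are hypothetical and only for illustration; it uses the same format string as sendCommand above.

```python
def request_line(command, path, version="1.1"):
    """Build an HTTP request line with the same format string as sendCommand.

    Hypothetical helper for illustration only; it is not part of Scrapy.
    """
    return "%s %s HTTP/%s\r\n" % (command, path, version)


if __name__ == "__main__":
    # Request line as written by the overridden sendCommand
    print(repr(request_line("GET", "/index.html")))
    # Request line with the HTTP/1.0 version string, for comparison
    print(repr(request_line("GET", "/index.html", version="1.0")))
```

Only the version token in the request line differs. This is why the recipe calls it spoofing: the override changes what the client announces, not how the rest of the client behaves.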