
Trying to get Scrapy into a project to run Crawl command

I'm new to Python and Scrapy, and I'm working through the Scrapy tutorial. I was able to create my project from the Windows command prompt by typing:

scrapy startproject dmoz

The tutorial later refers to the crawl command:

scrapy crawl dmoz.org

But each time I try to run it, I get a message that it is not a valid command. Looking around further, it seems I need to be inside a project, and that's the part I can't figure out. I've tried changing directories into the "dmoz" folder created by startproject, but from there Scrapy isn't recognized at all.

I'm sure I'm missing something obvious and I'm hoping someone can point it out.


You have to execute it in your 'startproject' folder. Scrapy offers additional commands when it finds your scrapy.cfg file. You can see the difference here:

$ scrapy startproject bar
$ cd bar/
$ ls
bar  scrapy.cfg
$ scrapy
Scrapy 0.12.0.2536 - project: bar

Usage:
  scrapy <command> [options] [args]

Available commands:
  crawl         Start crawling from a spider or URL
  deploy        Deploy project in Scrapyd target
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  list          List available spiders
  parse         Parse URL (using its spider) and print the results
  queue         Deprecated command. See Scrapyd documentation.
  runserver     Deprecated command. Use 'server' command instead
  runspider     Run a self-contained spider (without creating a project)
  server        Start Scrapyd server for this project
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

Use "scrapy <command> -h" to see more info about a command


$ cd ..
$ scrapy
Scrapy 0.12.0.2536 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  fetch         Fetch a URL using the Scrapy downloader
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

Use "scrapy <command> -h" to see more info about a command
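The difference between the two listings above comes down to scrapy.cfg: the project-only commands such as crawl appear when Scrapy finds that file. A minimal sketch of the check, using a hypothetical /tmp directory as a stand-in project (on Windows cmd, `dir scrapy.cfg` in the project folder serves the same purpose):

```shell
# Stand-in project root (hypothetical path): scrapy.cfg is the marker
# file that enables the project-specific command set.
mkdir -p /tmp/bar_demo
cd /tmp/bar_demo
touch scrapy.cfg

if [ -f scrapy.cfg ]; then
  echo "project root: crawl is available here"
else
  echo "no scrapy.cfg: cd to the startproject folder first"
fi
```

In other words, run `scrapy crawl` from the directory that contains scrapy.cfg (the outer folder startproject created), not from the inner package folder of the same name.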


The PATH environment variable isn't set.

You can add both Python and Scrapy to the PATH environment variable by opening System Properties (My Computer > Properties > Advanced System Settings), going to the Advanced tab, and clicking the Environment Variables button. In the new window, find the Path variable under System Variables and append the following entries, separated by semicolons:

C:\{path to python folder}
C:\{path to python folder}\Scripts

For example:

C:\Python27;C:\Python27\Scripts
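The effect of that PATH change can be sketched in a shell session. This is a POSIX-shell simulation with a made-up /tmp directory standing in for C:\Python27\Scripts; on Windows cmd, the single-session equivalent is `set PATH=%PATH%;C:\Python27;C:\Python27\Scripts`.

```shell
# Put a scripts directory on PATH (hypothetical /tmp path standing in
# for C:\Python27\Scripts) so the shell can find the command by name.
mkdir -p /tmp/py_scripts
printf '#!/bin/sh\necho "Scrapy found"\n' > /tmp/py_scripts/scrapy
chmod +x /tmp/py_scripts/scrapy

# Before this line, typing "scrapy" fails; after it, the name resolves.
PATH="$PATH:/tmp/py_scripts"
scrapy
```

Once the real Scripts directory is on PATH, `scrapy` should work from any directory, including inside your project folder.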

