Websocket Server with twisted and Python doing complex jobs in the background
I want to write a server that handles WebSocket clients while running MySQL selects via SQLAlchemy and scraping several websites at the same time (Scrapy). The received data has to be calculated, saved to the db and then sent to the WebSocket clients.
My question is how this can be done in Python from a logical point of view. How should I structure the code, and which modules are best suited to the job? At the moment I'm leaning towards Twisted with threads in which the scraping and select work runs. But can this be done in an easier way? I can only find simple Twisted examples, and this is obviously a more complex job. Are there similar examples? How do I start?
Cyclone, a Twisted-based 'network toolkit', based on/similar to facebook/friendfeed's Tornado server, contains support for WebSockets: https://github.com/fiorix/cyclone/blob/master/cyclone/web.py#L908
Here's example code:
- https://github.com/fiorix/cyclone/blob/master/demos/websocket/websocket.tac
Here's an example of using txwebsocket:
- http://www.saltycrane.com/blog/2010/05/quick-notes-trying-twisted-websocket-branch-example/
You may have a problem using SQLAlchemy with Twisted; from what I have read, they do not work well together (source). Are you married to SQLA, or would another, more compatible OR/M suffice?
Some twisted-friendly OR/Ms include Storm (a fork) and Twistar, and you can always fall back on Twisted's core db abstraction library twisted.enterprise.adbapi. There are also async-friendly db libraries for other products, such as txMySQL, txMongo, and txRedis, and paisley (couchdb).
You could conceivably use both Cyclone (or txwebsockets) and Scrapy as child services of the same MultiService, running on different ports, but packaged within the same Application instance. The services may communicate, either through the parent service or some RPC mechanism (like JSONRPC, Perspective Broker, AMP, XML-RPC (2) etc), or you can just write to the db from the scrapy service and read from it using websockets. Redis would be great for this IMO.
Ideally you'll want to avoid writing your own WebSockets server, but since you're running Twisted, you might not be able to do that: there are several WebSockets implementations (see this search on PyPI). Unfortunately none of them are Twisted-based [Edit see @JP-Calderone's comment below.]
Twisted should drive the master server, so you probably want to begin by writing something that can be run via twistd (see here if you're new to this). The WebSocket implementation mentioned by @JP-Calderone and Scrapy are both Twisted-based, so they should be reasonably trivial to drive from your master Twisted-based server. SQLAlchemy will be more difficult; I've commented on this before in this question.