Which key:value store to use with Python? [closed]
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 3 years ago.
So I'm looking at various key:value stores (where the value is either strictly a single value or possibly an object) for use with Python, and have found a few promising ones. I have no specific requirements yet because I am in the evaluation phase. I'm looking for what's good, what's bad, what corner cases these things handle well or poorly, etc. I'm sure some of you have already tried them out, so I'd love to hear your findings/problems with the various key:value stores with Python. I'm looking primarily at:
memcached - http://www.danga.com/memcached/ python clients: http://pypi.python.org/pypi/python-memcached/1.40 http://www.tummy.com/Community/software/python-memcached/
CouchDB - http://couchdb.apache.org/ python clients: http://code.google.com/p/couchdb-python/
Tokyo Tyrant - http://1978th.net/tokyotyrant/ python clients: http://code.google.com/p/pytyrant/
Lightcloud - http://opensource.plurk.com/LightCloud/ Based on Tokyo Tyrant, written in Python
Redis - http://redis.io/ python clients: http://pypi.python.org/pypi/txredis/0.1.1
MemcacheDB - http://memcachedb.org/
So I started benchmarking (simply inserting keys and reading them) using a simple count to generate numeric keys and a value of "A short string of text":
memcached: CentOS 5.3/python-2.4.3-24.el5_3.6, libevent 1.4.12-stable, memcached 1.4.2 with default settings, 1 GB memory: 14,000 inserts per second, 16,000 reads per second. No real optimization, nice.
MemcacheDB claims on the order of 17,000 to 23,000 inserts per second and 44,000 to 64,000 reads per second.
I'm also wondering how the others stack up speed-wise.
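For reference, the benchmark described above (numeric keys, a short string value, time the inserts and the reads) can be sketched as a small harness that works against anything exposing memcached-style `set`/`get` calls. The `DictClient` below is an in-memory stand-in so the sketch is runnable on its own; point the same loop at a `python-memcached` `Client` (or any of the other stores' clients, adapted to the two calls) to get comparable numbers. Timings from the stand-in measure only Python dict overhead, not any real store.

```python
import time


def benchmark(client, n=10_000, value="A short string of text"):
    """Insert n numeric keys, then read them all back, timing each phase."""
    start = time.perf_counter()
    for i in range(n):
        client.set(str(i), value)
    write_rate = n / max(time.perf_counter() - start, 1e-9)

    start = time.perf_counter()
    for i in range(n):
        assert client.get(str(i)) == value
    read_rate = n / max(time.perf_counter() - start, 1e-9)
    return write_rate, read_rate


class DictClient(dict):
    """In-memory stand-in with the memcached-style set/get interface."""

    def set(self, key, value):
        self[key] = value

    def get(self, key):
        return dict.get(self, key)


if __name__ == "__main__":
    w, r = benchmark(DictClient())
    print(f"{w:,.0f} inserts/s, {r:,.0f} reads/s")
```

Swapping in a real client is just `benchmark(memcache.Client(["127.0.0.1:11211"]))`, assuming a local memcached is running.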
That mostly depends on your needs.
Read Caveats of Evaluating Databases to understand how to evaluate them.
shelve (stores dictionaries in a file; standard Python module)
ZODB - a persistent object database (stores Python objects natively, no SQL)
More persistence tools: http://wiki.python.org/moin/PersistenceTools
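For the single-process, no-server case, shelve from the list above is often enough. A minimal sketch (standard library only, modern Python; the file path and keys are arbitrary):

```python
import os
import shelve
import tempfile

# A shelf is a persistent, dict-like object backed by a dbm file;
# values can be any picklable Python object.
path = os.path.join(tempfile.mkdtemp(), "kvstore_demo")

with shelve.open(path) as db:
    db["user:1"] = {"name": "alice", "visits": 3}   # value is pickled
    db["user:1"] = {**db["user:1"], "visits": 4}    # read-modify-write

with shelve.open(path) as db:   # data survives reopening
    print(db["user:1"]["visits"])  # prints 4
```

Note that mutations on a fetched value are not written back automatically; you either reassign the key as above or open the shelf with `writeback=True`.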
My 5 cents:
Do you need a distributed system with terabyte-sized data or massive write performance?
Well, then you need one of the big key:value/BigTable/Dynamo-type stores. That would be Cassandra, Tokyo Tyrant, Redis, etc. You need to make sure that the client library supports sharding so you can have multiple databases to write to. Which one to use here can only be decided by you, after testing with data that looks like what you think you'll need.
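If the client library doesn't shard for you, the usual fallback is a thin client-side layer that hashes each key onto one of several backends. A toy sketch of the idea (the backends here are plain dicts standing in for per-server connections; a real deployment would want consistent hashing so adding a node doesn't remap every key):

```python
import hashlib


class ShardedStore:
    """Route each key to one of several backends by hashing the key."""

    def __init__(self, backends):
        self.backends = backends

    def _pick(self, key):
        # A stable hash means a given key always lands on the same backend.
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.backends[h % len(self.backends)]

    def set(self, key, value):
        self._pick(key)[key] = value

    def get(self, key):
        return self._pick(key).get(key)


# Two stand-in "servers".
shards = [dict(), dict()]
store = ShardedStore(shards)
for i in range(100):
    store.set(f"key:{i}", i)

# Every key is readable, and the keys are spread across both backends
# rather than piled onto one.
assert all(store.get(f"key:{i}") == i for i in range(100))
print(len(shards[0]), len(shards[1]))
```

The `dict` backends can be swapped for real client connections, since only `[key] = value` / `.get(key)` are assumed.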
Do you need the data to be accessible from other systems/languages than Python?
Since these databases impose no structure on their data at all, whether the data is accessible from other languages/clients depends on what you store in it. But if you need this, CouchDB is a good choice, as it stores its data as JSON documents, so you get interoperability. How well CouchDB handles really massive data and sharding is unclear, though.
Do you need neither interoperability with languages other than Python nor distributed multi-server storage?
Use ZODB.
How about Amazon's SimpleDB?
There is an open-source Python library called boto for interfacing with Amazon web services.