What data store technology/solution allows very fast inserts, lookups and 'selects'
Here's my problem.
I want to ingest lots and lots of data .... right now millions and later billions of rows.
I have been using MySQL and I am playing around with PostgreSQL for now.
Inserting is easy, but before I insert I want to check if that particular records exists or not, if it does I don't want to insert. As the DB grows this operation (obviously) takes longer and longer.
If my data was in a Hashmap the look up would be o(1) so I thought I'd create a Hash index to help with lookups. But then I realised that if I have to compute the Hash again every time I will slow the process down massively (and if I don't compute the index I don't have o(1) lookup).
So I am in a quandry, is there a simple solution? Or a complex one? I am happy to try other datastores, however I need to be able to do reasonably compl开发者_如何学运维ex queries e.g. something to similar to SELECT statements with WHERE clauses, so I am not sure if no-sql solutions are applicable.
I am very much a novice, so I wouldn't be surprised if there is a trivial solution.
Nosql Stores are good for handling huge inserts and updates
MongoDB has really good feature for update/Insert (called as upsert) based on whether the document is existing.
Check out this page from mongo doc
http://www.mongodb.org/display/DOCS/Updating#Updating-UpsertswithModifiers
Also you can checkout the safe mode in mongo connection. Which you can set it as false to get more efficiency in inserts.
http://www.mongodb.org/display/DOCS/Connections
You could use CouchDB. Its no SQL so you can't do queries per se, but you can create design documents that allow you to run map/reduce functions on your data.
精彩评论