How to translate from SQL to NoSQL/MapReduce?
I have a background working with relational databases but recently started to dabble in CouchDB and was surprised by how some non-relational operations, which would be simple in SQL, were not first-class functions in CouchDB.
I would appreciate you taking a moment to map each SQL statement below to its MapReduce equivalent.
SELECT COUNT(*) FROM products WHERE price < 20.00;
SELECT category, SUM开发者_C百科(price) FROM products GROUP BY category;
UPDATE products SET price = 19.99 WHERE price = 20.00;
DELETE FROM products WHERE expires_at <= NOW();
The SELECT
commands are pretty easy. Bulk writes are a bit more complicated. Generally, you'll use some view to retrieve the documents that need to be changed, then you'll use the _bulk_docs
API to send all the changes at once.
Also, consult the documentation regarding views for details for how to issue queries. This includes ordering, grouping, etc.
SELECT COUNT(*) FROM products WHERE price < 20.00;
Map
function (doc) {
if (doc.price < 20) {
emit(doc.price);
}
}
Reduce
_count
If you need this to work with an arbitrary amount, not just 20, then you'll need to emit the price in all cases, and use startkey
and endkey
to narrow down your resultset.
SELECT category, SUM(price) FROM products GROUP BY category;
Map
function (doc) {
emit(doc.category, doc.price);
}
Reduce
_sum
This map function essentially uses the category as the key, with the price as the value in your key/value pair. The reduce function will add up the prices for each different key.
UPDATE products SET price = 19.99 WHERE price = 20.00;
Map
function (doc) {
if (doc.price == 20) {
emit(doc.price);
}
}
Once your application pulls down the contents of this view, you'll perform all the manipulations in your application code, then send back the results into the database via the _bulk_docs
API.
DELETE FROM products WHERE expires_at <= NOW();
Map
function (doc) {
emit(doc.expires_at);
}
Depending on how your store your date-time values, you may need to adjust the map function as well as your query to the view. Using a timestamp (JS uses milliseconds instead of seconds) is probably the fastest way to accomplish this. Once you've set up your query, you'll add a new field to each of these documents. _deleted: true
. Once you send this list back into the database (again with _bulk_docs
) all the specified documents will be deleted.
精彩评论