Mongo cannot find master during data lookups
I am running a large data update using pymongo. To run the updates, individual records are found using collection.find_one(unique criteria), changes are made, the updates are batched, and finally sent in chunks using db.collection.save([long list of records to save]).
On my local machine (running 1.6.3), the imports work fine.
On a remote server (running 1.6.0), which is much faster than my local machine, I can get through a portion of the inserts just fine, but then will suddenly get the following error when looking up original records:
connection = Connection(...)
...
raise AutoReconnect("could not find master/primary")
pymongo.errors.AutoReconnect: could not find master/primary
The number of records I can get through varies somewhat, but is not random.
At first I thought I was running into the connection limit. I started closing connections manually after each record lookup:
collection.database.connection.disconnect()
Which didn't solve the problem. Am I on the right track?
So there are a couple of potential issues here:
raise AutoReconnect("could not find master/primary")
pymongo.errors.AutoReconnect: could not find master/primary
That error indicates that the existing connection has somehow been invalidated. There are a number of reasons this could happen.
The most common reason is that the Primary of a Replica Set has stepped down or has failed. In this case your code needs to:
- Catch (or trap) the error.
- Decide on a retry strategy. (fail? retry once?...)
Are you doing this? Are you running Replica Sets or Master/Slave? Do you have any tracking for the performance of these servers? Are they having network issues? Are they switching roles?
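As a rough illustration of the two steps above, here is a minimal sketch that catches AutoReconnect around a single lookup and retries a bounded number of times before giving up. The helper name find_one_with_retry and the retries and delay parameters are assumptions made for this example, not part of pymongo; the fallback exception class exists only so the sketch runs where pymongo is not installed.

```python
import time

try:
    from pymongo.errors import AutoReconnect
except ImportError:
    # Stand-in so this sketch also runs where pymongo is not installed.
    class AutoReconnect(Exception):
        pass


def find_one_with_retry(collection, criteria, retries=1, delay=1.0):
    """Look up one record, retrying a limited number of times on AutoReconnect.

    find_one_with_retry, retries and delay are hypothetical names for this sketch.
    """
    attempt = 0
    while True:
        try:
            return collection.find_one(criteria)
        except AutoReconnect:
            if attempt >= retries:
                # Retry budget used up: re-raise so the caller can decide to fail.
                raise
            attempt += 1
            # Pause before retrying, in case a new primary is being elected.
            time.sleep(delay)
```

With retries=1 this is the "retry once" strategy; retries=0 fails immediately. Whether a fixed delay is appropriate depends on how long a failover takes in your deployment.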
collection.database.connection.disconnect()
Which didn't solve the problem. Am I on the right track?
Where is the exception "happening"? Is it coming from the connection itself or the save command?
On a remote server (running 1.6.0)
As of this writing, 1.6.0 is a very old version of MongoDB. Multiple replication bugs were fixed in the subsequent 1.6.x and 1.7.x versions (we're already at 1.8.1rc-0).
I would start by looking at what's happening with your servers, but that may well lead you down the upgrade path.
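One way to start looking at the servers from the client side is to ask each node what role it currently believes it has. The sketch below assumes you pass in a pymongo Database object (for example the admin database of a connection) and that the server answers the isMaster command with "ismaster" and "secondary" fields; describe_role is a hypothetical helper name, and newer server versions may prefer a different command.

```python
def describe_role(admin_db):
    """Return 'primary', 'secondary' or 'other' based on the isMaster reply.

    admin_db is assumed to be a pymongo Database object; describe_role is a
    hypothetical helper name used only in this sketch.
    """
    info = admin_db.command("ismaster")
    if info.get("ismaster"):
        return "primary"
    if info.get("secondary"):
        return "secondary"
    # Neither flag set: e.g. a node that is starting up or recovering.
    return "other"
```

Logging this value for each server whenever AutoReconnect is raised may show whether the nodes are switching roles during the update.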
I've encountered this problem in interactive Python sessions with pymongo, where I leave the session idle and hit AutoReconnect upon returning. I've handled it this way:
import functools
import pymongo
import time

MAX_AUTO_RECONNECT_ATTEMPTS = 5


def graceful_auto_reconnect(mongo_op_func):
    """Gracefully handle a reconnection event."""
    @functools.wraps(mongo_op_func)
    def wrapper(*args, **kwargs):
        for attempt in range(MAX_AUTO_RECONNECT_ATTEMPTS):
            try:
                return mongo_op_func(*args, **kwargs)
            except pymongo.errors.AutoReconnect:
                if attempt == MAX_AUTO_RECONNECT_ATTEMPTS - 1:
                    # Out of attempts: surface the error instead of returning None.
                    raise
                wait_t = 0.5 * pow(2, attempt)  # exponential back off
                time.sleep(wait_t)
    return wrapper


@graceful_auto_reconnect
def some_func_that_does_mongodb_ops():
    ...
...
YMMV