Omitting values from certain keys in a dict
I am currently building a location-based service that calculates routes for users sharing cars to a specific event. In order to calculate the shortest distance, the driving distances between the users need to be known, because one of the constraints of the system is that each driver shouldn't go more than a certain distance out of their way to pick up a particular passenger. To avoid calling the Google Maps API twice for the same route, I populate a Dict at the beginning of the program to store the distances. The distances are generated like so:
def generateDistances(self):
    users = self.drivers + self.passengers
    for user1 in users:
        for user2 in users:
            if user1 != user2:
                distance = GetDistance(user1.location, user2.location)
                self.distances.append({'Start': user1, 'End': user2, 'Distance': distance['Distance']['meters'], 'Duration': distance['Duration']['seconds']})
        # Fetch the route to the destination separately; reusing the last user-to-user result would store the wrong values.
        distance = GetDistance(user1.location, self.destination)
        self.distances.append({'Start': user1, 'End': self.destination, 'Distance': distance['Distance']['meters'], 'Duration': distance['Duration']['seconds']})
The GetDistance method just fetches the route between the two locations from the Google Maps API, based on their latitudes and longitudes. The program then calls the following function to find a distance in the Dict:
def getSavedDistance(self, user1, user2):
    if user1 == user2:
        return 0
    for record in self.distances:
        if record['Start'] == user1:
            if record['End'] == user2:
                return record['Distance']
    logging.warning("No distance from %s to %s found", user1.userid, user2.userid)
However, I have been running this on the Google App Engine and it's running very slowly, and as you can imagine the running time grows exponentially as the problem size is increased (i.e. more users). What I want to do is initialise the dict with straight line distances between each user (calculated mathematically, without needing the API calls), and when the system is testing the length of a route it would test the straight line distance first. If the straight line distance is more than the maximum distance then the route is too long - the actual distance doesn't need to be calculated. Otherwise, the system would simply see that the driving distance isn't in the dict, and make the necessary API calls to put it in there.
So, I've come up with something like this to initialise the distances (note that this doesn't work as I can't insert null into the dict values):
def initialiseDistances(self):
    users = self.drivers + self.passengers
    for user1 in users:
        for user2 in users:
            if user1 != user2:
                self.distances.append({'Start': user1, 'End': user2, 'Distance': null, 'Duration': null, 'StraightLine': GetStraightLineDistance(user1.location, user2.location)})
        self.distances.append({'Start': user1, 'End': self.destination, 'Distance': null, 'Duration': null, 'StraightLine': GetStraightLineDistance(user1.location, self.destination)})
...and then the getSavedDistance method could be changed to something like this:
def getSavedDistance(self, user1, user2):
    if user1 == user2:
        return 0
    for record in self.distances:
        if record['Start'] == user1:
            if record['End'] == user2:
                if record['Distance'] == null:
                    distance = GetDistance(user1.location, user2.location)
                    record['Distance'] = distance['Distance']['meters']
                    record['Duration'] = distance['Duration']['seconds']
                return record['Distance']
    logging.warning("No distance from %s to %s found", user1.userid, user2.userid)
This would allow the system to only populate the distance values that are actually used, and avoid making the same API call twice. However, apparently I can't insert null into a dict value. Does anyone have an idea for a way I could insert some value into this dict that tells me there is no value for the distance yet?
Thanks
Make your self.distances a dictionary mapping a (start_user, end_user) tuple to the info that you want. What you are doing now involves O(N) accesses to list items for a single lookup, instead of just one dict lookup. With a dict, if you don't have any info for (user1, user2), you don't need to waste time and memory putting a dummy "null" entry into your data structure.
info = self.distances_DICT.get((user1, user2))
if info is None:
    self.calculate_the_distance_or_whatever_else_you_need_to_do(user1, user2)
As this is Python, None is the null value. Compare to None using is None, not == None, though.
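To make that concrete, here is a minimal sketch of a lazily filled cache keyed by (start, end) tuples. The class name DistanceCache, the fetch callable and fake_fetch are all hypothetical stand-ins for the asker's GetDistance, and the returned value is assumed to be a distance in metres. A missing key, detected with is None, is what signals "not fetched yet", so no dummy entries are needed.

```python
class DistanceCache:
    """Lazily filled cache of driving distances keyed by (start, end)."""

    def __init__(self, fetch):
        # fetch is a hypothetical stand-in for the real API call; it is
        # assumed to take (start, end) and return a distance in metres.
        self._fetch = fetch
        self._distances = {}

    def get(self, start, end):
        if start == end:
            return 0
        key = (start, end)
        distance = self._distances.get(key)  # None means "not fetched yet"
        if distance is None:
            distance = self._fetch(start, end)
            self._distances[key] = distance
        return distance


calls = []

def fake_fetch(start, end):
    # Records each call so we can see how often the "API" is hit.
    calls.append((start, end))
    return 1200

cache = DistanceCache(fake_fetch)
first = cache.get("alice", "bob")
second = cache.get("alice", "bob")  # served from the dict, no second fetch
print(first, second, len(calls))
```

The straight-line check from the question could sit in front of cache.get, so the fetch only happens for pairs that pass it.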
May I suggest a different approach? Make your self.distances a dict keyed by (user1, user2) tuples, which changes your lookup from O(n) to O(1). Assuming GetDistance(user1, user2) is the same as GetDistance(user2, user1), you could make sure that each tuple used as a dictionary key is sorted so that you can re-use the same value for each direction.
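A rough sketch of the sorted-key idea follows. The helper name pair_key is made up for illustration, and it assumes users are identified by something orderable, such as a user id string. Bear in mind that driving distances are not always the same in both directions (one-way streets, for example), so this only applies if that approximation is acceptable.

```python
def pair_key(user1_id, user2_id):
    """Return the same key for (a, b) and (b, a)."""
    # Sorting the pair normalises the order; ids are assumed to be orderable.
    return tuple(sorted((user1_id, user2_id)))


distances = {}
distances[pair_key("bob", "alice")] = 1500

# The lookup in the opposite direction hits the same entry.
lookup = distances.get(pair_key("alice", "bob"))
print(lookup)
```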
Expanding on John Machin's point, an idiomatic way of writing something like that in Python might look like:
class DistanceFinder(object):
    distances = {}

    def GetDistance(self, user1, user2):
        userkey = (user1, user2)
        if userkey in self.distances:
            return self.distances[userkey]
        result = [... calculations go here ...]
        self.distances[userkey] = result
        return result
Fun work-alike in Python 3.2:
from functools import lru_cache

class DistanceFinder:
    @lru_cache(maxsize=None)
    def GetDistance(self, user1, user2):
        return [... calculations go here ...]
That caching stuff comes built-in. Nice, huh?
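For anyone who wants to see the cache at work, here is a small self-contained sketch using a plain function. The name get_distance and the arithmetic inside it are placeholders, not the real calculation. Two things worth knowing: the arguments must be hashable, and when lru_cache decorates a method, self becomes part of the cache key.

```python
from functools import lru_cache

call_count = 0

@lru_cache(maxsize=None)
def get_distance(start, end):
    # Placeholder calculation standing in for the real (expensive) lookup.
    global call_count
    call_count += 1
    return abs(end - start)


a = get_distance(3, 10)
b = get_distance(3, 10)  # answered from the cache, the body does not run again
info = get_distance.cache_info()
print(a, b, call_count, info.hits, info.misses)
```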