agent-based simulation: performance issue: Python vs NetLogo & Repast
I'm replicating a small piece of the Sugarscape agent simulation model in Python 3. My code runs about 3 times slower than the NetLogo version. Is the problem likely in my code, or could it be an inherent limitation of Python?
Obviously, this is just a fragment of the code, but that's where Python spends two-thirds of the run-time. I hope if I wrote something really inefficient it might show up in this fragment:
    UP = (0, -1)
    RIGHT = (1, 0)
    DOWN = (0, 1)
    LEFT = (-1, 0)
    all_directions = [UP, DOWN, RIGHT, LEFT]
    # point is just a tuple (x, y)

    def look_around(self):
        max_sugar_point = self.point
        max_sugar = self.world.sugar_map[self.point].level
        min_range = 0

        random.shuffle(self.all_directions)
        for r in range(1, self.vision+1):
            for d in self.all_directions:
                p = ((self.point[0] + r * d[0]) % self.world.surface.length,
                     (self.point[1] + r * d[1]) % self.world.surface.height)
                if self.world.occupied(p):  # checks if p is in a lookup table (dict)
                    continue
                if self.world.sugar_map[p].level > max_sugar:
                    max_sugar = self.world.sugar_map[p].level
                    max_sugar_point = p
        if max_sugar_point is not self.point:
            self.move(max_sugar_point)
Roughly equivalent code in NetLogo (this fragment does a bit more than the Python function above):
    ; -- The SugarScape growth and motion procedures. --
    to M    ; Motion rule (page 25)
      locals [ps p v d]
      set ps (patches at-points neighborhood) with [count turtles-here = 0]
      if (count ps > 0) [
        set v psugar-of max-one-of ps [psugar]              ; v is max sugar w/in vision
        set ps ps with [psugar = v]                         ; ps is legal sites w/ v sugar
        set d distance min-one-of ps [distance myself]      ; d is min dist from me to ps agents
        set p random-one-of ps with [distance myself = d]   ; p is one of the min dist patches
        if (psugar >= v and includeMyPatch?) [set p patch-here]
        setxy pxcor-of p pycor-of p                         ; jump to p
        set sugar sugar + psugar-of p                       ; consume its sugar
        ask p [setpsugar 0]                                 ; .. setting its sugar to 0
      ]
      set sugar sugar - metabolism    ; eat sugar (metabolism)
      set age age + 1
    end
On my computer, the Python code takes 15.5 sec to run 1000 steps; on the same laptop, the NetLogo simulation running in Java inside the browser finishes 1000 steps in less than 6 sec.
EDIT: Just checked Repast, using a Java implementation. It's about the same as NetLogo, at 5.4 sec. Recent comparisons between Java and Python suggest no advantage to Java, so I guess it's just my code that's to blame?
EDIT: I understand MASON is supposed to be even faster than Repast, and yet it still runs Java in the end.
This probably won't give dramatic speedups, but you should be aware that local variables are quite a bit faster in Python compared to accessing globals or attributes. So you could try assigning some values that are used in the inner loop into locals, like this:
    def look_around(self):
        max_sugar_point = self.point
        max_sugar = self.world.sugar_map[self.point].level
        min_range = 0

        selfx = self.point[0]
        selfy = self.point[1]
        wlength = self.world.surface.length
        wheight = self.world.surface.height
        occupied = self.world.occupied
        sugar_map = self.world.sugar_map
        all_directions = self.all_directions

        random.shuffle(all_directions)
        for r in range(1, self.vision+1):
            for dx,dy in all_directions:
                p = ((selfx + r * dx) % wlength,
                     (selfy + r * dy) % wheight)
                if occupied(p):  # checks if p is in a lookup table (dict)
                    continue
                if sugar_map[p].level > max_sugar:
                    max_sugar = sugar_map[p].level
                    max_sugar_point = p
        if max_sugar_point is not self.point:
            self.move(max_sugar_point)
Function calls in Python also have a relatively high overhead (compared to Java), so you can try to further optimize by replacing the occupied function with a direct dictionary lookup.
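A minimal sketch of that direct-lookup idea, assuming the world keeps occupied cells in a dict keyed by point. The `World` class and the `agents_by_point` attribute are hypothetical stand-ins; the question only says that `occupied` checks a lookup table:

```python
class World:
    """Hypothetical stand-in for the question's world object."""

    def __init__(self):
        # Assumed layout: maps an occupied (x, y) point to the agent on it.
        self.agents_by_point = {}

    def occupied(self, p):
        # The method call that the original inner loop pays for on every cell.
        return p in self.agents_by_point


world = World()
world.agents_by_point[(2, 3)] = "agent-1"

# Bind the dict to a local name once, outside the hot loop...
agents_by_point = world.agents_by_point

candidates = [(2, 3), (2, 4), (3, 3)]
# ...then test membership directly instead of calling world.occupied(p).
free_cells = [p for p in candidates if p not in agents_by_point]
print(free_cells)
```

Inside `look_around`, the same change would turn `if occupied(p):` into `if p in agents_by_point:`. How much it saves is worth measuring rather than assuming.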
You should also take a look at psyco. It's a just-in-time compiler for Python that can give dramatic speed improvements in some cases. However, it doesn't support Python 3.x yet, so you would need to use an older version of Python.
I'm going to guess that the way that neighborhood is implemented in NetLogo is different from the double loop you have. Specifically, I think they pre-calculate a neighborhood vector like

    n = [ [0,1],[0,-1],[1,0],[-1,0]....]

(you would need a different one for vision=1,2,...) and then use just one loop over n instead of a nested loop like you are doing. This eliminates the need for the multiplications.
I don't think this will get you a 3X speedup.
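Here is a rough sketch of the precomputed-neighborhood idea in Python. Whether NetLogo actually works this way is only the guess stated above. The names `build_offsets`, `OFFSETS_BY_VISION`, and `visible_points`, and the maximum vision of 6, are assumptions made for illustration:

```python
def build_offsets(vision):
    """Return (dx, dy) offsets along the four axis directions, nearest first."""
    directions = [(0, -1), (0, 1), (1, 0), (-1, 0)]  # up, down, right, left
    return [(r * dx, r * dy)
            for r in range(1, vision + 1)
            for dx, dy in directions]


# Built once per distinct vision value (assumed here to range from 1 to 6),
# not once per agent per step.
OFFSETS_BY_VISION = {v: build_offsets(v) for v in range(1, 7)}


def visible_points(point, vision, length, height):
    """Single loop over precomputed offsets, wrapping around the grid edges."""
    x, y = point
    return [((x + dx) % length, (y + dy) % height)
            for dx, dy in OFFSETS_BY_VISION[vision]]


print(visible_points((0, 0), 1, 50, 50))
```

One difference from the question's code: the table has a fixed direction order, so the random tie-breaking that `random.shuffle` provided would have to be handled separately, for example by shuffling among cells with equal sugar.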
This is an old question, but I suggest you look into using NumPy to speed up your operations. Wherever you use dicts and lists that are logically organized as a 1-, 2-, 3-, or N-dimensional grid of homogeneous data (all integers, all floats, etc.), the data will have less overhead when represented and accessed as NumPy arrays.
http://numpy.org
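As a hedged illustration of what that could look like for the question's `look_around`, the sketch below stores sugar levels and occupancy as 2-D arrays and inspects all visible cells with one indexed read. The array names, the 10x10 grid size, and `best_visible_cell` are assumptions, and the tie-breaking is simplified compared with the original (no shuffling, first maximum wins):

```python
import numpy as np

length, height = 10, 10
sugar = np.zeros((length, height), dtype=int)      # sugar level per cell
occupied = np.zeros((length, height), dtype=bool)  # True where an agent stands

sugar[2, 0] = 3
sugar[0, 2] = 4
occupied[0, 2] = True  # the richest nearby cell is taken


def best_visible_cell(x, y, vision):
    """Return the free visible cell with the most sugar, or (x, y) if none is better."""
    r = np.arange(1, vision + 1)
    zeros = np.zeros_like(r)
    # Offsets up, down, right, left for every distance 1..vision.
    dx = np.concatenate([zeros, zeros, r, -r])
    dy = np.concatenate([-r, r, zeros, zeros])
    xs = (x + dx) % length  # wrap around the grid edges
    ys = (y + dy) % height
    # Occupied cells get -1 so they can never win the argmax.
    levels = np.where(occupied[xs, ys], -1, sugar[xs, ys])
    best = int(np.argmax(levels))
    if levels[best] > sugar[x, y]:
        return int(xs[best]), int(ys[best])
    return x, y


print(best_visible_cell(0, 0, 2))
```

For very small neighborhoods the per-call overhead of NumPy may outweigh the gain, so this is worth timing against the plain loop before adopting it.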
Here is a relatively up-to-date comparison of NetLogo and one version of Repast. I would not necessarily assume Repast is faster. NetLogo seems to contain some very smart algorithms that can make up for whatever costs it has. http://condor.depaul.edu/slytinen/abm/Lytinen-Railsback-EMCSR_2012-02-17.pdf