Ant colony behavior using genetic programming
I'm looking at evolving ants capable of food foraging behaviour using genetic programming, as described by Koza here. Each time step, I loop through each ant, executing its computer program (the same program is used by all ants in the colony). Currently, I have defined simple instructions like MOVE-ONE-STEP
, TURN-LEFT
, TURN-RIGHT
, etc. But I also have a function PROGN
that executes arguments in sequence. The problem I am having is that because PROGN
can execute instructions in sequence, it means an ant can do multiple actions in a single time step. Unlike nature, I cannot run the ants in parallel, meaning one ant might go and perform several actions, manipulating the environment whilst all of the other ants are waiting to have their turn.
I'm just wondering, is this how it is normally done, or is there a better way? Koza does not seem to mention anything a开发者_开发技巧bout it. Thing is, I want to expand the scenario to have other agents (e.g. enemies), which might rely on things occurring only once in a single time step.
I am not familiar with Koza's work, but I think a reasonable approach is to give each ant its own instruction queue that persists across time steps. By doing this, you can get the ants to execute PROGN
functions one instruction per time step. For instance, the high-level logic for the time step of an ant can be:
Do-Time-Step(ant):
1. if ant.Q is empty: // then put the next instruction(s) into the queue
2. instructions <- ant.Get-Next-Instructions()
3. for instruction in instructions:
4. ant.Q.enqueue(instruction)
5. end for
6. end if
7. instruction <- ant.Q.dequeue() // get the next instruction in the queue
8. ant.execute(instruction) // have that ant do its job
Another similar approach to queuing instructions would be to preprocess the set of instructions an expand instances of PROGN to the set of component instructions. This would have to be done recursively if you allow PROGNs to invoke other PROGNs. The downside to this is that the candidate programs get a bit bloated, but this is only at runtime. On the other hand, it is easy, quick, and pretty easy to debug.
Example: Say PROGN1 = {inst-p1 inst-p2}
Then the candidate program would start off as {inst1 PROGN1 inst2} and would be expanded to {inst1 inst-p1 inst-p2 inst2} when it was ready to be evaluated in simulation.
It all depends on your particular GP implementation.
In my GP kernel programs are either evaluated repeatedly or in parallel - as a whole, i.e. the 'atomic' operation in this scenario is a single program evaluation. So all individuals in the population are repeated n times sequentially before evaluating the next program or all individuals are executed just once, then again for n times.
I've had pretty nice results with virtual agents using this level of concurrency. It is definitely possible to break it down even more, however at that point you'll reduce the scalability of your algorithm:
While it is easy to distribute the evaluation of programs amongst several CPUs or cores it'll be next to worthless doing the same with per-node evaluation just due to the amount of synchronization required between all programs.
Given the rapidly increasing number of CPUs/cores in modern systems (even smartphones) and the 'CPU-hunger' of GP you might want to rethink your approach - do you really want to include move/turn instructions in your programs?
Why not redesign it to use primitives that store away direction and speed parameters in some registers/variables during program evaluation? The simulation step then takes these parameters to actually move/turn your agents based on the instructions stored away by the programs.
- evaluate programs (in parallel)
- execute simulation
- repeat for n times
- evaluate fitness, selection, ...
Cheers, Jay
精彩评论