Genetic State Trees

I’ve just moved to Tokyo and I start work at Square-Enix next week. My frequency of blogging is inconsistent at best, and will probably only get worse. I doubt I’ll be able to blog about anything I’m working on, and work generally provides the inspiration for my posts.

At any rate, today I just want to jot down some ideas I had while on the plane: a sketch of a method to use genetic techniques to evolve statechart based AI behavior.

First, some quick background: City Conquest, by Intelligence Engine Design Systems, is a tower defense game with some real time strategy elements, and Paul Tozour -- one of the company’s founders -- wrote a program called Evolver which was used extensively during development.

The main thing an AI can do in City Conquest is to build a building at a location. Evolver, therefore, generated good series of build commands for the AI. It did this by pitting millions of sets of randomly generated sequences of build commands against each other in-game. The intent was not only to create a set of challenging AI strategies for player to contend with, but also provide game balance feedback and automated testing for the team during development. The team would look at what strategies evolved, and discover unfair exploits or disadvantages in certain building types.

Search

Genetic programming is a process of search. The technique winnows pools of candidates down to find those that best match some predetermined “fitness score”.  ( Evolver, for instance, searched for strategies where a capital city ended with the greatest number of health points. ) Note: This is different than Darwinian evolution because, in the real world, populations expand as well as contract and, in the real world, there is no single concept of fitness. ( Perhaps we might say, Darwinian evolution propagates *any* runnable program. Other running programs, and the context of the world at large, influence what it means to remain runnable. ) Genetic programming is an elimination of possibilities. It's a death match search.

Programs as Trees

In Steven Gustafson’s thesis, “Diversity in Genetic Programming”, he describes the common practice of “evolving” Lisp expressions.

Since Lisp is a functional language, a function call chain can be represented as a tree. Functions are nodes of the trees, and leafs are individual values. Each node has a type based on the function’s return value. For instance, an expression: Add( 5, 4 ) could be be represented as a tree. “Add” would be a function node, “5” and “4” would be value leaves. In this case, the type of the function node would be “scalar value”. ( If this reminds you of anything, then maybe you’ve read my blog before. :) )


To evolve ( ie. search for ) a program that satisfies a particular goal, you need a bunch of candidate programs to search through. Rather than starting with all possible combinations of programs ( re: Turning; this space is very large ), genetic “cross-breeding” swaps sub-trees after each generation to increase the size of the search pool. Cross-breeding is essentially a search space optimization. Successful strategies are used as starting points for new programs.


Only in special circumstances can you know when “evolution is complete.” Instead, you simply run the search X number of million times, and cross your fingers a good solution will pop out.  As a side note: this can lead to finding “local minima” -- successful strategies, that could in theory be surpassed by a strategy not in the pool of generated candidates. Such is (the game of) life.



Trees of Commands

Saving functions as commands as I’ve previously outlined in my blog brings two interesting tools to the table. First, you don’t have to program in Lisp :) . A command-based language can look just like a procedural language. ( Under-the-hood, however, it's stored as a functional tree. ) Second, the inconsistency between “value leaf” and “function node” is removed. The system only knows about commands. The Add command data stores two scalar function value references, while the Make Value command replaces the stored value. The parameter of each Make Value is a single scalar value, just the parameters of an Add are two command references. This uniformity in node type opens the door for terminal nodes to do more interesting things. Commands could generate values over time, they could read input from users, and so on.


Command based nodes also open the door for storing programs that aren’t strictly function based. For instance, for each command type we could have meta-data describing the variability that’s allowed during cross-breeding. If, for instance, our program contained a  command for how many shots an AI unit should fire in a burst, we could specify a range to restrict evolutionary variations. Maybe, for instance, cap the number of shots fired based on knowing how many projectiles the game can draw at one time. ( Probably a silly metric for burst firing, but you get the idea. )



Statechart Trees

Going one step further, it would be extremely interesting to store statecharts as command trees, and then evolve them. 

ye old stopwatch example. again.
When a statemachine is executing, only one section of its hierarchy is active at a time. Transition statements switch between child states. This does not work in a traditional, function-like manner -- but it still can be described with commands. ( Note: the actions a state runs on entry and exit are function like.)

a statechart execution over time.
But, if we render the parent-child hierarchy as a tree, we could then apply genetic mechanisms to statecharts. We can injecting mutations into either the values of existing actions or conditions; we could crossing-breed ( swap ) substates; we could swapping actions; we could generating new sub-states and actions based on the total set of available conditions and functions. 


One concrete example of how this might work, think about evolving (searching) for what health level an AI should use a health potion. There could be a general category of logically good moments: when health is low; a general category of logically bad moments: when health is high; even nonsensical moments: triggered every time enemies heals themselves. But, rather than having to explicitly program this, we could see if the evolver can search out that fact for itself based on how well the AI succeeds in combat.

References

  • Gamasutra has the postmortem that initially sparked my interest. 
  • The details of Evolver are described in an interview by Alex Champandard.
  • Steven Gustafson’s thesis can be found on his website.

0 comments: