Scripting with Data

Inspired by Niklas Frykholm’s series on a data-driven system for vector fields, I wanted to share a technique I use for scripting which I call "data-build scripting". It’s an admittedly poor term because it's not about scripting your art pipeline or data-build system, it’s about leveraging your build system to create your own powerful scripting languages.

I’ll tell you a bit about Niklas’ system, then introduce data-build scripting, and wrap up with some source code using Google’s protocol buffers to create a custom scripting language.

The advantage of data-build scripting is that you can create highly-targeted domain-specific languages that are an integral part of the very same data your scripts act on. The disadvantages are the same as any custom scripting language: no script level debugger, no general purpose tools for writing your scripts, and a new class of possible bugs related to script processing and memory management.

Vector Fields and DSL

Niklas’ goal in his series was to write a system which could simulate the effects of global and local forces ( ex. the wind on a map, or the explosion of a projectile ) on a large number of particles. His technique was to express those effects as mathematical functions and small code snippets. He then went on to write an HLSL-like language and a bytecode engine to be able to script it all via data. ( A perfect candidate for GPGPU programming, perhaps? )

Vector field scripting is an example of a Domain Specific Language. DSLs are mini, or toy, languages that exist to serve one narrow purpose. Unlike a general purpose language -- C, Python, Lua, etc -- you can’t build a whole program with a DSL. HLSL, for instance, doesn’t do anything by itself; shaders have to be managed by an executable program.

There are two broad classifications of DSLs: external, and internal. External DSLs -- like Niklas’ vector field language -- require a compiler. Internal DSLs use the host programming language and syntax to create a mini-language: sometimes with macros; sometimes with classes. This latter type has been called fluent programming, and I’ve written about it before. Martin Fowler wrote a whole book about them.

Data-build scripting has elements of both. And while data-build scripting won’t provide you with the extensive language libraries of Python or Lua -- i.e. there’s no import antigravity -- domain-specific languages are meant to be highly targeted, and that it does well.

Saving Commands

The idea behind data-build scripting comes from the GoF’s Command Pattern. The command pattern, at its heart, is about recording the *desire* to make a function call in order to delay the call until a future time. The pattern calls for storing all the parameters needed for the function call in a class, and giving that class a single “execute” method which then triggers the actual call.

As a quick example, let’s create an NPC. ( Obviously, what’s needed to create an NPC in a game is a lot more complicated than this, but you get the gist. )

A direct function call to create an npc:
  NpcManager::Singleton()->add( new Npc( "BadGuy", Vector( 0,0 ) ) );
A command to create an npc:
  struct ICommand {
   virtual void execute()=0;
  };

  struct SpawnNpc:ICommand  {
    SpawnNpc( String npcName, Vector where ): npcName(npcName), where(where) {}
    void execute() { 
      NpcManager::Singleton()->add( new Npc( npcName, where ) ); 
    }
  private:
    String npcName;
    Vector where;
  };
Now to use that command:
  ICommand* cmd= new SpawnNpc("FamousGuy",Vector(0,0));
  // and, sometime later....
  cmd->execute(); // a star is born.
How does this relate to data? Well, notice what just happened. We took a function and we turned it into a structure. If you leveraged your existing data serialization solution, you could store this command to disk.  All you would need to know to reconstitute the command is the type of class and the members of that class. The ability to do that exists somewhere in many game developers’ pipelines already.

The aspects related to behavior -- how to run this command -- are still written in our host language; in this case C++.  All we need to do to execute the command is match the data loaded from disk with the C++ class.  ( And, I’ll provide some code later to show how to do that with Google’s protocol buffers. )
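
As a rough sketch of what "matching the data loaded from disk with the C++ class" could look like: the record below carries a type name plus the class members, and a small registry maps type names to code that rebuilds the command. `CommandRecord`, `CommandRegistry`, `makeRegistry`, and the `spawned` log standing in for `NpcManager` are all names made up for this illustration; a real pipeline would have its own equivalents.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for NpcManager: just logs what was spawned.
struct SpawnedNpc { std::string name; float x, y; };
std::vector<SpawnedNpc> spawned;

struct ICommand {
  virtual ~ICommand() {}
  virtual void execute() = 0;
};

// Hypothetical on-disk form of a command: its class name and its members.
struct CommandRecord {
  std::string type;
  std::string npcName;
  float x, y;
};

struct SpawnNpc : ICommand {
  SpawnNpc(const std::string& npcName, float x, float y)
    : npcName(npcName), x(x), y(y) {}
  void execute() {
    SpawnedNpc npc = { npcName, x, y };
    spawned.push_back(npc);
  }
private:
  std::string npcName;
  float x, y;
};

// Maps a type name found in the data to code that rebuilds the C++ object.
struct CommandRegistry {
  typedef std::function<std::unique_ptr<ICommand>(const CommandRecord&)> Maker;
  std::map<std::string, Maker> makers;

  // Returns a null pointer when the data names a type we do not know.
  std::unique_ptr<ICommand> create(const CommandRecord& rec) const {
    std::map<std::string, Maker>::const_iterator it = makers.find(rec.type);
    if (it == makers.end()) return std::unique_ptr<ICommand>();
    return it->second(rec);
  }
};

CommandRegistry makeRegistry() {
  CommandRegistry reg;
  reg.makers["SpawnNpc"] = [](const CommandRecord& r) {
    return std::unique_ptr<ICommand>(new SpawnNpc(r.npcName, r.x, r.y));
  };
  return reg;
}
```

Loading then amounts to reading a `CommandRecord`, calling `create()`, and holding on to the result until it is time to call `execute()`. The behavior stays in C++; only the type name and the members travel through the data.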

Easy, right? But, saving a single command isn’t scripting. So let’s dig further.

Scripting with Commands

Builders and Factories are design patterns which use generic interfaces to generate concrete results. Taken together, they show simple classes and simple functions are, in some sense, interchangeable. Further, they imply we don’t need our Command’s execute() method to simply return void: we can have it produce meaningful values.

Since vector fields require vectors, and since vectors require floats, let’s start with floats.

A well-known function that returns a float:
  float sinf( float ); 

can be translated into a command data class just as easily as any other.
  struct IFloatCommand { 
    virtual float compute()=0; 
  };
  struct Sin:IFloatCommand {
    Sin( float angle ): angle(angle) {}
    virtual float compute() { return sinf(angle); }
  private:
    float angle;
  };
In Sin’s case -- like a factory -- we are in some sense “producing” a float. And -- like a command -- we are delaying the production of that float until we’re ready for it. Given our hypothetical ability to save and load an arbitrary command class, we now have the ability to save and load a function that produces floats. Extend that outward, and -- so long as we define a unique base class for every return type -- we can save any function we want.
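
One way to get "a unique base class for every return type" without writing each one by hand is a class template. This is only a sketch of that idea: `IValueCommand<T>`, `IIntCommand`, and `Constant<T>` are names invented here, and nothing later in the article depends on them.

```cpp
#include <cmath>

// One interface per return type, generated by the compiler on demand.
template <typename T>
struct IValueCommand {
  virtual ~IValueCommand() {}
  virtual T compute() = 0;
};

// The float interface from above becomes one instantiation among many.
typedef IValueCommand<float> IFloatCommand;
typedef IValueCommand<int> IIntCommand;

struct Sin : IFloatCommand {
  Sin(float angle) : angle(angle) {}
  virtual float compute() { return std::sin(angle); }
private:
  float angle;
};

// A command that hands back a stored value of any type.
template <typename T>
struct Constant : IValueCommand<T> {
  Constant(T value) : value(value) {}
  virtual T compute() { return value; }
private:
  T value;
};
```

The trade-off is that your serialization layer has to know about each instantiation, so hand-written interfaces may still be the simpler choice when there are only a few return types.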

Our interfaces define what is possible to create, our data tells the code how.
  struct Ninja:Npc {
     int numberOfShurikan;
     float angerLevel;
     //etc.
  };

  struct ISpawnNpc {
    virtual Npc* spawn()=0;
  };

  struct SpawnCrazyNinja: ISpawnNpc {
    SpawnCrazyNinja( String npcName, Vector where, int numberOfShurikan );
    Npc * spawn() {
      Ninja* ninja= new Ninja( npcName);
      ninja->numberOfShurikan= 1; 
      // I’d be angry too if I only had one throwing star.
      ninja->angerLevel= 100;
      return ninja;
    }
  };

  // maybe data loading looks something like this:
  MissionData * file= openFile("mission.dat");
  ISpawnNpc* spawn= file->getFirstSpawn();
  Npc* npc= spawn->spawn();

Functional languages as object-oriented programming

Still not scripting you say. Okay, fine. You’re right. The difference between a series of functions and, say, programming is that programs ( scripts, whatever ) allow you to pipe the output of one function into the input of another, to store and retrieve values from memory, to branch on the results of functions, and … well... that’s about it actually. For now, I’m going to concentrate on just the first part -- piping outputs to inputs -- the other bits can be built upon that platform to the extent your particular scripting needs require.

Let’s define a new function. This function will take two values and multiply them together.
  struct Multiply:FloatCommand {
    Multiply( float first, float second ) : first(first), second(second) {}
    float compute() const { return first*second; }
  private:
    float first, second;
  };
If we wanted we could then write:
  Sin a( 0 ), b( 1 ); 
  Multiply c( a.compute(), b.compute() );
  float result= c.compute();
That’s still a bunch of C++ to pipe two commands together. And, anyways, the repeated compute() is ugly, so let’s move compute() into the Multiply, and -- for the sake of argument -- let’s delay the calls to a.compute() and b.compute() until they are needed.
  struct Multiply:FloatCommand {
    Multiply( const FloatCommand* first, const FloatCommand * second ) 
      : first(first)
      , second(second) {
    }
    float compute() const { 
      return first->compute()*second->compute(); 
    }
  private:
    const FloatCommand* first, * second;
  };
Note: I’m using pointers here because Multiply doesn’t know what type of command we might be passing to it. I’ll address that shortly, but notice we do get some type-safety here. We can pass any FloatCommand to Multiply, but we couldn’t pass SpawnCrazyNinja. The compiler wouldn’t allow it.  Let’s actually use our new Multiply.
  Sin a( 0 ), b( 1 ); 
  Multiply c( &a, &b );
  float result= c.compute();
Things still aren’t quite right here. Multiply takes functions, but Sin takes floats. We can fix that by defining a “MakeFloat” command. Now, we need a root somewhere -- a place to get constants in -- but relegating that to a command which exists specifically for that purpose makes the rest of the system more flexible.
  struct MakeFloat:FloatCommand {
     MakeFloat( float val ): val(val) {}
     float compute() const { return val; } 
  private:
     float val;
  };
  struct Sin:FloatCommand {
    Sin( const FloatCommand* angle ): angle(angle) {}
    float compute() const { return sinf( angle->compute() ); }
  private:
    const FloatCommand* angle;
  };
We can multiply sin values, or we can sin multiply values. Whatever we want:
  MakeFloat one(1), two(2);
  Multiply mul( &one, &two );
  Sin sin( &mul );
  float result= sin.compute(); // same as sinf( 1*2 )
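
Because each command asks its inputs for a value only when compute() runs, a tree can be built once and evaluated many times. The sketch below tries to show that with a leaf that reads a float from memory instead of holding a constant. `ReadFloat` and `evaluateTwice` are hypothetical names for this example, and the `FloatCommand` base class is my guess at a definition the article uses but never spells out.

```cpp
#include <cmath>

// Assumed definition: the article uses FloatCommand without showing it.
struct FloatCommand {
  virtual ~FloatCommand() {}
  virtual float compute() const = 0;
};

// Hypothetical leaf command: reads a float from memory each time it runs.
struct ReadFloat : FloatCommand {
  ReadFloat(const float* source) : source(source) {}
  float compute() const { return *source; }
private:
  const float* source;
};

struct Multiply : FloatCommand {
  Multiply(const FloatCommand* first, const FloatCommand* second)
    : first(first), second(second) {}
  float compute() const { return first->compute() * second->compute(); }
private:
  const FloatCommand* first;
  const FloatCommand* second;
};

struct Sin : FloatCommand {
  Sin(const FloatCommand* angle) : angle(angle) {}
  float compute() const { return std::sin(angle->compute()); }
private:
  const FloatCommand* angle;
};

// Builds sin( angle * 2 ) once, then evaluates it before and after
// the value of angle changes.
void evaluateTwice(float& before, float& after) {
  float angle = 0.0f, two = 2.0f;
  ReadFloat readAngle(&angle), readTwo(&two);
  Multiply mul(&readAngle, &readTwo);
  Sin sine(&mul);
  before = sine.compute();  // sin( 0*2 )
  angle = 1.0f;             // no command is rebuilt
  after = sine.compute();   // sin( 1*2 )
}
```

A leaf like this is also one possible starting point for the "store and retrieve values from memory" part of scripting mentioned earlier.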
If our data build system allowed us to store pointers, perhaps we would be close to being in business. All we would need to do is have a list of all the possible functions, the references to each other, pointer fixup for the data, and access to the “root” of the function tree.
  // maybe data loading looks something like this:
  MathData * file= openFile("math.dat");
  file->fixupPointers();
  IFloatCommand * runWhat= file->getFirstFloat();
  float result= runWhat->compute();
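
To make "a list of all the possible functions, the references to each other, pointer fixup, and access to the root" a little more concrete, here is one possible shape for it. Everything in it is an assumption for illustration: the `NodeRecord` layout, the `Op` codes, the `Script` holder, and the conventions that inputs are stored as indices into the same array, that inputs appear before the nodes that use them, and that the last record is the root. The command classes are repeated in compact form so the listing compiles on its own.

```cpp
#include <cmath>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct FloatCommand {
  virtual ~FloatCommand() {}
  virtual float compute() const = 0;
};
struct MakeFloat : FloatCommand {
  MakeFloat(float val) : val(val) {}
  float compute() const { return val; }
  float val;
};
struct Multiply : FloatCommand {
  Multiply(const FloatCommand* a, const FloatCommand* b) : a(a), b(b) {}
  float compute() const { return a->compute() * b->compute(); }
  const FloatCommand* a;
  const FloatCommand* b;
};
struct Sin : FloatCommand {
  Sin(const FloatCommand* angle) : angle(angle) {}
  float compute() const { return std::sin(angle->compute()); }
  const FloatCommand* angle;
};

enum Op { OP_MAKE_FLOAT, OP_MULTIPLY, OP_SIN };

// Hypothetical flat record, as it might sit in a data file.
struct NodeRecord {
  Op op;
  float value;  // used by OP_MAKE_FLOAT only
  int first;    // index of the first input, or -1
  int second;   // index of the second input, or -1
};

// Owns the rebuilt commands and remembers which one is the root.
struct Script {
  Script() : root(0) {}
  std::vector<std::unique_ptr<FloatCommand> > nodes;
  const FloatCommand* root;
};

// Looks up an already-built node; null for a missing or forward index.
const FloatCommand* inputAt(const Script& s, int index) {
  if (index < 0 || static_cast<std::size_t>(index) >= s.nodes.size()) return 0;
  return s.nodes[index].get();
}

// The "fixup" step: turns indices into pointers. Returns false on bad data.
bool buildScript(const std::vector<NodeRecord>& records, Script& out) {
  out.nodes.clear();
  out.root = 0;
  for (std::size_t i = 0; i < records.size(); ++i) {
    const NodeRecord& r = records[i];
    const FloatCommand* a = inputAt(out, r.first);
    const FloatCommand* b = inputAt(out, r.second);
    std::unique_ptr<FloatCommand> node;
    if (r.op == OP_MAKE_FLOAT) node.reset(new MakeFloat(r.value));
    else if (r.op == OP_SIN && a) node.reset(new Sin(a));
    else if (r.op == OP_MULTIPLY && a && b) node.reset(new Multiply(a, b));
    if (!node) return false;  // unknown op or missing input
    out.nodes.push_back(std::move(node));
  }
  if (out.nodes.empty()) return false;
  out.root = out.nodes.back().get();
  return true;
}
```

For sinf( 1*2 ), the data would be two OP_MAKE_FLOAT records, an OP_MULTIPLY with inputs 0 and 1, and an OP_SIN with input 2. Storing indices rather than raw addresses is what lets the same bytes be loaded anywhere in memory.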
If you *don’t* happen to have your own data build system lying around -- where’s a programmer to turn?

In the next post, I’ll show you how to describe our float language with protocol buffers, how to generate a script of that data from Python, and how to run that script in a C++ app. You'll be able to use the same techniques to build your own scripting language that can operate on floats, vectors, npcs, or whatever combination of things you want.

Stay tuned!
