How we developed robust AI in DwarfCorp
DwarfCorp is an open-source (MIT licensed) base-building strategy game from Completely Fair Games, slated for release in September 2017. While the game is in development, we provide free builds on our website. The game is inspired by Dwarf Fortress by Bay 12 Games, and of course Minecraft, among other games in the genre.
In DwarfCorp, the player manages a colony of capitalist Dwarves who mine, collect resources, build and fight in a procedurally generated 3D voxel world.
The Dwarves are controlled through a simple RTS-style control scheme where the player selects employees and assigns them tasks. In a game like this, the units under the player's control have to be largely autonomous. That is, they should make their own decisions and respond to changes in the world in real-time. Since DwarfCorp has such a complicated world (3D destroyable terrain, water, lava and other hazards, enemy creatures that can harass the Dwarves, etc.) this is not an easy task. Since we started working on the game in late 2012, we've gone through several redesigns of the AI system.
In this article, I'd like to give an overview of what worked and what didn't. And, since the game is open-source, you can read along in our source code to get an idea of what I'm talking about. Warning: this is a technical article full of jargon and code.
Task Execution vs. Management
The first thing that we have to make clear is what we mean by AI. The AI system in DwarfCorp is divided into two parts: the task management system, and the task execution system. Roughly speaking, the task management system assigns tasks among many Dwarves, while the task execution system figures out how to get one dwarf to complete one task.
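To make this split concrete, here is a toy sketch in Python (chosen for brevity; the game itself is C#, and all the names below are invented for illustration, not DwarfCorp's actual classes):

```python
# Hypothetical sketch of the two-layer AI split: the management layer pairs
# tasks with dwarves, and the execution layer expands one task into the
# concrete actions a single dwarf performs.

def assign_tasks(tasks, dwarves):
    """Task management: pair each idle dwarf with a pending task."""
    return dict(zip(dwarves, tasks))

def execute(task):
    """Task execution: expand one task into a sequence of actions."""
    if task == "mine_block":
        return ["walk_to_block", "swing_pick", "pick_up_rubble"]
    if task == "gather_item":
        return ["walk_to_item", "pick_up_item", "walk_to_stockpile"]
    return []

assignments = assign_tasks(["mine_block", "gather_item"], ["Urist", "Cog"])
```

Of course, real task management has to weigh priorities and per-dwarf costs, and real execution has to survive interruptions; the rest of this article is about the second problem.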
Task Execution
The task execution system answers the question "How do I accomplish my goal?".
Given some task (for example, gathering an item or mining a block) we want to know the sequence of actions that the Dwarf should take to accomplish that goal, and we want to control exactly how those actions are carried out. We also want to do this in a robust way.
The Dwarf might get interrupted during the execution of the task. The terrain might change. Things might disappear. The Dwarf might get set on fire. I don't know, and neither does the Dwarf ahead of time -- but he/she needs to figure out how to respond should any of those things happen.
Your first thought might be "Okay, I will just write a script that tells the Dwarf how to accomplish its task". That's fine, but also very labor intensive and complicated, especially if you think about all the things that could go wrong while the Dwarf is following the script.
You might also wonder what the best way is to script the AI system within a game loop. Should the script be in a thread? Should it work on a tick-by-tick basis? Should it be state-based? Should it be asynchronous? Event-driven? How much should be generated on the fly, versus pre-scripted by hand?
Eventually you will settle on one of the basic patterns of AI development. All of the patterns work -- but all of them have certain flaws that you should know about before using them. All of these questions plagued DwarfCorp early in development, and we explored a lot of alternatives before settling on something that worked for us.
First Attempt: State Machines
If you've spent any time AI programming, you've probably come across Finite State Machines (FSMs), and it's no wonder. They're a very old concept, and are extremely simple and general. That can make them fairly expressive and powerful. When we were starting out, this seemed like the obvious choice for task execution, but, as we shall see, they have flaws which make them difficult to maintain and write. With an FSM, you've got certain "States" that the agent can be "In", and a series of "Transitions" between these "States" based on certain conditions. Within each "State", the agent performs a certain action until it can transition out of the state to another one.
To design the AI, you define a bunch of states, and hook them up using a transition table. This is a frankly tedious process that takes a lot of skill to get right. Our FSM class started out looking something like this:
// A state is a condition that the Dwarf can be in, and tells it how to behave.
class State
{
    public string Name;

    // The agent that's in the State.
    public AI Agent;

    // Do stuff when the state starts.
    public Action OnEnter;

    // Update the state one tick at a time. Called in the game update loop.
    public Action OnTick;

    // Do stuff when the state exits (like cleaning up).
    public Action OnLeave;
}
and then you might have a State Machine class that looks like this:
// Represents a collection of states and transitions between them.
class StateMachine
{
    public List<State> States { get; set; }

    // Represents the conditions under which a state transitions from one to another.
    public class Transition
    {
        // The state to transition from.
        public State Source;

        // The state to transition to.
        public State Destination;

        // The conditions (as a list of functions) that must hold before
        // the transition will take place.
        public IEnumerable<Func<bool>> Conditions;

        // The state should transition to another when all the conditions are met.
        public bool ShouldTransition()
        {
            return Conditions.All(condition => condition());
        }
    }

    // A map from states to their transition criteria.
    public Dictionary<State, Transition> TransitionTable { get; set; }

    private State CurrentState { get; set; }

    public void SetState(State state)
    {
        if (CurrentState != null)
        {
            CurrentState.OnLeave();
        }
        CurrentState = state;
        state.OnEnter();
    }

    // Update the state machine.
    public void Tick()
    {
        if (CurrentState != null)
        {
            // Determine if the state machine should transition to the next state.
            var transition = TransitionTable[CurrentState];
            if (transition.ShouldTransition())
            {
                SetState(transition.Destination);
            }
            CurrentState.OnTick();
        }
    }
}
And then you'd set up a state machine like this:
AI dwarf;

// Some state.
State firstState = new State()
{
    Name = "First State",
    Agent = dwarf,
    OnTick = () => { Console.Out.WriteLine("Hello from first state!"); },
    OnEnter = () => { Console.Out.WriteLine("Entering first state!"); },
    OnLeave = () => { Console.Out.WriteLine("Exiting first state!"); }
};

// Some other state.
State secondState = new State()
{
    Name = "Second State",
    Agent = dwarf,
    OnTick = () => { Console.Out.WriteLine("Hello from second state!"); },
    OnEnter = () => { Console.Out.WriteLine("Entering second state!"); },
    OnLeave = () => { Console.Out.WriteLine("Exiting second state!"); }
};

// We will keep track of this iteration variable to determine when to transition states.
int iter = 0;

StateMachine machine = new StateMachine()
{
    States = new List<State> { firstState, secondState },
    TransitionTable = new Dictionary<State, StateMachine.Transition>()
    {
        // Transitions for the first state.
        {
            firstState, new StateMachine.Transition()
            {
                Source = firstState,
                Destination = secondState,
                // Transition to the second state when the number of iterations exceeds 5.
                Conditions = new List<Func<bool>>() { () => { return iter > 4; } }
            }
        },
        // Transitions for the second state.
        {
            secondState, new StateMachine.Transition()
            {
                Source = secondState,
                Destination = firstState,
                // Transition back to the first state when the number of
                // iterations exceeds 10.
                Conditions = new List<Func<bool>>() { () => { return iter > 9; } }
            }
        }
    }
};

// Initialize the state machine.
machine.SetState(firstState);

// Update the state machine.
for (iter = 0; iter < 11; iter++)
{
    machine.Tick();
}
The output of this "simple" machine will be
Entering first state!
Hello from first state! // iter = 0
Hello from first state! // iter = 1
Hello from first state! // iter = 2
Hello from first state! // iter = 3
Hello from first state! // iter = 4
Exiting first state!
Entering second state!
Hello from second state! // iter = 5
Hello from second state! // iter = 6
Hello from second state! // iter = 7
Hello from second state! // iter = 8
Hello from second state! // iter = 9
Exiting second state!
Entering first state!
Hello from first state! // iter = 10
Even using a bunch of code-shortening tricks like lambdas and object initializers, that's a lot of work for something that is the equivalent of two for loops, and it's not immediately clear at first glance how this machine will behave. If that sounds bad, imagine what happens when we have to do something more complicated. Here is a real example of a State Machine we wrote in 2012 for building an item:
This is one of the simpler FSMs we had. Notice the structure of the graph. There is a catchall state called "Idle" which must have transitions to all other states in case of failure, and has to keep track of what has succeeded or failed. Almost all the transitions in the graph are based entirely on whether or not the previous state succeeded, and not on some intrinsic true state of the Dwarf or World. This kind of graph certainly works, but it's tedious to write.
We quickly realized that FSMs were just not going to scale, and we went searching for alternatives which would make writing complex, robust behaviors easier.
Second attempt: Action Planning
In my day job, I do robotics research. At around the time that I was messing with FSMs in DwarfCorp, I was working at the Personal Robotics Lab at Carnegie Mellon. The lab's primary focus is on an academic subject called planning. If you've ever worked with game AI, you've probably done at least some planning -- for example, your everyday A* algorithm is a simple path planning algorithm on graphs.
Planners take a state space (a set of numbers or symbols that describe the world and the agent) and a set of actions that move you from one state to another, subject to pre-conditions and post-conditions, and produce a path from some starting state to some goal state.
In the case of the simple A* graph planner, the state space consists of nodes in a graph, and the actions are edges between the nodes. If you restrict the graph to a grid and connect nodes based on their proximity, you wind up with a vanilla grid planner. Long before we started looking at alternatives to FSMs, we were already using a graph planner to tell Dwarves how to get from point A to point B. The graph consisted of nodes which were free voxels in our 3D world, and the edges were actions (such as Walk, Jump, Fly, and Climb) which connected these voxels together. Clearly this was a kind of task execution system. The Dwarf has a goal: get from one place to another, and a set of commands it can issue to reach that goal.
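A minimal sketch of that kind of planner, in Python for brevity (the toy world and names here are invented for illustration; DwarfCorp's actual planner is C# and operates on voxels):

```python
import heapq

def astar(start, goal, neighbors, heuristic):
    """Plain A* search. neighbors(node) yields (action, next_node, cost) edges.
    Returns the list of (action, node) steps to the goal, or None."""
    frontier = [(heuristic(start), 0, start, [])]
    best_cost = {start: 0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if cost > best_cost.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for action, nxt, step in neighbors(node):
            new_cost = cost + step
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(frontier,
                               (new_cost + heuristic(nxt), new_cost, nxt,
                                path + [(action, nxt)]))
    return None

# Toy 2D world: Walk moves sideways for cost 1, Jump climbs a level for cost 2.
def neighbors(pos):
    x, y = pos
    yield ("Walk", (x + 1, y), 1)
    yield ("Walk", (x - 1, y), 1)
    yield ("Jump", (x, y + 1), 2)

# Manhattan distance is admissible here, so the plan found is cost-optimal.
path = astar((0, 0), (2, 1), neighbors,
             lambda p: abs(p[0] - 2) + abs(p[1] - 1))
```

The key point is that the edges are labeled with actions, so the output is not just a route but a sequence of commands the Dwarf can execute.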
Could the whole idea of task execution be rolled directly into the planner? Could all dwarf actions be described as simply as walking or jumping from one voxel to another?
It turns out this is a very old idea, though not quite as old as Finite State Machines. Newell and Simon were looking at the idea in the late 1950s and called the concept the "General Problem Solver", though the applications of their approach were limited to proving theorems. Perhaps the first useful action planner was STRIPS in 1971, which was used to command the proto-robot Shakey at the Stanford Research Institute.
The idea behind STRIPS is pretty simple. You start out with a series of symbols that define your state space. You then define a series of actions on those symbols which take you from one state to another. Actions must have pre-conditions, the set of symbols which must be a certain value before the action can execute, and post-conditions, the set of symbols whose values change based on the action.
By defining all the states and actions you can derive a graph called the state-action graph. You then plan in this graph exactly as you would using A* (although you have to carefully consider what kinds of heuristics to use, because the state space is very different from something like a grid). From there, your robot (or in my case Dwarf) can figure out how to do basically anything on its own, without having to write out a tedious transition table.
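Here is a hedged sketch of that idea in Python (the facts and actions below are a made-up toy domain, not DwarfCorp's actual one, and a real planner would use A* with a domain heuristic rather than plain breadth-first search):

```python
from collections import deque

# STRIPS-style domain: a state is the set of facts currently true, and each
# action has pre-conditions plus an add list and a delete list of facts.
actions = [
    {"name": "goto_tree",   "pre": set(),                   "add": {"at_tree"},    "del": {"at_site"}},
    {"name": "goto_site",   "pre": set(),                   "add": {"at_site"},    "del": {"at_tree"}},
    {"name": "pickup_wood", "pre": {"at_tree"},             "add": {"has_wood"},   "del": set()},
    {"name": "build_item",  "pre": {"has_wood", "at_site"}, "add": {"item_built"}, "del": {"has_wood"}},
]

def plan(start, goal, actions):
    """Breadth-first search over the state-action graph."""
    queue = deque([(frozenset(start), [])])
    seen = {frozenset(start)}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:
            return steps
        for a in actions:
            if a["pre"] <= state:  # action is applicable in this state
                nxt = frozenset((state - a["del"]) | a["add"])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [a["name"]]))
    return None

steps = plan({"at_site"}, {"item_built"}, actions)
```

Nobody wrote out a transition table here: the plan (go to the tree, pick up wood, return, build) falls out of the search.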
In the game industry, a popular algorithm called GOAP (Goal-Oriented Action Planning) is basically an implementation of STRIPS, and was used in some famous games like F.E.A.R. Let's see if we can write the "build item" task as an action planning domain. First we start by defining the state space as a set of true or false assertions. By default these assertions will be false:
// Determines whether the dwarf has the given resource in the inventory.
dwarf_has(dwarf, resource): true/false

// Determines whether the dwarf is at the given place.
dwarf_at(dwarf, place): true/false

// Determines whether the dwarf can build the given item using a resource.
can_build(resource, item): true/false

// Determines whether the given resource is at a place.
resource_at(resource, place): true/false

// Determines whether the given item has already been built.
item_built(item): true/false

// Determines whether the