Welcome to rabbit hell! Reliable AI locomotion with TDD

May 23, 2019

Test-driven development (TDD) is a software workflow in which code is written alongside small “unit test” programs that check whether each component is working. Automated unit tests are hard to apply to game development because of randomness, 3D interaction, and unpredictable player input, but we were able to use a TDD workflow to write stable, regression-resistant creature locomotion for ElemenTerra.

Virtual rabbits navigating a series of "unit test" obstacle courses

The Basics of Test-Driven Development

Those experienced in TDD can skip this section; for everyone else, here is a basic primer.

Let’s imagine we’re coding a custom function that adds two numbers. In a traditional workflow we would just write it the way we think it should work and move on. But to use TDD, let’s instead start the process by making a placeholder function and some unit tests:


#include <stdexcept>

// Our placeholder function that produces wrong results
int add(int a, int b) {
    return -1;
}

// Our unit tests that throw errors unless "add" produces correct results
void runTests() {
    if (add(1, 1) != 2)
        throw std::runtime_error("add(1, 1) should equal 2");
    if (add(2, 2) != 4)
        throw std::runtime_error("add(2, 2) should equal 4");
}

Initially, our unit tests will fail because our placeholder function returns -1 for every input. Once we implement add correctly and have it return a + b, all our tests will succeed! This may seem like a roundabout way to do things, but there are a few advantages:

  • If we didn’t get enough sleep and wrote add as a - b, our tests would still fail and we’d immediately know to fix the function. Without the tests we might not catch our mistake, and we would run into strange behavior that takes time to debug later on.

  • We can keep our tests around and run them every time we build our code. This means that if another coder accidentally changes add in the future, they’ll know immediately that they need to fix it because the tests will once again fail.

This is all unnecessary for this simple example, but with complex features like predictable state-machine behavior (after eating 100 food, is isFull true?) TDD saves time and improves a program’s stability.
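To make the state-machine case concrete, here is a minimal sketch of what such a test might look like. The Creature class, its eat method, and the 100-food threshold are hypothetical names taken from the example question above; they are not ElemenTerra's actual code.

```cpp
#include <cassert>

// Hypothetical creature with a hunger state. The class, its members, and
// the threshold are assumptions for illustration only.
class Creature {
public:
    // Adds food; non-positive amounts are ignored.
    void eat(int amount) {
        if (amount > 0) food_ += amount;
    }

    // The creature counts as full once it has eaten at least 100 food.
    bool isFull() const { return food_ >= kFullThreshold; }

private:
    static constexpr int kFullThreshold = 100;
    int food_ = 0;
};

// Returns true only if every state expectation holds.
bool testCreatureFullness() {
    Creature c;
    if (c.isFull()) return false;   // a new creature starts hungry
    c.eat(99);
    if (c.isFull()) return false;   // one short of the threshold
    c.eat(1);
    if (!c.isFull()) return false;  // exactly 100 food flips the state
    c.eat(-50);
    return c.isFull();              // invalid input must not undo fullness
}
```

Checking the values on both sides of the threshold (99 and 100) is what catches off-by-one mistakes such as writing > instead of >=.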

TDD Application in Game Development

There are two problems with TDD in game development. First, many game features have subjective goals that defy measurement: developers who want their character movement to “feel good” or their physics simulations to “not look jittery” will have a hard time expressing these goals as deterministic pass/fail conditions. Second, it’s difficult to write tests that cover the entire possibility space of worlds full of complex interacting objects.

However, I believe that workflows more loosely based on TDD principles can still be applied to complex and subjective features like character movement, and in our game ElemenTerra we did just that.

Unit Tests vs Debug Levels

Before I get into my TDD practice, I want to make the distinction between an automated unit test and a traditional “debug level.” It’s a common practice in gamedev to create hidden scenes with contrived circumstances that allow programmers and QA professionals to witness specific events.

A secret debug level full of different objects in The Legend of Zelda: The Wind Waker

We have many of these in ElemenTerra: a level full of problematic geometry for the player character, levels with special UIs that trigger certain game states, etc. Like unit tests, these debug levels can be used to reproduce and diagnose bugs, but a few aspects separate the two.

  1. Unit tests divide systems into their atomic parts and evaluate each individually, whereas debug levels test features on a more holistic level. After observing a bug in a debug level, developers may still need to search for the point of failure manually.

  2. Unit tests are automated and should produce deterministic results every time, whereas many debug levels are “piloted” by a player. This creates variance between sessions.

None of this is to suggest that unit tests are strictly superior to debug levels; in many cases the latter is a more practical tool. But I also believe that unit testing is underutilized in game development, and should be explored further with systems to which it is not traditionally applied.

Welcome to Rabbit Hell!

In ElemenTerra, players use mystical nature powers to rescue the creatures hurt by a cosmic storm. One of those powers is the ability to create pathways out of the ground which guide creatures to food and shelter. Because these pathways are dynamic player-created meshes, the creature locomotion needs to handle strange geometric edge cases and arbitrarily complex terrain.

Character movement is one of those nasty systems where “everything affects everything else”; if you’ve ever implemented such a system, you’ll know that it’s very easy to break existing functionality when writing new code. Need the rabbits to climb small ledges? Fine, but now they’re jittering up and down slopes! Trying to get your lizards to avoid each other’s paths? It looks like that works, but now their normal steering is all messed up.

As the person responsible for both the AI systems and most of the gameplay code, I knew I didn’t have a lot of time to be surprised by bugs. I wanted to catch regressions as soon as they appeared, so Test-Driven Development seemed appealing. The next step was to set up a system where I could easily define each creature locomotion use case as a simulated pass/fail test:

Top-down view of the Rabbit Hell unit test scene

This “Rabbit Hell” scene is composed of 18 isolated corridors, each with a creature body and a course designed to be traversable only if a specific locomotion feature is working. A test is considered successful if its rabbit is able to continue indefinitely without getting stuck, and a failure otherwise. Note that we’re only testing the creatures’ bodies (“Pawns” in Unreal terms), not their AI. In ElemenTerra creatures can eat, sleep, and react to the world, but in Rabbit Hell their only instructions are to run between two waypoints.
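The pass/fail rule for a single corridor can be sketched as a small simulation loop. This is a simplified, one-dimensional illustration and not the Unreal implementation: the names runCorridorTest, StepFn, walk, and walkWithWall are hypothetical, and since “indefinitely” cannot be checked directly, the sketch assumes a test passes after a fixed number of waypoint-to-waypoint legs and fails once the creature makes no progress for a timeout.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// A movement model: given position, target, and timestep, return the new position.
// (Hypothetical stand-in for a creature Pawn's locomotion code.)
using StepFn = std::function<double(double pos, double target, double dt)>;

struct CorridorResult {
    bool passed;
    int legsCompleted;
};

// Runs the creature back and forth between two waypoints with a fixed timestep.
// Fails as soon as it goes stuckTimeout seconds without getting closer to its target.
CorridorResult runCorridorTest(const StepFn& step, double waypointA, double waypointB,
                               int requiredLegs, double stuckTimeout) {
    const double dt = 1.0 / 60.0;     // fixed timestep keeps runs deterministic
    const double arriveRadius = 0.05; // how close counts as "reached the waypoint"
    const double minProgress = 1e-4;  // smaller improvements do not reset the timer

    double pos = waypointA;
    bool headingToB = true;
    double target = waypointB;
    double bestDist = std::fabs(target - pos);
    double timeSinceProgress = 0.0;
    int legs = 0;

    while (legs < requiredLegs) {
        pos = step(pos, target, dt);
        const double dist = std::fabs(target - pos);

        if (dist < bestDist - minProgress) {
            bestDist = dist;
            timeSinceProgress = 0.0;
        } else {
            timeSinceProgress += dt;
            if (timeSinceProgress >= stuckTimeout) return {false, legs};
        }

        if (dist <= arriveRadius) {   // reached the waypoint: turn around
            ++legs;
            headingToB = !headingToB;
            target = headingToB ? waypointB : waypointA;
            bestDist = std::fabs(target - pos);
            timeSinceProgress = 0.0;
        }
    }
    return {true, legs};
}

// Working locomotion: walk straight toward the target at 2 units per second.
double walk(double pos, double target, double dt) {
    const double maxStep = 2.0 * dt;
    const double delta = target - pos;
    if (std::fabs(delta) <= maxStep) return target;
    return pos + (delta > 0 ? maxStep : -maxStep);
}

// Broken locomotion: the same walk, but an obstacle at x = 5 cannot be passed.
double walkWithWall(double pos, double target, double dt) {
    const double next = walk(pos, target, dt);
    return next > 5.0 ? 5.0 : next;
}
```

With waypoints at 0 and 10, runCorridorTest(walk, 0.0, 10.0, 4, 2.0) passes, while the walkWithWall version fails before completing a single leg, which is the signal that the corresponding locomotion feature is missing or has regressed.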

Here are a few examples of these tests:

1, 2, 3: Unobstructed Movement, Static Obstacles, and Dynamic Obstacles

8 and 9: Even slopes, and Uneven Terrain

10: "Navmesh Magnet" failsafe for floating creatures

13: Reproduction for a bug where creatures would infinitely circle around nearby targets

14 and 15: Step-Up ability on flat and complex ledges

Let’s talk about the similarities and differences between this implementation and “pure” TDD.

My system was TDD-like in that:

  • I started features by making failed tests, and then wrote the code needed to pass them.

  • I kept running old tests as I added new features, preventing me from pushing regressions to source control.
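The second point, rerunning every old test whenever a new feature lands, can be sketched as a tiny suite runner. The NamedTest struct and runAll function are hypothetical names for illustration and are not part of Unreal or ElemenTerra.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// One entry per locomotion test: a readable name plus a pass/fail function.
struct NamedTest {
    std::string name;
    std::function<bool()> run;
};

// Runs every test in the suite and returns the names of the ones that failed.
// An empty result means no regressions were detected.
std::vector<std::string> runAll(const std::vector<NamedTest>& tests) {
    std::vector<std::string> failures;
    for (const auto& test : tests) {
        if (!test.run()) failures.push_back(test.name);
    }
    return failures;
}
```

Because the runner reports every failure rather than stopping at the first, a change that fixes one corridor but breaks two others shows up in a single run, before the code reaches source control.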
