Hey all! For the past month or so, I've been tackling one of the biggest technical problems in my new game, Dicey Dungeons - improving the enemy AI enough for the final release of the game. It's been pretty interesting, and lots of it was new to me, so I thought I'd write a little bit about it.
First up, a sort of disclaimer: I'm not a computer scientist - I'm just one of those people who learned enough about programming to make video games, and then stopped learning anything I didn't have to learn. I can usually muddle through, but a real programmer probably wouldn't have approached all this the way I did.
I tried to write all this in a fairly high level approach in mind, so that hopefully the basic ideas all make sense to other non-programmers. But I'm for sure no expert on all this stuff, and if I've gotten any of the details wrong in explaining the theory, let me know in the comments - happy to make corrections!
Let's start by explaining the problem!
The problem
If you've not played Dicey Dungeons, here's a crash course: it's a deckbuilding RPG, where each enemy has a selection of equipment cards that do different things. Also, they roll dice! They then place those dice on the equipment to do damage, or cause various status effects, or heal, or shield themselves from damage, or lots of other things. Here's a simple example of a tiny frog using a big sword and a little shield:
A more complicated example: this Handyman has a spanner, which allows it to add two dice together (so 3 + 2 would give you a single 5, and a 4 + 5 would give you a 6 and a 3). It also has a Hammer, which "shocks" the player if they use a six on it, and a Pea Shooter, which doesn't do much damage, but which has a "countdown" which persists across turns.
One more important complication: there are status effects which change what you can do. The most important of these are "Shock", which disables equipment at random until you unshock it by using an extra dice on it, or "Burn", which sets your dice on fire. When your dice are on fire, you can still use them - but it'll cost you 2 health points. Here's what a clever Handyman does when I shock and burn all his equipment and dice:
There's more to it than that, of course, but that's basically the gist of it!
So, the problem: how do you make an AI that can figure out the best thing to do on it's turn? How does it know which burning dice to extinguish, which dice to use for unshocking and which dice to save for important equipment?
What it used to do
For a long time, my AI in Dicey Dungeons just had one rule: It looked at all the equipment from left to right, figured out the best dice to use on it, and used it. This worked great, until it didn't. So, I added more rules.
For example, I dealt with shocking by looking at the unshocked equipment, and deciding what dice I would want to use on it when it was unshocked, then marking that dice as "reserved" for later. I dealt with burning dice by just checking if I had enough health to extinguish them, and choosing whether or not to do it by random chance.
Rule after rule after rule to deal with everything I could think of, and ended up with an AI that sorta kinda worked! Actually, it's amazing how well this hodge-podge of rules held together - the AI in Dicey Dungeons might not have always done the right thing, but it was definitely passable. At least, for a game that's still a work in progress.
But over time, this system of adding more and more rules to the AI really started to break at the seams. People discovered consistent exploits to get the AI to do stupid things. With the right setup, one of the bosses could be tricked into never actually attacking you, for example. The more rules I added to try to fix things, the more weird things would happen, as rules started to conflict with other rules, and edge cases started to crop up.
Of course, one way to fix this was to just apply more rules - work through each problem one by one, and add a new if statement to catch it. But I think that would have just been kicking the problem further down the road. The limitation this system had was that it was only ever concerned with this question: "What is my next move?". It could never look ahead, and figure out what might happen from a particular clever combination.
So, I decided to start over.
The classic solution
Look up AI stuff for games, and likely the first solution you'll come across is a classic decision making algorithm called Minimax. Here's a video that explains how it's applied to designing a Chess AI:
Implementing Minimax works like this:
First, you create a lightweight, abstract version of your game, which has all the relevant information for a particular moment in time of the game. We'll call this the Board. For Chess, this would be the current position of all the pieces. For Dicey Dungeons, it's a list of dice, equipment, and status effects.
Next, you come up with a value function - a way to measure how well the game is going for a particular configuration of the game - i.e. for a particular board. For Chess, maybe a board where all the pieces are in their initial positions is worth 0 points. A board where you have captured an enemy Pawn is maybe worth 1 point - and maybe a board where you've lost one of your own Pawns is worth -1 points. A board where you have your opponent in checkmate is worth infinity points. Or something like that!
Then, from this abstract board. you simulate playing all the possible moves you can make, which gives you a new abstract board. Then, you simulate playing all the possible moves from those boards, and so on, for as many steps as you want. Here's an excellent illustration of that from freecodecamp.org:
What we're doing is creating a graph of all the possible moves both players can make, and using our value function to measure how the game is going.
Here's where Dicey Dungeons splits from Minimax: Minimax comes from mathematical game theory, and it's designed to figure out the best series of moves in a world where your opponent is trying to maximise their score. It's so named because it's about trying to minimise your loss when your opponent plays so to as to maximise their gain.
But for Dicey Dungeons? I actually don't care what my opponent is doing. For the game to be fun, you just want the AI do make moves that make sense - to figure out the best way to play their dice on their equipment to make it a fair fight. In other words, all I care about is the Max, not the Min.
Which means: for the Dicey Dungeons AI to make a good move, all I need to do is create this graph of possible moves, and look for the board which has the best score - then make the moves that lead to that point.
A simple enemy turn
Ok, examples! Let's look at this frog again! How does it decide what to do? How does it know that it's chosen action is the best one?
It basically just has has two options. Place the 1 on the broadsword and the 3 on the shield, or do it the other way around. It obviously decides that it's better off putting that 3 on the sword than the 1. But why? Well, because it looked at all the outcomes:
Place the 1 on the sword and you end up with a score of 438. Place the 3 on it, and you end up with a score of 558. Great, ok! Then, I get a better score by placing the 3 on the Sword, done.
Where's that score coming from? Well, the Dicey Dungeons scoring system currently considers:
Damage: The most important case - 100 points for every point of damage dealt.
Poison: An important status effect that the AI considers almost as important as damage - 90 points for each poison.
Inflicting other Status effects: Like Shock, Burn, Weaken, etc. Each one of these is worth 50 points.
Bonus status effects: Inflicting yourself with positive status effects like Shield, etc, is worth 40 points each.
Using equipment: Using any piece of equipment is worth 10 points - because if all else fails, the AI should just try to use everything.
Reducing countdowns: Some equipment (like the Pea Shooter) just needs a total value of dice to activate. So, the AI gets 10 points for every countdown point it reduces.
Dice Pips: The AI gets 5 points for every unused Dice Pip - so a 1 is worth 5, and a 6 is worth 30. This is intended to make the AI prefer not to use dice it doesn't need to use, and does a lot to make its moves look more human like.
Length: The AI loses 1 point per move, making it so that long moves have very slightly lower scores than short ones. This is so that if there are two moves that would otherwise have the same score, the AI will pick the shorter one.
Healing: