Welcome to part 3 in this series covering all the data structures you really need (kind of).
In part 1, we saw that the best way to store most data is just to use a big old array and that with some tricks we can allocate this array so that the objects in it have fixed indices and permanent pointers.
In part 2, we saw how to fix the most glaring problem with these arrays — that it is kind of expensive to find things in them — by adding search indices that let us quickly find any subset of objects that we are interested in.
In this part, we will tackle a final problem. When we created our bulk data arrays, we assumed that all objects were fixed size so that we could fit their data into a single struct and store the bulk data as just an array of such structs:
struct object_t {
    ...
};

uint32_t num_objects;
struct object_t *objects;
But what if we need some fields in the object that are dynamically sized?
struct object_t {
    const char *name;
    uint32_t num_children;
    object_t **children;
};
How can we store the object names and children in a good way?
Capped Size
The simplest way of handling dynamically sized objects is to just get rid of the problem. If we set a maximum size for our strings and arrays, we’re back to having a fixed size struct, with the string and array data stored in the struct itself:
enum {MAX_NAME_LENGTH = 63};
enum {MAX_CHILDREN = 8};

struct object_t {
    char name[MAX_NAME_LENGTH + 1];
    uint32_t num_children;
    object_t *children[MAX_CHILDREN];
};
Now we can just make an object_t[] as before and have all the data in there.
Capping sizes is always a bit scary, because what if we set the cap too low and hit the limit? Personally, I feel a little bit dirty every time I do this. I think it’s PTSD from run-ins with some really ridiculous size limits, such as MAX_PATH = 260 in Windows.
But there are a lot of situations where having a maximum size is fine. There is a big difference between locking a maximum size down in a file format or an OS API, where it might live for 50 years, causing headaches to generations of programmers, versus just having it in your runtime, where you can change it whenever you want without worrying about compatibility with old software. You can start with a fixed size and change to something more advanced when you need to. Code doesn’t have to be “future-proof” if it is easy for future you to change it.
Also, if you are writing code for just one particular game and not a generic engine, you don’t need to support anything beyond what your game needs. If you only support four players, your player arrays can all be players[4]. And anything that’s just for internal use, you can limit to whatever you find reasonable. For example, object debug names can probably be s[64] without any problems. Longer names are too hard to read anyway.
Also, keep in mind that even if you don’t specify an explicit maximum size, there is probably a practical limit anyway. If you have too many things, you will eventually overflow a uint32_t, run out of memory, or drive the FPS down to the point where your app is unusable.
Judicious use of maximum sizes can simplify the code a lot. Of course, it doesn’t always work:
Some things can get arbitrarily large.
Some things have a limit, but the limit is so high that you will waste a lot of memory by using a fixed-size array.
Some things have a small limit, but enforcing the limit is more trouble than it’s worth.
As an example of the last case, consider your application’s window system. You could probably limit the number of windows to 128 because no sane user will want to deal with more than 128 open windows. (Note that this does not apply to tabs — many users are perfectly happy having a couple of thousand Chrome tabs open.) Using a fixed size of 128 won’t waste much memory either since you only need one global window array.
But… if you decide to set a limit of 128 windows, you have to decide what to do if the user tries to open more than that. You probably don’t want to crash. Do you beep? Show an error message? What if the window you can’t open is the window that was supposed to show the error message? Now you have an extra code path that you have to write, maintain, and test. An extra place where things can go wrong. It might be easier to just support an “unlimited” number of windows and let the users dig their own graves.
Heap allocation
If we can’t use a fixed-size array, the next simplest thing is to just use an off-the-shelf STL object, such as std::string or std::vector.
struct object_t {
    std::string name;
    std::vector<object_t *> children;
};
What happens when we use one of these types? STL will allocate data for the string and the vector on the system heap. Each string and vector gets its own individual little memory allocation, so we have a situation that looks something like this:

STL memory layout.
If we have an std::vector<object_t> with 10 000 items in it, we will use one memory allocation for the object_t array itself, but an additional 20 000 allocations for the names and children of those objects.
This seems a bit wasteful, but computers are really fast, so this approach can work well in many cases. We have a fair number of arrays like this in The Machinery – not for strings (more about that later), but certainly for vectors. The only difference is that instead of using an std::vector we use a stretchy buffer. Why? Well, for one, we program in C rather than C++ so we can’t use STL. But even if we used C++ we might still prefer this approach, because C++ classes can be tricky to use with the bulk data allocation approaches I described in part 1.
First, remember how we stored the objects in an “array with holes” by making the stored data type a union of the original data we wanted to store and a free list pointer for keeping track of the free items:
typedef union {
    object_t object;
    freelist_item_t freelist;
} item_t;
This doesn’t work with std::vector, because objects with non-trivial copy constructors can’t be stored in unions. To make this work with STL objects we either have to use extra memory to store the freelist pointers with the object itself, or use something like std::variant (shudder).
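To make the union trick concrete, here is a minimal sketch of the part-1 free list with plain C types (all names illustrative). It works precisely because object_t is trivially copyable and can legally share storage with the free-list item, which a std::vector member would forbid:

```c
#include <stdint.h>

enum {MAX_OBJECTS = 1024};
enum {NO_FREE = UINT32_MAX};

typedef struct {
    float x, y, z;   // plain, trivially copyable data
} object_t;

typedef struct {
    uint32_t next;   // index of the next free slot
} freelist_item_t;

// Free slots reuse the object's own storage to hold the free-list link.
typedef union {
    object_t object;
    freelist_item_t freelist;
} item_t;

static item_t items[MAX_OBJECTS];
static uint32_t num_items = 0;
static uint32_t first_free = NO_FREE;

// Reuse a hole if there is one, otherwise grow the array.
static uint32_t alloc_object(void)
{
    if (first_free != NO_FREE) {
        uint32_t i = first_free;
        first_free = items[i].freelist.next;
        return i;
    }
    return num_items++;
}

// Freeing a slot just pushes its index onto the free list.
static void free_object(uint32_t i)
{
    items[i].freelist.next = first_free;
    first_free = i;
}
```

Allocating, freeing, and allocating again hands back the same slot, so indices stay dense without moving any live objects.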
Second, remember our nice trick of just reserving a large chunk of VM memory to hold our bulk data array. This works well with our C data structures, because they are zero-initialized and the memory we get from the VM is always cleared to zero (since otherwise, the VM would leak information between processes). So, with C data structures, we can just use the memory directly, whereas with C++ we would have to placement-new our strings and vectors into the memory.
In other words, in this use case, using the STL data types adds a lot of extra bookkeeping for limited benefits.
As I said above, heap allocation can certainly be “good enough” in a lot of circumstances, but if you want to store a really large number of objects, it has some drawbacks that I’ve already talked about earlier in this series:
Making thousands of small allocations is time-consuming and can waste a substantial amount of memory on fragmentation, allocation headers, etc.
Since the arrays are allocated individually on the heap, they might be far apart in memory, resulting in cache misses when we process the objects.
With lots and lots of small allocations, it is hard to get a good picture of how much memory the system is using and what the access patterns are.
So let’s look at alternatives!
Private heap
One way to avoid making lots of small allocations from system memory is to make one big allocation and then chop it up into smaller pieces ourselves. Since we’re allocating a big chunk, we can request it directly from the VM and bypass the system heap allocator completely:

Private heap.
Essentially, what we’re doing here is implementing our own memory allocator: getting chunks of memory from the VM and chopping them up into smaller pieces to fulfill malloc and