One question that I keep coming back to is whether references between objects are best represented as names or GUIDs.
Here is the situation: You have created some sort of data model for representing objects in memory/on disk. Now you need the ability for objects to refer to other objects. I.e., an object needs to talk about another object. Some examples:
- A material object may point to a texture object and say “I want to use this as my diffuse map”.
- An animation object may point to a model object and say “I want to rotate this model around its z-axis”.
How can we accomplish this?
Here are two options:
- Names: Each object is referred to by its name. The name is a string assigned to the object by the user and the user can change this string at will (rename the object).
- GUIDs: Each object is referred to by a globally unique identifier (GUID). The GUID is assigned to the object on creation and never changes. It is guaranteed to only represent this particular object and no other.
Names are resolved in some kind of context (typically the children of the current object). Thus, to refer to an object that is “far away” from us we might have to use a sequence of names to navigate the object tree, e.g., ../../player/head/left_eye. Much like a path in a file system, this sequence of names provides a path from one object in our object tree to another. Note that in this post I will sometimes somewhat sloppily talk about the name of an object when I actually mean the full path to an object.
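To make the path idea concrete, here is a minimal Python sketch of resolving a sequence of names against an object tree. The `Node` class, its `resolve` method, and the `world`/`enemies`/`turret` objects are all made up for illustration; a real engine would likely handle absolute paths, caching, and errors differently.

```python
class Node:
    """A named object in a tree; children are looked up by name (hypothetical)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[name] = self

    def resolve(self, path):
        """Follow a '/'-separated path, starting from this node."""
        node = self
        for part in path.split("/"):
            if part == "..":
                node = node.parent               # step up to the parent
            else:
                node = node.children.get(part)   # step down to a named child
            if node is None:
                raise LookupError(f"cannot resolve {path!r} from {self.name!r}")
        return node


# Build a small tree: world/player/head/left_eye and world/enemies/turret.
world = Node("world")
player = Node("player", world)
head = Node("head", player)
left_eye = Node("left_eye", head)
enemies = Node("enemies", world)
turret = Node("turret", enemies)

# The turret refers to the player's eye with a relative path.
target = turret.resolve("../../player/head/left_eye")
```

Note that the lookup only depends on the names along the way. If `head` were renamed, the same path would raise `LookupError`, which is the fragility of names in a nutshell.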
You might protest that there are other ways of representing references too. For example, an in-memory representation could just use a pointer. A disk representation could use a file offset. Combinations are possible too — for example (filename + offset) to represent an object inside a file. However, it is easy to become confused when considering the myriad of possibilities, so let’s put all of that aside for the moment. In this post, I’m going to focus on the difference between names and GUIDs and in the end we will see how the discussion applies to the other possibilities.
Side note: There is another interesting option apart from names and GUIDs and that is to refer to an object by the hash of its content. With this approach, the same content is always referred to by the same unique identifier (its hash) and if you change the content all the references have to be updated. If you start to think about it, most of Git falls out as the result of this single design decision.
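A rough sketch of the content-hash idea in Python, assuming a simple in-memory dictionary as the store. The `store` and `put` names are hypothetical, and SHA-256 is just a convenient choice here; this is not meant to mirror how any particular tool lays out its objects.

```python
import hashlib

# Hypothetical content-addressed store: hash of content -> content.
store = {}


def put(content: bytes) -> str:
    """Store content under the hex digest of its bytes and return that key."""
    key = hashlib.sha256(content).hexdigest()
    store[key] = content
    return key


a = put(b"diffuse map v1")
b = put(b"diffuse map v1")   # same content, so the same identifier
c = put(b"diffuse map v2")   # changed content, so a new identifier
```

Anything that referred to `a` keeps pointing at the old content; to see the new version, the reference itself has to be updated to `c`.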
Names and GUIDs both have their pros and cons, making it hard to say that one is strictly better than the other:
| Names | IDs |
|---|---|
| Fragile: if objects are renamed, moved or deleted, references will break | Unreadable: references look like random numbers, which makes them hard to debug |
| Cumbersome: coming up with meaningful names for everything is a chore | |
Each of these points can be argued back-and-forth endlessly. Can’t we auto-assign names to make them easier to come up with? But how readable are names really if most of the things are just named box_723? Can’t we make a tool that looks up a readable name from a GUID? Can’t we also make a tool that automatically patches references when an object is renamed? Etc, etc, etc.
Again, it’s easy to get stuck in the nitty-gritty details of this and miss the bigger picture. To make things clearer, let’s take a step back and ask ourselves:
What is the fundamental difference between names and GUIDs?
Think about it for a bit. Here’s my answer:
A GUID specifies an object identity, but a name specifies an object’s role.
The GUID 90e2294e-9daf-45f0-b75b-01fb85bb6dc8 always refers to one specific object — the one single object in the universe with that GUID. The path head/left_eye refers to whatever object is currently acting as the character’s left eye. It does not always have to be the same object. Maybe the character loses her eye at some point and it gets replaced with a glass eye. Maybe we can spawn multiple instances of the character in different configurations with different kinds of eyes — flesh eyes, robot eyes, anime eyes, etc. Regardless of the setup, head/left_eye will refer to the character’s left eye.
In contrast, if we used a GUID to refer to the left eye and the eye got replaced, the GUID would still refer to the old eye we lost. And a single GUID couldn’t be used to refer to different eyes in different character setups.
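The eye example can be sketched in a few lines of Python. Everything here (`Obj`, `create`, the `by_guid` table and the `head` dictionary) is a hypothetical stand-in for whatever the real data model looks like; the point is only to show the two kinds of reference diverging after a replacement.

```python
import uuid


class Obj:
    def __init__(self, kind):
        self.guid = uuid.uuid4()   # assigned once at creation, never changes
        self.kind = kind


by_guid = {}   # GUID -> object (lookup by identity)
head = {}      # name -> object currently filling that role


def create(kind):
    obj = Obj(kind)
    by_guid[obj.guid] = obj
    return obj


flesh_eye = create("flesh eye")
head["left_eye"] = flesh_eye

guid_ref = flesh_eye.guid   # a reference by identity
name_ref = "left_eye"       # a reference by role

# The character loses her eye and it gets replaced with a glass eye.
glass_eye = create("glass eye")
head["left_eye"] = glass_eye

via_guid = by_guid[guid_ref]   # still the old flesh eye
via_name = head[name_ref]      # whatever is the left eye right now
```

After the swap, `via_guid.kind` is `"flesh eye"` while `via_name.kind` is `"glass eye"`. Neither answer is wrong; they are answers to different questions.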
The pointers and offsets that I talked about at the beginning of the post are similar to GUIDs, since they reference objects by identity. A pointer always points to the same object. In fact, you could see a pointer as a deserialized version of a GUID: a way of uniquely referencing an object in memory. Offsets, too, uniquely identify objects. (But offsets are not permanent, so references must be updated each time a file is saved.)
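One way to picture a pointer as a "deserialized GUID" is a loader that reads records containing GUID references and patches them into direct in-memory references. The sketch below assumes a made-up record layout (`guid`, `type`, `refs`) and uses short placeholder strings such as `"tex-1"` instead of real GUIDs to keep it readable.

```python
# Hypothetical on-disk records; references are stored as GUID strings.
records = [
    {"guid": "tex-1", "type": "texture", "refs": {}},
    {"guid": "mat-1", "type": "material", "refs": {"diffuse_map": "tex-1"}},
]


class Loaded:
    def __init__(self, guid, type_):
        self.guid = guid
        self.type = type_
        self.refs = {}   # field name -> direct reference to another Loaded


def load(records):
    # Pass 1: create every object so that all GUIDs can be looked up.
    objects = {r["guid"]: Loaded(r["guid"], r["type"]) for r in records}
    # Pass 2: turn each GUID reference into a direct object reference.
    for r in records:
        for field, target_guid in r["refs"].items():
            objects[r["guid"]].refs[field] = objects[target_guid]
    return objects


objects = load(records)
material = objects["mat-1"]
texture = material.refs["diffuse_map"]   # a plain in-memory reference now
```

Saving would run the same mapping in reverse, writing each referenced object's GUID back out in place of the in-memory reference.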