Satisfactory: Network Optimizations

July 18, 2019

Hi everyone!

Ever wondered how much work might go into the seemingly small and relatively vague bullet points on the update list? In particular, the ones related to network optimizations?

Well, either way, I’m here to tell you a little about it. It’s actually a very big and pretty complicated topic, even though the results can be described in relatively simple terms.

The Problem

As you all probably know, Satisfactory is a game about managing and handling tons of buildings and resources. This creates a fairly unique problem that most multiplayer games don’t have to deal with, as most of them only need to keep track of a few other players (or, nowadays, about 100 other players) and whatever comes out the other end of their guns. That is very different from handling a base with over 2000 conveyors transporting thousands upon thousands of items every second. On top of that, those items move in clearly visible patterns, so it is easy to see when they end up wrong. For us this means there is less room for simplification than when, for example, a player 100 meters away happens to aim their gun 10 degrees wrong.

This is just a portion of a base, but pretty much everything you can see here has states that change over time. These states need to replicate for all players simultaneously.

The Solutions

There are essentially two ways to solve this. Solution A would be to fully synchronize the whole map to all clients up front, and then run a local simulation on every client’s machine. This keeps everything 100% predictable at all times, and only sends and syncs information when unpredictable events, such as player input, happen.

Solution B would instead run everything primarily on the server while clients send their input to the server, which then replicates its state to all clients, similar to how most shooters handle it.

Both methods have their pros and cons. So let’s dive a little bit deeper and explain which solution we are actually using, and why.

So, as we said, solution A relies on simulating everything locally after the initial full sync. This requires us to keep everything 100% predictable/deterministic at all times, which in itself is a fair amount of work, but doable, and something we want to strive for either way. In this case, though, it would not only be a goal but an absolute requirement: the slightest inaccuracy could cause a chain reaction, resulting in the entire game state desynchronizing beyond repair for everyone. Keeping things deterministic for factories and such is one thing, but keeping the physics simulation for complex actors such as vehicles perfectly deterministic on all clients is really hard, if not impossible, without rewriting parts of the engine, meaning those would likely require a special workaround.

What makes this even more complicated, however, is the sending and syncing of information when unpredictable events, such as player input, happen. To make this work you would also have to run everything in lockstep. Lockstep ensures the game does not advance a frame before everyone confirms they have received the input messages from all players for the given frame, meaning a single player running behind causes stutters for all other players.
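
To make the idea concrete, here is a minimal lockstep sketch in C++ (all names here are hypothetical, not from our codebase): the simulation refuses to advance until input from every player has arrived for the current frame.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical input record for one player on one frame.
struct PlayerInput { uint32_t playerId; uint32_t buttons; };

class LockstepSim {
public:
    // Called when a (possibly late) input packet arrives.
    void ReceiveInput(uint32_t frame, const PlayerInput& input) {
        inputs[frame].push_back(input);
    }

    // Called once per tick; returns false when we must stall.
    bool TryAdvance(std::size_t playerCount) {
        auto it = inputs.find(currentFrame);
        // The whole simulation stalls until *every* player's input
        // for this frame has arrived: one laggy player stutters all.
        if (it == inputs.end() || it->second.size() < playerCount)
            return false;
        StepSimulation(it->second);  // must be fully deterministic
        inputs.erase(it);
        ++currentFrame;
        return true;
    }

private:
    void StepSimulation(const std::vector<PlayerInput>&) { /* ... */ }
    uint32_t currentFrame = 0;
    std::map<uint32_t, std::vector<PlayerInput>> inputs;
};
```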

An alternative to lockstep is requiring all players to keep copies of the entire game state for every frame up to about 2 seconds back in time. This enables them to rewind and simulate forward again when input from a laggy client arrives. The extra copies of the game state eat away at memory, while rewinding and re-simulating multiple frames within a single frame causes a lot of CPU overhead and potential performance spikes.
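
A sketch of that rewind-and-replay approach (again with hypothetical names), showing where the memory and CPU costs come from:

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical snapshot of the entire game state for one frame.
struct GameState { uint32_t frame = 0; /* world data ... */ };

class RollbackSim {
public:
    void Tick() {
        history.push_back(head);           // snapshot before advancing
        if (history.size() > kMaxHistory)  // keep ~2 s of frames
            history.pop_front();           // the memory cost lives here
        Step(head);                        // deterministic update
    }

    // A late input for 'frame' just arrived: rewind and replay.
    void OnLateInput(uint32_t frame) {
        uint32_t target = head.frame;      // where we were before
        // Drop snapshots taken at or after the late frame...
        while (!history.empty() && history.back().frame >= frame)
            history.pop_back();
        if (history.empty()) return;       // too old to repair
        head = history.back();
        // ...then re-simulate every frame since, now with the
        // corrected input applied (input storage omitted here).
        // Many Steps inside one real frame = the CPU spike.
        while (head.frame < target) Step(head);
    }

private:
    void Step(GameState& s) { ++s.frame; /* simulate ... */ }
    static constexpr std::size_t kMaxHistory = 120;  // ~2 s at 60 fps
    GameState head;
    std::deque<GameState> history;
};
```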

All of these problems are solvable, but they require a lot of work. They also make adding new features more difficult, as every new feature needs to account for all these cases and systems.

We didn’t choose this method. We instead went with solution B, which, like most multiplayer games, runs everything on the server and then sends and replicates a representation of that state to all clients when needed. But this requires us to be really smart with how we use the available bandwidth, since naively replicating everything would be impossible in a game like this.

So what we’ve been working with in the months since launch has been the following:

  • Minimizing what data is sent, by figuring out when data is actually needed and only sending it at that time. Things the player doesn’t notice are not needed.

  • Reducing how often data is sent. If things don’t change often, or are more predictable, the data can be sent less often.

  • Compressing and minimizing the amount of data needed to represent a game state (a toy sketch of this follows below).
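
As a toy illustration of two of these points (my own sketch, not the game’s actual encoding): an item’s progress along a belt only matters visually, so a coarse 8-bit value can stand in for a 32-bit float, and unchanged values don’t need to be sent at all.

```cpp
#include <algorithm>
#include <cstdint>

// An item's progress along a conveyor is a float in [0, 1]. Clients
// only need it for smooth visuals, so 256 steps are plenty: that is
// a quarter of the bytes of the raw float.
uint8_t QuantizeProgress(float progress01) {
    progress01 = std::clamp(progress01, 0.0f, 1.0f);
    return static_cast<uint8_t>(progress01 * 255.0f + 0.5f);
}

float DequantizeProgress(uint8_t q) {
    return q / 255.0f;
}

// "Only send when needed": remember the last value a client saw and
// skip the write entirely when nothing would change on their end.
bool ShouldSend(uint8_t newValue, uint8_t& lastSentValue) {
    if (newValue == lastSentValue) return false;
    lastSentValue = newValue;
    return true;
}
```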

On top of all this, there is the problem that the server has to evaluate all the objects separately for each client, since which data is needed, and how it is prioritized, differs for every client. This means a lot more work for the server, which can lead to greater CPU load, and that load needs to be minimized to keep the performance of the server as good as possible.
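
A rough sketch of what that per-client evaluation might look like (hypothetical names and scoring, not our actual code): every connected client gets its own ranking pass and its own byte budget, which is exactly where the extra server CPU cost comes from.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical per-object record, built separately per client.
struct RepEntry {
    float distance;       // from this client's player to the object
    int   framesStarved;  // frames since we last sent it to this client
    int   objectId;
};

// For ONE client: rank everything, then spend this client's byte
// budget on the most relevant objects. The server repeats this for
// every connected client, every network update.
std::vector<int> PickObjectsToSend(std::vector<RepEntry> entries,
                                   int budgetBytes, int bytesPerObject) {
    std::sort(entries.begin(), entries.end(),
              [](const RepEntry& a, const RepEntry& b) {
                  // Closer objects and longer-starved objects first.
                  float pa = a.framesStarved / (1.0f + a.distance);
                  float pb = b.framesStarved / (1.0f + b.distance);
                  return pa > pb;
              });
    std::vector<int> picked;
    for (const RepEntry& e : entries) {
        if (budgetBytes < bytesPerObject) break;
        budgetBytes -= bytesPerObject;
        picked.push_back(e.objectId);
    }
    return picked;
}
```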

If these things are not done properly, the server can end up performing much worse than the clients. Clients can experience a lot of extra latency (much greater than the actual ping), there can be a lot of dropped packets (we will explain what this means shortly), things start to get choppy, and things might look strange in general for the clients. Essentially all kinds of issues can occur.

To what extent this can be addressed or fixed varies greatly from case to case, so to give a better understanding of these concepts I plan to dive into a few concrete examples of improvements we have made for this patch.

Terms

To get us all on the same page, let’s cover a few common and maybe less common terms that we are going to use.

  • Packet: A bunch of data sent over the network as a chunk. This is how any and all information is sent between clients and servers.

  • Packet loss: When data is not received or sent properly, resulting in holes in the intended data, which can cause all kinds of issues if not handled.

  • Ping: The time it takes to send a data packet to another PC and get a response back. Usually measured in milliseconds (ms).

  • Latency: The time it takes from initiation of an action until it’s actually performed/the user gets feedback of it happening.

  • Send Rate: How much data is aimed to be sent per second.

  • Send Buffer: A chunk of memory that is written to, which the PC will read from and consume when sending packets.

  • Bufferbloat: When too much data is written to the buffer, making it take too long to consume and work through the data. Think of it like a queue. Sometimes this leads to an extreme latency increase.

  • Buffer Overflow: When you try to write too much data to a buffer, so some of it doesn’t fit and actually gets dropped. Can result in what seems like dropped packets.

  • Replication: The act of replicating the state of something on a client. It’s often not the whole state, but it’s enough to give the correct impression of the server’s state.

  • Serialization: The act of taking a complex data structure and writing it into a linear stream of data, which can be sent in a packet and later turned back into a data structure (a short sketch follows after this list).
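
To make serialization a bit more tangible, here is a minimal sketch (the struct and its fields are made up for illustration):

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical building state we might want to put in a packet.
struct MinerState {
    uint32_t netId;
    uint16_t outputItemCount;
    uint8_t  isProducing;  // bool, packed as a single byte
};

// Serialize: flatten the struct into a linear byte stream...
void Write(std::vector<uint8_t>& stream, const MinerState& s) {
    const uint8_t* bytes = reinterpret_cast<const uint8_t*>(&s);
    stream.insert(stream.end(), bytes, bytes + sizeof(MinerState));
}

// ...and deserialize: rebuild the struct on the receiving side.
// (A real protocol would also handle byte order, versioning and
// struct padding instead of copying raw memory like this.)
MinerState Read(const uint8_t* data) {
    MinerState s;
    std::memcpy(&s, data, sizeof(MinerState));
    return s;
}
```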

We will use a few other terms as well, but let’s cover those as we encounter them.

What we’ve done

So, when starting on this update we had a pretty big plate of things to fix. There were many issues with the net code at launch, resulting in all kinds of problems imaginable. We’ve attempted to solve most of them now, but there are a few we plan to follow up on again a little later. Soon™.

The first and biggest overarching problem was simply the raw amount of data that needed to be sent, which grew larger as the factory grew. Large bases and factories would cause an always-full send buffer, meaning bufferbloat, and even worse, possibly buffer overflow.

To understand what a bloated buffer means for the game, think of the buffer as a queue that data needs to go through before it is actually sent. A full/long buffer means data has to wait around longer before it goes out, thus increasing latency, even for our highest priority messages (like player input and movement). Ideally, a responsive game writes little enough data that it is all consumed before the next update, so new packets can start writing at the start of the buffer (the queue) and the newest data from the latest frame gets through as soon as possible.
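
The latency cost of a backlog is easy to put numbers on. A toy model (the figures are illustrative, not measurements from the game):

```cpp
#include <cstddef>
#include <cstdio>

// The send buffer is a FIFO: everything already queued must be
// transmitted before the newest packet, so the fill level of the
// buffer translates directly into added latency.
double QueuingDelayMs(std::size_t bufferedBytes, double linkBytesPerSec) {
    return 1000.0 * bufferedBytes / linkBytesPerSec;
}

int main() {
    // 64 KB of backlog on a 1 Mbit/s uplink (125000 bytes/s):
    std::printf("%.0f ms\n", QueuingDelayMs(64 * 1024, 125000.0));
    // Prints ~524 ms, added on top of the real network ping, even
    // for the highest-priority message at the back of the queue.
}
```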

So, our number one goal was to reduce the amount of data sent during gameplay as much as possible. To describe that work we can take a look at two particular cases that perfectly encapsulate the issues we had and the work we’ve been doing.

Only sending what can be seen

First off we have a low-hanging fruit with huge gains: all the factories and buildings with inventories.

With the initial replication the whole building was treated as one object, meaning that if we replicated any part of it, like the production indicator lamp, we needed to replicate all of it. In turn this meant we ended up having all buildings’ inventories replicated to all clients at all times, even if they could not be seen.

For some buildings this was not a big issue, as their state rarely changed (meaning no data, or at least very little, was needed). But most buildings have their inventory connected to a conveyor for input and output, meaning it is likely to change all the time; many times a second, in many cases.

As you can see in the menu for this building, there is an inventory that constantly changes and updates. This applies to every building that stores or produces items and is connected to one or more Conveyor Belts. That’s a lot of data!
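
The fix follows directly from the observation: only replicate the heavy, fast-changing inventory to clients who can actually see it. A simplified sketch of the idea (hypothetical class and method names, not our actual implementation):

```cpp
#include <cstdint>
#include <unordered_set>

// Instead of treating the building as one monolithic replicated
// object, the frequently changing inventory is split out and only
// sent to clients that are actually looking at it.
class FactoryBuilding {
public:
    // Called when a client opens or closes this building's UI.
    void SetUIOpen(uint32_t clientId, bool open) {
        if (open) viewers.insert(clientId);
        else      viewers.erase(clientId);
    }

    // Cheap state (e.g. the production indicator lamp) always
    // replicates; the heavy inventory only when someone can see it.
    bool ShouldReplicateInventoryTo(uint32_t clientId) const {
        return viewers.count(clientId) != 0;
    }

private:
    std::unordered_set<uint32_t> viewers;  // clients with the UI open
};
```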
