Designing an entity system with threading in mind
At the core of Tharsis, as in many other entity systems, are Components and Processes (called Systems in many other frameworks). A Component is a simple data struct with no methods. A Process is a class with a process() method that Tharsis calls for every entity containing components matching its signature.
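To make the Component/Process split concrete, here is a minimal sketch in D. The type names and the exact process() signature are assumptions made for illustration; this is not the actual Tharsis API.

```d
// Hypothetical names throughout; not the real Tharsis API.

/// Components are plain data: no methods, no destructor, no owned memory.
struct PositionComponent
{
    float x, y;
}

struct VelocityComponent
{
    float x, y;
}

/// A process declares through the parameters of process() which
/// components it reads and which component it writes.
class MovementProcess
{
    /// Meant to be called once for every entity that has both a position
    /// and a velocity. The inputs are read-only; the result is written
    /// to `future`.
    void process(ref const PositionComponent pastPos,
                 ref const VelocityComponent pastVel,
                 out PositionComponent future)
    {
        future.x = pastPos.x + pastVel.x;
        future.y = pastPos.y + pastVel.y;
    }
}
```

The "signature" here is simply the parameter list: an entity matches if it has every component type the method asks to read.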
Components can’t have a copy constructor or a destructor. Tharsis generates code to load components, and moves them around by direct copying. Components can’t own any memory, but they can refer to Resources such as 3D models or sounds. I will likely write about resources in a future post.
The naive threaded approach is to run processes in separate threads, locking entities as they’re processed. Deadlocks are not an issue as long as entities are processed in the same order, but with large numbers of entities, locking overhead may negate the gains made by parallelization.
In the best case, threads never fight over an entity and we can expect overhead in the tens of nanoseconds per lock. With 30 processes and 5000 entities that adds up to 150,000 locks per game update, which even at 10 ns per lock costs 1.5 milliseconds or more; roughly 10% of an update at 60 FPS. In reality, threads will fight over data, making this overhead much greater and, worst of all, unpredictable.
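The estimate above is easy to reproduce. The helper functions below are hypothetical and assume the best case of 10 ns per uncontended lock, with every process locking every entity exactly once per update.

```d
/// Best-case locking cost per game update, in milliseconds.
/// Assumes every process locks every entity exactly once.
double lockOverheadMs(size_t processes, size_t entities, double nsPerLock)
{
    const locks = processes * entities;
    return locks * nsPerLock / 1_000_000.0;
}

/// Fraction of the frame time used up at the given frame rate.
double frameFraction(double overheadMs, double fps)
{
    const frameMs = 1000.0 / fps;
    return overheadMs / frameMs;
}
```

With 30 processes, 5000 entities and 10 ns per lock, lockOverheadMs gives 1.5 ms, and frameFraction at 60 FPS gives about 0.09 of the 16.7 ms frame.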
Tharsis avoids this by using immutable data; if we keep a copy of state from the previous update and never change it, we can read it without locking.
Past and future
Tharsis has a concept of past state; processes read components from the previous game update and generate future state, which will be the past in the next game update. We have two copies of all game state, similar to double buffering in graphics; past and future buffers are switched between updates, reusing memory. This may seem wasteful, but most memory in a game is usually used by resources such as textures and sounds, not by the game state itself.
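The buffer switch can be sketched as follows. This is a simplified illustration for a single, hypothetical component type; the struct and its members are assumptions, not how Tharsis actually stores its state.

```d
/// Hypothetical component type used only for this illustration.
struct HealthComponent
{
    int health;
}

/// Two buffers for one component type: one is read as the past,
/// the other is written as the future.
struct ComponentState
{
    HealthComponent[] past;    // read-only during an update
    HealthComponent[] future;  // written during an update
    size_t pastCount;          // number of valid components in `past`
    size_t futureCount;        // number of components written to `future`

    this(size_t capacity)
    {
        past   = new HealthComponent[capacity];
        future = new HealthComponent[capacity];
    }

    /// Called between updates: the future that was just written becomes
    /// the past, and the old past buffer is reused for the next future.
    void swapBuffers()
    {
        auto oldPast = past;
        past   = future;
        future = oldPast;
        pastCount   = futureCount;
        futureCount = 0;
    }
}
```

No memory is allocated after construction; the two arrays just trade roles every update.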
We still need to prevent processes from fighting over the future state they write. Tharsis allows at most one process to write the future state of any component type. This rule may seem limiting, but I found it straightforward to design processes with it in mind; in fact, it forces the code to be separated into a greater number of simpler, atomic processes. A greater process count also improves scalability, as we can utilize more cores.
Separating past and future also removes a common problem of component-based entity systems: the order in which processes run influencing game behavior.
For example, in a ‘traditional’ component-based entity system, process A multiplies matrices, process B changes health, and process C then starts with a different state than at the beginning of the game update. In Tharsis, a process cannot change the state read by other processes during an update, since all processes read the immutable past.
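A small sketch of why the order stops mattering. The component types, the two processes and the fixed damage value are all made up for this example; the point is only that both processes read the same past and each writes a different future component.

```d
// Hypothetical component types and processes, for illustration only.
struct HealthComponent { int health; }
struct ShieldComponent { int strength; }

/// State of a single entity, reduced to the two components used here.
struct EntityState
{
    HealthComponent health;
    ShieldComponent shield;
}

enum incomingDamage = 10;

/// Writes future health. Reads the shield as it was in the past.
void damageProcess(ref const EntityState past, ref EntityState future)
{
    future.health.health =
        past.health.health - (incomingDamage - past.shield.strength);
}

/// Writes future shield strength. Reads only the past.
void shieldDecayProcess(ref const EntityState past, ref EntityState future)
{
    future.shield.strength = past.shield.strength - 1;
}
```

Whether damageProcess or shieldDecayProcess runs first, the future state is identical. If both modified a single shared state in place, running shieldDecayProcess first would change the shield value that damageProcess sees.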
The past-future distinction has an interesting side effect: components can be removed by not copying them into future state. This lets us pack components tightly in arrays with no gaps, reducing cache misses.
For every process, code generated by Tharsis reads these arrays and passes components to the process. If a process writes future components, references to a future component array are also passed. A process can also decide not to write a future component for an individual entity; this can be used to remove components. After processing all entities, we have an array of future components with no wasted space. Note that the future components are written into the past buffer from the previous update, avoiding reallocations.
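The loop described above might look roughly like this. It is a hand-written stand-in for the generated code, with a hypothetical process that signals "do not write a future component" through its return value; the real mechanism in Tharsis may differ, and entity IDs and matching of multiple component types are left out.

```d
/// Hypothetical component type.
struct HealthComponent { int health; }

/// Hypothetical process. Returning false means no future component is
/// written for this entity, which removes the component.
bool processHealth(ref const HealthComponent past, out HealthComponent future)
{
    if (past.health <= 0) { return false; }
    future.health = past.health - 1;
    return true;
}

/// Stand-in for the generated code: reads the packed past array and
/// writes future components one after another, leaving no gaps.
/// Returns the number of future components written.
size_t runProcess(const(HealthComponent)[] past, HealthComponent[] future)
{
    assert(future.length >= past.length);
    size_t written = 0;
    foreach (ref pastComponent; past)
    {
        // The process writes directly into the next free future slot.
        // The slot is only kept if the process returned true.
        if (processHealth(pastComponent, future[written]))
        {
            ++written;
        }
    }
    return written;
}
```

Components that were not written simply never take up a slot, so the first `written` elements of the future array are always tightly packed.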
MultiComponents are component types that allow multiple components of that type per entity. They are passed to processes using D array slices. A MultiComponent type must specify the maximum number of components per entity so that Tharsis can preallocate enough space. MultiComponents would not be viable without separate past and future; we would need to either insert new components in the middle of a buffer or give up on array storage. Since future components are created as a process sequentially processes entities, they are always added to the end of the buffer.
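Here is one way the append-only future buffer for a MultiComponent could work. Everything in this sketch is an assumption made for illustration: the way the maximum is declared, the buffer type, and the process signature are not taken from Tharsis.

```d
/// Hypothetical MultiComponent type.
struct WeaponComponent
{
    /// Assumed way of declaring the per-entity maximum.
    enum size_t maxPerEntity = 4;
    int damage;
}

/// Future buffer that is only ever appended to.
struct MultiBuffer
{
    WeaponComponent[] storage;  // preallocated: entityCount * maxPerEntity
    size_t used;                // components committed so far

    this(size_t entityCount)
    {
        storage = new WeaponComponent[entityCount * WeaponComponent.maxPerEntity];
    }

    /// Slice at the end of the buffer that a process may fill
    /// for the current entity.
    WeaponComponent[] nextSlice()
    {
        return storage[used .. used + WeaponComponent.maxPerEntity];
    }

    /// Keep the first `count` components of the slice just filled.
    void commit(size_t count)
    {
        assert(count <= WeaponComponent.maxPerEntity);
        used += count;
    }
}

/// Hypothetical process: keeps only weapons that still do damage.
/// Returns how many future components it wrote.
size_t processWeapons(const(WeaponComponent)[] past, WeaponComponent[] future)
{
    size_t count = 0;
    foreach (weapon; past)
    {
        if (weapon.damage > 0) { future[count++] = weapon; }
    }
    return count;
}
```

For each entity, the caller would pass the past slice and nextSlice() to the process and then commit() the returned count. The next entity's components start right after the previous entity's, so nothing is ever inserted in the middle of the buffer.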
In current games it’s common to use a small, fixed number of threads for game logic and background worker threads for easily threadable tasks. This works with a small, fixed number of cores (such as on a console), but doesn’t scale well to the many-core machines we’re likely to see in the future.
I hope that Tharsis will provide a way to program threaded games that scale to tens of CPU cores without manually managing threads. That said, there will always be some need for direct control over threads, especially with dependencies such as OpenGL and SDL. I plan to allow the user to force a process to run in a specific thread if needed.