Getting started with Unity DOTS — Part 2: C# Job System
In Part One we discussed the Entities preview package. However, to achieve its performance, DOTS also takes advantage of multithreading. The C# Job System package lets you write multithreaded code to distribute your workload on multiple cores. It also features tools to help you avoid common threading problems. Using it in combination with the Burst compiler can lead to improvements in both performance and battery life (on mobile devices).
Multithreading basics
A CPU core is capable of processing only one instruction at a time. To do that, it needs an execution context called a thread. So, if our game runs on a single thread, everything is processed sequentially.
However, modern day processors are multi-core. That means we can have a separate thread for every core. By using multithreading in our games, we distribute calculations across those threads and take advantage of parallel computation across our cores.
Main thread vs. Background thread
One thread runs our game by default. It is called the main thread. The main thread can spawn background threads (also known as worker threads), which run in parallel and usually synchronize their results with the main thread when finished.
However, computations in games are typically small, so constantly spawning background threads and destroying them when they finish can be expensive.
Thread pooling
To solve this issue we spawn the threads we need once and keep them in a pool. When we need to compute something, we pick an available thread from the pool; when it finishes, we return it to the pool for reuse. This approach is a design pattern known as thread pooling.
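Outside Unity, plain .NET exposes this pattern directly. The sketch below uses the standard System.Threading.ThreadPool (not part of the Job System) to queue work onto pooled threads instead of spawning new ones:

```csharp
using System;
using System.Threading;

class ThreadPoolDemo
{
    static void Main()
    {
        using (var done = new CountdownEvent(4))
        {
            for (int i = 0; i < 4; i++)
            {
                int workItem = i; // capture a copy for the closure
                // Queue work onto an existing pooled thread instead of creating a new one.
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Console.WriteLine($"Work item {workItem} ran on pooled thread {Thread.CurrentThread.ManagedThreadId}");
                    done.Signal();
                });
            }
            done.Wait(); // block until all queued work items have finished
        }
    }
}
```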
This solution is not always optimal. If the number of threads in our pool is greater than the number of available cores on our user's system, threads will compete for CPU resources.
Context switching
This competition causes context switching. Whenever we have more threads than cores, we are bound to have a core that is used by at least two threads.
Because a single core can run only one thread at a time, in order to accommodate two, we have to constantly switch between them. Every time we do this, we save the current execution context, load the execution context of the other thread, run it for a while, and then repeat the process. This is known as context switching.
This is a resource-intensive operation and should be avoided unless necessary.
Race conditions
Race conditions are bugs caused by multiple threads accessing the same data. They are resolved by introducing locking mechanisms to our system.
Example: Say we have two threads and a single global integer Num with a value of 10. We want both of our threads to add 1 to Num, but we do not have any locking mechanism in place. This means we have a race condition in our code. In an optimal scenario our threads would increment Num sequentially, yielding 12. The issue arises when both threads read the value of Num at the same time: each separately computes Num += 1 and writes back a result of Num = 11, instead of 12.
Race conditions are difficult to reproduce and debug, because they depend on timing. Thankfully, the C# Job System has built in tools to detect them.
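The race described above can be sketched in plain .NET (outside Unity). Without synchronization, two threads can both read the same value of num before either writes back, losing one of the increments; a lock makes the read-modify-write atomic:

```csharp
using System.Threading;

class RaceDemo
{
    static int num = 10;
    static readonly object gate = new object();

    static void UnsafeIncrement()
    {
        num += 1; // read-modify-write: not atomic, racy without a lock
    }

    static void SafeIncrement()
    {
        lock (gate) // only one thread at a time may enter this block
        {
            num += 1;
        }
        // Alternatively: Interlocked.Increment(ref num);
    }
}
```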
What is a Job System?
A job system allows you to write multithreaded code by creating jobs instead of threads.
It manages a pool of worker threads across many cores. Usually, that is one worker per logical CPU core.
All jobs are put into a job queue. Worker threads then pick jobs from the queue and execute them.
Some jobs might depend on others. The Job System ensures jobs are executed in the right order.
Note: The C# Job System is integrated with Unity’s internal native job system. Your code and Unity’s internal code share the same worker threads. Having a single pool to work with means you will not create an excessive number of threads and cause unnecessary context switching. This is essential for performance.
What is a job?
A job is a small unit of work that performs a specific task. It receives parameters and operates on data. Jobs can depend on other jobs to complete before they run.
The safety system
The C# Job System has the ability to detect all potential race conditions in your code.
Managed vs. Unmanaged memory
Managed memory gets automatically freed up by a garbage collector when there are no references to it.
Unmanaged memory (or native memory) is not garbage collected. Therefore, you must manually deallocate your resources after you are done with them.
Blittable types
Blittable types have the same memory layout in both managed and unmanaged code.
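As an illustration (these example structs are mine, not from the original article): a value type made up only of primitive fields is blittable, while anything containing a managed reference type is not.

```csharp
public struct BlittablePoint   // blittable: only primitive fields
{
    public float x;
    public float y;
}

public struct NotBlittable     // not blittable: string is a managed reference type
{
    public float x;
    public string label;
}
```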
Race condition safety
How does the Job System resolve race conditions between its threads?
Problem: The Job System schedules a job that works with a reference to data on the main thread. The job starts modifying the data while the main thread is reading it. This creates a race condition: if the job is in the middle of modifying that data when the main thread reads it, the data would be in an invalid state.
Solution: Jobs always work with copies of the original data. That way race conditions are avoided. The Job System copies the data from managed to unmanaged memory using memcpy, which means it only supports blittable types.
Native Containers
Jobs work with copies of data, so how do we get our results back from a job? The answer is native containers. A native container is a managed value type that contains a pointer to an unmanaged allocation in memory. It allows jobs to access data shared with the main thread directly, rather than working with a copy.
With ECS you have access to:
- NativeArray
- NativeList
- NativeHashMap
- NativeMultiHashMap
- NativeQueue
NativeContainers and the safety system
Native containers allow for direct access to main thread data. However, that means they are subject to race conditions. For that reason, a safety system is built into all native container types.
The safety system is capable of detecting memory leaks and race conditions.
The DisposeSentinel class automatically detects memory leaks and reports them to the user.
The AtomicSafetyHandle validates read/write permissions to prevent race conditions.
Example: If two jobs write to the same native array, AtomicSafetyHandle throws an exception that includes details on how to solve the problem.
If you need two different jobs to write to the same native array, you can do so with a dependency. By making one job dependent on the other, they will execute sequentially.
Note: The safety system allows for multiple jobs to be reading from the same data.
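A hedged sketch of two writers made safe by a dependency (WriteValueJob is a hypothetical job I've defined for illustration; it is not from the original article):

```csharp
using Unity.Collections;
using Unity.Jobs;

public struct WriteValueJob : IJob
{
    public float value;
    public NativeArray<float> target;

    public void Execute()
    {
        target[0] = value; // both instances write to the same array
    }
}

// Main thread: chain the second writer on the first's handle so they
// run sequentially instead of racing.
// var data = new NativeArray<float>(1, Allocator.TempJob);
// var first  = new WriteValueJob { value = 1f, target = data }.Schedule();
// var second = new WriteValueJob { value = 2f, target = data }.Schedule(first);
// second.Complete();
// data.Dispose();
```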
Similarly to Entities.ForEach parameters, you should mark native containers you do not write to as read-only to improve performance, like so:
[ReadOnly]
public NativeArray<int> input;
Allocators
When creating a native container you must specify the memory allocation type you need. There are three different allocator types:
- Allocator.Temp — the fastest allocation. Its lifespan is one frame. You should not pass containers with this allocation to jobs.
- Allocator.TempJob — slower than Allocator.Temp, but it has a lifespan of four frames and is thread-safe. Most small jobs use this.
- Allocator.Persistent — the slowest allocation, but its lifespan is unlimited. It is a wrapper for malloc, and longer-running jobs can use it. However, you should not use it where performance is essential.
Creating a native container
NativeArray<int> arr = new NativeArray<int>(1, Allocator.TempJob);
Creating Jobs
To create a job you need to:
- Create a struct that implements IJob.
- Add member variables (blittable types or native containers).
- Implement the Execute() method.
public struct AddJob : IJob
{
    public float a;
    public float b;
    public NativeArray<float> result;

    public void Execute()
    {
        result[0] = a + b;
    }
}
Scheduling jobs
To schedule a job you must:
- Instantiate the job.
- Populate its data.
- Call the Schedule() method from the main thread.
var result = new NativeArray<float>(1, Allocator.TempJob);

var job = new AddJob();
job.a = 10;
job.b = 20;
job.result = result;

JobHandle handle = job.Schedule();

// Wait for the job to complete
handle.Complete();

float aPlusB = result[0];
result.Dispose();
Job dependencies
You can make a job dependent on another with a JobHandle:
JobHandle firstJobHandle = firstJob.Schedule();
secondJob.Schedule(firstJobHandle);
Combining dependencies
You can make a job dependent on multiple others as well:
var deps = new NativeArray<JobHandle>(numJobs, Allocator.TempJob);
// Populate deps...
JobHandle job = JobHandle.CombineDependencies(deps);
Waiting for jobs in the main thread
You can use job.Complete() to return ownership of native containers to the main thread. This means the main thread can safely access the native containers that the job was using.
Note: Jobs do not execute immediately after you schedule them. Calling Complete() flushes the memory cache and starts the process of execution. Calling Complete() on a job also returns ownership of the native containers in its dependencies to the main thread.
Example: Computing a * b + 1 using jobs.
Job code
public struct MultiplyJob : IJob
{
    public float a;
    public float b;
    public NativeArray<float> result;

    public void Execute()
    {
        result[0] = a * b;
    }
}

public struct AddOneJob : IJob
{
    public NativeArray<float> result;

    public void Execute()
    {
        result[0] = result[0] + 1;
    }
}
Main thread code
var result = new NativeArray<float>(1, Allocator.TempJob);

var multiplyJob = new MultiplyJob();
multiplyJob.a = 10;
multiplyJob.b = 20;
multiplyJob.result = result;

var multiplyHandle = multiplyJob.Schedule();

var addOneJob = new AddOneJob();
addOneJob.result = result;

var addOneHandle = addOneJob.Schedule(multiplyHandle);
addOneHandle.Complete();

float answer = result[0];
result.Dispose();
Jobs vs ParallelFor jobs
When you schedule a normal job, it executes on a single background thread, in parallel with other jobs. This is what happens when you call .Schedule() on Entities.ForEach() in ECS.
However, you can also schedule a ParallelFor job. The difference is that this type of job splits its workload into batches and executes on multiple background threads in parallel. This is useful for processing large arrays of data and is the equivalent of writing .ScheduleParallel() on Entities.ForEach(). You can think of it as a parallel .map().
ParallelFor jobs use a native array as their data source and run across multiple cores, one job per core. They call Execute() for each entry in the array.
Example definition
public struct IncrementJob : IJobParallelFor
{
    public NativeArray<int> values;

    public void Execute(int index)
    {
        var temp = values[index];
        temp += 1;
        values[index] = temp;
    }
}
ParallelFor job internals (batching & stealing)
ParallelFor jobs divide their work into batches and distribute them between multiple cores. When you schedule them, you must specify the length of the data source; after the work is split, a separate job is scheduled for each core.
When one of those smaller jobs finishes, it steals half of the remaining batches from the other jobs. This ensures cache locality for each job while making the most of your processor.
You can control the batch size, that is, how many elements go into a single batch. To determine the correct batch size, consider the amount of work the job does per element. For small jobs, like adding up vectors, batch sizes between 32 and 128 are appropriate. For more expensive jobs, a batch size of 1 is fine. Small batch sizes ensure a more even distribution of work between worker threads.
Tip: To determine the most appropriate batch size you can start at one and keep increasing until there are negligible performance gains.
ParallelFor job example
This example computes the element-wise sum of two float arrays, one pair of elements per Execute() call.
Job code
public struct ElementWiseSumJob : IJobParallelFor
{
    [ReadOnly]
    public NativeArray<float> a;
    [ReadOnly]
    public NativeArray<float> b;
    public NativeArray<float> result;

    public void Execute(int i)
    {
        result[i] = a[i] + b[i];
    }
}
Main thread code
var a = new NativeArray<float>(2, Allocator.TempJob);
var b = new NativeArray<float>(2, Allocator.TempJob);
var result = new NativeArray<float>(2, Allocator.TempJob);

a[0] = 100;
a[1] = 200;
b[0] = 1;
b[1] = 2;

var jobData = new ElementWiseSumJob();
jobData.a = a;
jobData.b = b;
jobData.result = result;

var batchSize = 1;
var handle = jobData.Schedule(result.Length, batchSize);
handle.Complete();

a.Dispose();
b.Dispose();
result.Dispose();
ParallelForTransform jobs
This is a type of ParallelFor job that is specifically designed to work with Unity’s Transforms.
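A minimal sketch of such a job, assuming Unity's IJobParallelForTransform interface from the UnityEngine.Jobs namespace (the job name and deltaY field are my own illustration). It is scheduled over a TransformAccessArray rather than a NativeArray:

```csharp
using Unity.Jobs;
using UnityEngine;
using UnityEngine.Jobs;

public struct MoveUpJob : IJobParallelForTransform
{
    public float deltaY;

    // Called once per transform in the TransformAccessArray.
    public void Execute(int index, TransformAccess transform)
    {
        transform.position += new Vector3(0f, deltaY, 0f);
    }
}

// Main thread (e.g. inside a MonoBehaviour):
// var accessArray = new TransformAccessArray(myTransforms);
// var job = new MoveUpJob { deltaY = 0.1f };
// JobHandle handle = job.Schedule(accessArray);
// handle.Complete();
// accessArray.Dispose();
```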
C# Job System tips and patterns
Make sure you follow these guidelines when using the C# Job System.
Do not access static data from a job
Accessing static data from a job circumvents all safety systems and may lead to crashes.
Flush scheduled batches
Waking up worker threads can be expensive. That is why the Job System intentionally delays job execution. You can wake workers up by calling JobHandle.ScheduleBatchedJobs(). It's best to call this either when you have many scheduled jobs or when you do a lot of work between scheduling jobs (so they execute between schedules).
Note: In ECS the batch is implicitly flushed for you!
Don’t try to update NativeContainer contents
A NativeContainer indexer returns a copy of the struct it stores, so you cannot modify its contents in place. Instead, read the element, modify the copy, and write it back:
MyStruct temp = myNativeArray[i]; // read
temp.memberVariable = 0; // modify
myNativeArray[i] = temp; // write
Call JobHandle.Complete to regain ownership
You must call JobHandle.Complete() to return ownership of NativeContainer types to the main thread. Checking JobHandle.IsCompleted is not enough, because JobHandle.Complete() also cleans up the associated memory; skipping it would introduce a memory leak.
Use Schedule and Complete in the main thread
You can only call Schedule and Complete from the main thread.
Use Schedule and Complete at the right time
Call Schedule on a job as soon as you have the data it needs, and don't call Complete on it until you need the results.
Mark NativeContainer types as read-only
Use the [ReadOnly] attribute to improve performance.
Debugging jobs
Jobs have a Run() function that you can use in place of Schedule to execute the job immediately on the main thread. You can use this for debugging purposes.
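For example, the AddJob struct from earlier in this article can be run synchronously for debugging instead of being scheduled:

```csharp
var result = new NativeArray<float>(1, Allocator.TempJob);
var job = new AddJob { a = 10, b = 20, result = result };

job.Run();             // executes immediately on the main thread; no JobHandle needed
float sum = result[0]; // safe to read right away
result.Dispose();
```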
Do not allocate managed memory in jobs
Unity’s Burst compiler is capable of compiling your jobs to noticeably improve their performance. However, it does not work for jobs that allocate managed memory. Moreover, allocating this type of memory is slow so you should avoid doing it in jobs.
Conclusion
This concludes the tutorial on the C# Job System. In the last part of this series we will take a look at the Burst compiler.