Getting started with Unity DOTS — Part 2: C# Job System

Nikolay Karagyozov
10 min read · May 25, 2020

In Part One we discussed the Entities preview package. However, to achieve its performance, DOTS also takes advantage of multithreading. The C# Job System package lets you write multithreaded code to distribute your workload on multiple cores. It also features tools to help you avoid common threading problems. Using it in combination with the Burst compiler can lead to improvements in both performance and battery life (on mobile devices).

Multithreading basics

A CPU core can process only one instruction at a time. To do that, it needs an execution context called a thread. So, if our game runs on a single thread, everything is processed sequentially.

However, modern day processors are multi-core. That means we can have a separate thread for every core. By using multithreading in our games, we distribute calculations across those threads and take advantage of parallel computation across our cores.

Main thread vs. Background thread

One thread runs our game by default. It is called the main thread. The main thread can spawn background threads (also known as worker threads), which run in parallel and usually synchronize their results with the main thread when finished.

However, computations in games are typically small, so constantly spawning background threads and destroying them when they finish can be expensive.

Thread pooling

To solve this issue we can spawn the threads we need and keep them in a pool. When we need to compute something, we pick an available thread from the pool. When it is done, we can use it again. This approach is a design pattern known as thread pooling.
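Outside Unity, the .NET runtime already ships such a pool. A minimal sketch using only the standard library (the event is just there so the main thread can wait for the pooled worker):

```csharp
using System.Threading;

class PoolDemo
{
    static void Main()
    {
        using var done = new ManualResetEventSlim(false);

        // The work item runs on a reusable worker thread from the
        // runtime's pool instead of a freshly spawned thread.
        ThreadPool.QueueUserWorkItem(_ =>
        {
            // ... do the computation here ...
            done.Set();
        });

        done.Wait(); // main thread blocks until the pooled worker finishes
    }
}
```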

This solution is not always optimal. If the number of threads in our pool is greater than the number of cores on our user’s system, threads will compete for CPU resources.

Context switching

This competition causes context switching. Whenever we have more threads than cores, we are bound to have a core that is used by at least two threads.

Because a single core can run only one thread at a time, in order to accommodate two, we have to constantly switch between them. Every time we do this, we save the current execution context, load the execution context of the other thread, run it for a while and then repeat the same process. This is known as context switching.

This is a resource-intensive operation and should be avoided unless necessary.

Race conditions

Race conditions are bugs caused by multiple threads accessing the same data at the same time. They are prevented by introducing locking mechanisms into our system.

Example: Say we have two threads and a single global integer Num with a value of 10. We want both threads to add 1 to Num, but we have no locking mechanism in place, which means we have a race condition in our code. Ideally, our threads would increment Num sequentially. The issue arises when both threads read the value of Num at the same time: each separately computes Num += 1 and ends up with Num = 11, instead of 12.
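This lost update can be sketched in plain C# (the class and variable names are illustrative):

```csharp
using System.Threading;

class RaceDemo
{
    static int Num = 10;

    static void Main()
    {
        // Num += 1 is a read-modify-write: both threads may read 10
        // before either one writes, so one increment can be lost.
        var t1 = new Thread(() => Num += 1);
        var t2 = new Thread(() => Num += 1);
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();

        // Num is usually 12 here, but 11 is possible.
        // Interlocked.Increment(ref Num) (or a lock) guarantees 12.
    }
}
```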

Race conditions are difficult to reproduce and debug, because they depend on timing. Thankfully, the C# Job System has built-in tools to detect them.

What is a Job System?

A job system allows for multithreaded code by creating jobs instead of threads.

It manages a pool of worker threads across many cores. Usually, that is one worker per logical CPU core.

All jobs are put into a job queue. Worker threads then pick jobs from the queue and execute them.

Some jobs might depend on others. The Job System ensures jobs are executed in the right order.

Note: The C# Job System is integrated with Unity’s internal native job system. Your code and Unity’s internal code share the same worker threads. Having a single pool to work with means you will not create an excessive number of threads and cause unnecessary context switching. This is essential for performance.

What is a job?

A job is a small unit of work that performs a specific task. It receives parameters and operates on data. Jobs can depend on other jobs to complete before they run.

The safety system

The C# Job System has the ability to detect all potential race conditions in your code.

Managed vs. Unmanaged memory

Managed memory gets automatically freed up by a garbage collector when there are no references to it.

Unmanaged memory (or native memory) is not garbage collected. Therefore, you must manually deallocate your resources after you are done with them.

Blittable types

Blittable types have the same memory layout in both managed and unmanaged code.

Race condition safety

How does the Job System resolve race conditions between its threads?

Problem: The Job System schedules a job that works with a reference to data on the main thread. The job starts modifying the data while the main thread is reading it. This creates a race condition: if the job is in the middle of modifying that data when the main thread reads it, the data is in an invalid state.

Solution: Jobs always work with copies of the original data, so race conditions are avoided. The Job System copies the data from managed to unmanaged memory using memcpy, which means it only supports blittable types.

Native Containers

Jobs work with copies of data. So how do we get our result from a job? The answer is native containers. A native container is a managed value type that contains a pointer to an unmanaged allocation in memory. It lets us access data shared with the main thread directly, rather than copying it.

With ECS you have access to:

  • NativeArray
  • NativeList
  • NativeHashMap
  • NativeMultiHashMap
  • NativeQueue

NativeContainers and the safety system

Native containers allow direct access to main thread data. However, that means they are subject to race conditions. For that reason, a safety system is built into all native container types.

The safety system is capable of detecting memory leaks and race conditions.

The DisposeSentinel class automatically detects memory leaks and reports them to the user.

The AtomicSafetyHandle provides validation and full safety for read / write permissions to prevent race conditions.

Example: Two jobs write to the same native array. AtomicSafetyHandle would throw an exception and also give details on how to solve the problem.

If you need two different jobs to write to the same native array, you can do so with a dependency. By making one job dependent on the other, they will execute sequentially.

Note: The safety system allows for multiple jobs to be reading from the same data.

Similarly to Entities.ForEach parameters, you should mark native containers as read-only when you are not writing to them, to improve performance:

[ReadOnly]
public NativeArray<int> input;

Allocators

When creating a native container you must specify the allocator you need. There are three allocator types:

  • Allocator.Temp — the fastest allocation. Its lifespan is one frame. You should not pass containers with this allocation to jobs.
  • Allocator.TempJob — slower than Allocator.Temp, but it has a lifespan of four frames and is thread-safe. Most small jobs use this.
  • Allocator.Persistent — the slowest allocation, but its lifespan is unlimited. It is a wrapper for malloc, and longer-lived jobs can use it. However, you should not use it when performance is essential.

Creating a native container

NativeArray<int> arr = new NativeArray<int>(1, Allocator.TempJob);

Creating Jobs

To create a job you need to:

  1. Create a struct that implements IJob .
  2. Add member variables (blittable types or native containers).
  3. Implement the Execute() method.

public struct AddJob : IJob
{
    public float a;
    public float b;
    public NativeArray<float> result;

    public void Execute()
    {
        result[0] = a + b;
    }
}

Scheduling jobs

To schedule a job you must:

  1. Instantiate the job.
  2. Populate its data.
  3. Call the Schedule() method from the main thread.

var result = new NativeArray<float>(1, Allocator.TempJob);

var job = new AddJob();
job.a = 10;
job.b = 20;
job.result = result;

JobHandle handle = job.Schedule();

// Wait for the job to complete
handle.Complete();

float aPlusB = result[0];
result.Dispose();

Job dependencies

You can make a job dependent on another with a JobHandle:

JobHandle firstJobHandle = firstJob.Schedule();
secondJob.Schedule(firstJobHandle);

Combining dependencies

You can make a job dependent on multiple others as well:

var deps = new NativeArray<JobHandle>(numJobs, Allocator.TempJob);

// Populate deps...

JobHandle job = JobHandle.CombineDependencies(deps);

Waiting for jobs in the main thread

You can use job.Complete() to return ownership of native containers to the main thread. This means the main thread can safely access the native container that the job was using.

Note: Jobs do not execute immediately after you schedule them. Calling Complete() flushes the schedule batch and starts execution. Calling Complete() on a job also returns ownership of the native containers in its dependencies to the main thread.

Example: Computing a * b + 1 using jobs.

Job code

public struct MultiplyJob : IJob
{
    public float a;
    public float b;
    public NativeArray<float> result;

    public void Execute()
    {
        result[0] = a * b;
    }
}

public struct AddOneJob : IJob
{
    public NativeArray<float> result;

    public void Execute()
    {
        result[0] = result[0] + 1;
    }
}

Main thread code

var result = new NativeArray<float>(1, Allocator.TempJob);

var multiplyJob = new MultiplyJob();
multiplyJob.a = 10;
multiplyJob.b = 20;
multiplyJob.result = result;
var multiplyHandle = multiplyJob.Schedule();

var addOneJob = new AddOneJob();
addOneJob.result = result;
var addOneHandle = addOneJob.Schedule(multiplyHandle);

addOneHandle.Complete();

float answer = result[0];
result.Dispose();

Jobs vs ParallelFor jobs

A normal job, once scheduled, executes on a single background thread and works in parallel with other jobs. This is what happens when you call .Schedule() on Entities.ForEach() in ECS.

However, you can also schedule a ParallelFor job. This type of job splits its workload into batches and executes on multiple background threads in parallel. It is useful for processing large arrays of data and is the equivalent of writing .ScheduleParallel() when using Entities.ForEach(). You can think of it as a parallel .map().

ParallelFor jobs use a native array as their data source and run across multiple cores, one job per core. They call Execute() for each entry in the array.

Example definition

public struct IncrementJob : IJobParallelFor
{
    public NativeArray<int> values;

    public void Execute(int index)
    {
        var temp = values[index];
        temp += 1;
        values[index] = temp;
    }
}

ParallelFor job internals (batching & stealing)

ParallelFor jobs divide their work into batches and distribute them across multiple cores. When you schedule one, you must specify the length of the data source; the work is then split and a separate job is scheduled for each core.

When one of those smaller jobs finishes, it steals half of the remaining batches from the other jobs. This preserves cache locality within each batch while making the most use of your processor.

You can control the batch size, i.e. how many elements go into a single batch. To determine the right batch size, consider the amount of work the job does per element. For small jobs, like adding up vectors, batch sizes between 32 and 128 are appropriate. For more expensive jobs, a batch size of 1 is fine. Smaller batch sizes ensure a more even distribution of work between worker threads.

Tip: To determine the most appropriate batch size you can start at one and keep increasing until there are negligible performance gains.

ParallelFor job example

This example sums two arrays element-wise in parallel.

Job code

public struct ElementWiseSumJob : IJobParallelFor
{
    [ReadOnly]
    public NativeArray<float> a;
    [ReadOnly]
    public NativeArray<float> b;
    public NativeArray<float> result;

    public void Execute(int i)
    {
        result[i] = a[i] + b[i];
    }
}

Main thread code

var a = new NativeArray<float>(2, Allocator.TempJob);
var b = new NativeArray<float>(2, Allocator.TempJob);
var result = new NativeArray<float>(2, Allocator.TempJob);

a[0] = 100;
a[1] = 200;
b[0] = 1;
b[1] = 2;

var jobData = new ElementWiseSumJob();
jobData.a = a;
jobData.b = b;
jobData.result = result;

var batchSize = 1;
var handle = jobData.Schedule(result.Length, batchSize);
handle.Complete();

a.Dispose();
b.Dispose();
result.Dispose();

ParallelForTransform jobs

This is a type of ParallelFor job that is specifically designed to work with Unity’s Transforms.
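As a sketch of how this looks, an IJobParallelForTransform receives a TransformAccess per element and is scheduled against a TransformAccessArray (the job and field names below are illustrative):

```csharp
using UnityEngine;
using UnityEngine.Jobs;

public struct MoveUpJob : IJobParallelForTransform
{
    public float deltaY;

    // Called once per transform in the TransformAccessArray.
    public void Execute(int index, TransformAccess transform)
    {
        transform.position += new Vector3(0f, deltaY, 0f);
    }
}

// Main thread, assuming 'transforms' is a populated TransformAccessArray:
// JobHandle handle = new MoveUpJob { deltaY = 0.1f }.Schedule(transforms);
// handle.Complete();
```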

C# Job System tips and patterns

Make sure you follow these guidelines when using the C# Job System.

Do not access static data from a job

Accessing static data from a job circumvents all safety systems and may lead to crashes.

Flush scheduled batches

Waking up worker threads can be expensive. That is why the job system intentionally delays job execution. You can wake workers up by calling JobHandle.ScheduleBatchedJobs() . It’s best to call this either when you have a lot of scheduled jobs or when you do a lot of work between scheduling jobs (so they execute between schedules).
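A sketch of the pattern (jobs, handles and DoOtherMainThreadWork are hypothetical placeholders):

```csharp
// Schedule a batch of jobs, then flush so workers start
// while the main thread keeps doing other work.
for (int i = 0; i < jobs.Length; i++)
    handles[i] = jobs[i].Schedule();

JobHandle.ScheduleBatchedJobs(); // wake the worker threads now

DoOtherMainThreadWork();         // jobs execute in the background meanwhile

for (int i = 0; i < handles.Length; i++)
    handles[i].Complete();
```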

Note: In ECS the batch is implicitly flushed for you!

Don’t try to update NativeContainer contents

A NativeContainer’s indexer returns a copy of the element (a struct), so you cannot modify it in place. Instead, read, modify, and write back:

MyStruct temp = myNativeArray[i]; // read
temp.memberVariable = 0;          // modify
myNativeArray[i] = temp;          // write

Call JobHandle.Complete to regain ownership

You must call JobHandle.Complete() to return ownership of NativeContainer types to the main thread. Checking JobHandle.IsCompleted is not enough: JobHandle.Complete() also cleans up the safety system’s state, and skipping it would introduce a memory leak.

Use Schedule and Complete in the main thread

You can only call Schedule and Complete from the main thread.

Use Schedule and Complete at the right time

Call Schedule on a job as soon as you have the data it needs and don’t call Complete on it until you need the results.

Mark NativeContainer types as read-only

Use the [ReadOnly] attribute to improve performance.

Debugging jobs

Jobs have a Run() function that you can use in place of Schedule to immediately execute the job on the main thread. You can use this for debugging purposes.
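For example, the AddJob from earlier could be executed synchronously like this (a debugging sketch, not how you would ship it):

```csharp
var result = new NativeArray<float>(1, Allocator.TempJob);
var job = new AddJob { a = 10, b = 20, result = result };

// Run() executes immediately on the main thread, so breakpoints
// and stack traces behave like ordinary single-threaded code.
job.Run();

float sum = result[0];
result.Dispose();
```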

Do not allocate managed memory in jobs

Unity’s Burst compiler is capable of compiling your jobs to noticeably improve their performance. However, it does not work for jobs that allocate managed memory. Moreover, allocating this type of memory is slow so you should avoid doing it in jobs.

Conclusion

This concludes the tutorial on the C# Job System. In the last part of this series we will take a look at the Burst compiler.

DOTS: Part 1— Entities

DOTS: Part 3 — Burst compiler

Resources
