The kernel's role

The house analogy is excellent for getting across the concept of synchronization, but it falls down in one major area. In our house, we had many threads running simultaneously. In a real, live system, however, there's typically only one CPU, so only one thing can run at once.

Single CPU

Let's look at what happens in the real world, and specifically, the economy case where we have one CPU in the system. In this case, since there's only one CPU present, only one thread can run at any given point in time. The kernel decides, using a number of rules, which thread to run, and runs it.

Multiple CPU (SMP)

If you buy a system that has multiple, identical CPUs all sharing memory and devices, you have an SMP box (SMP stands for Symmetrical Multi Processor, with the symmetrical part indicating that all the CPUs in the system are identical). In this case, the number of threads that can run concurrently (simultaneously) is limited by the number of CPUs. (In reality, this was the case with the single-processor box too!) Since each processor can execute only one thread at a time, with multiple processors, multiple threads can execute simultaneously.

Let's ignore the number of CPUs present for now—a useful abstraction is to design the system as if multiple threads really were running simultaneously, even if that's not the case. In Things to watch out for when using SMP, we'll see some of the non-intuitive impacts of SMP.

The kernel as arbiter

So who decides which thread is going to run at any given instant in time? That's the kernel's job. The kernel determines which thread should be using the CPU at a particular moment, and switches context to that thread. Let's examine what the kernel does with the CPU.

The CPU has a number of registers (the exact number depends on the processor family, for example, x86 versus ARM, and the specific family member, for example, 80486 versus Pentium). When the thread is running, information is stored in those registers (for example, the current program location).

When the kernel decides that another thread should run, it needs to:

  1. Save the currently running thread's registers and other context information
  2. Load the new thread's registers and context into the CPU
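The two steps above can be shown with a toy model. The real register set is processor-specific and the actual switch is done in assembly; here the "CPU" is just a struct so the save/load steps can be written in plain C. All names are illustrative, not kernel APIs:

```c
#include <string.h>

/* Hypothetical per-thread context; real register sets vary by processor. */
typedef struct thread_context {
    unsigned long pc;          /* program counter: where the thread resumes */
    unsigned long sp;          /* stack pointer */
    unsigned long regs[16];    /* general-purpose registers */
} thread_context_t;

static thread_context_t cpu;   /* stand-in for the physical register file */

void context_switch(thread_context_t *save_to, const thread_context_t *load_from)
{
    memcpy(save_to, &cpu, sizeof cpu);    /* 1. save the running thread's context */
    memcpy(&cpu, load_from, sizeof cpu);  /* 2. load the new thread's context */
}
```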

But how does the kernel decide that another thread should run? It looks at whether or not a particular thread is capable of using the CPU at this point. When we talked about mutexes, for example, we introduced a blocking state (this occurred when one thread owned the mutex, and another thread wanted to acquire it as well; the second thread would be blocked).

From the kernel's perspective, therefore, we have one thread that can consume CPU, and one that can't, because it's blocked, waiting for a mutex. In this case, the kernel lets the thread that can run consume CPU, and puts the other thread into an internal list (so that the kernel can track its request for the mutex).

Obviously, that's not a very interesting situation. Suppose that a number of threads can use the CPU. Remember that we delegated access to the mutex based on priority and length of wait? The kernel uses a similar scheme to determine which thread is going to run next. There are two factors: priority and scheduling policy, evaluated in that order.

Prioritization

Consider two threads capable of using the CPU. If these threads have different priorities, then the answer is really quite simple: the kernel gives the CPU to the highest-priority thread. BlackBerry 10 OS's priorities start at one (the lowest usable) and go up, as we mentioned when we talked about obtaining mutexes. Note that priority zero is reserved for the idle thread—you can't use it. (If you want to know the minimum and maximum values for your system, use the functions sched_get_priority_min() and sched_get_priority_max()—they're prototyped in <sched.h>. We'll assume one as the lowest usable, and 255 as the highest.)
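For example, here's a quick way to check the actual range on your target using the two POSIX calls just mentioned (the values reported depend on the OS you run it on):

```c
#include <sched.h>
#include <stdio.h>

/* Query the usable priority range for a scheduling policy.  On
   BlackBerry 10 OS this is typically 1..255; other POSIX systems
   report different ranges. */
int min_prio(int policy) { return sched_get_priority_min(policy); }
int max_prio(int policy) { return sched_get_priority_max(policy); }

void print_range(void)
{
    printf("SCHED_RR priorities: %d..%d\n",
           min_prio(SCHED_RR), max_prio(SCHED_RR));
}
```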

If another thread with a higher priority suddenly becomes able to use the CPU, the kernel immediately context-switches to the higher priority thread. We call this preemption; the higher-priority thread preempted the lower-priority thread. When the higher-priority thread is done, and the kernel context-switches back to the lower-priority thread that was running before, we call this resumption; the kernel resumes running the previous thread.
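A thread's priority is typically requested at creation time through its attributes. A minimal sketch using the standard POSIX attribute calls (actually running at an elevated priority may require appropriate privileges; building the attribute object does not):

```c
#include <pthread.h>
#include <sched.h>

/* Prepare thread attributes requesting an explicit policy and priority.
   Pass the resulting attr to pthread_create(); the new thread then
   starts at that priority instead of inheriting the creator's. */
int prepare_sched_attr(pthread_attr_t *attr, int policy, int priority)
{
    struct sched_param param = { .sched_priority = priority };

    pthread_attr_init(attr);
    /* Don't inherit the creating thread's scheduling settings. */
    pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    pthread_attr_setschedpolicy(attr, policy);
    return pthread_attr_setschedparam(attr, &param);
}
```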

Now, suppose that two threads are capable of using the CPU and have the exact same priority.

Scheduling policies

Let's assume that one of the threads is currently using the CPU. We'll examine the rules that the kernel uses to decide when to context-switch in this case. (Of course, this entire discussion really applies only to threads at the same priority — the instant that a higher-priority thread is ready to use the CPU it gets it; that's the whole point of having priorities in a realtime operating system.)

The two main scheduling policies that the QNX Neutrino microkernel understands are Round Robin (or just RR) and FIFO (First-In, First-Out). There's also sporadic scheduling; see Sporadic scheduling.

FIFO

In the FIFO scheduling policy, a thread is allowed to consume CPU for as long as it wants. This means that if that thread is doing a very long mathematical calculation, and no other thread of a higher priority is ready, that thread could potentially run forever. What about threads of the same priority? They're locked out as well. (It should be obvious at this point that threads of a lower priority are locked out too.)

If the running thread quits or voluntarily gives up the CPU, then the kernel looks for other threads at the same priority that are capable of using the CPU. If there are no such threads, then the kernel looks for lower-priority threads capable of using the CPU. Note that the term voluntarily gives up the CPU can mean one of two things. If the thread goes to sleep, or blocks on a semaphore, and so on, then yes, a lower-priority thread could run (as described above). But there's also a special call, sched_yield() (based on the kernel call SchedYield()), which gives up the CPU only to another thread of the same priority—a lower-priority thread would never be given a chance to run if a higher-priority thread was ready to run. If a thread does in fact call sched_yield(), and no other thread at the same priority is ready to run, the original thread continues running. Effectively, sched_yield() is used to give another thread of the same priority a crack at the CPU.
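As a sketch, a FIFO-scheduled worker doing a long computation can sprinkle sched_yield() calls through its loop so that same-priority peers aren't starved (the function and the yield interval here are illustrative):

```c
#include <sched.h>

/* Long-running computation that periodically offers the CPU to other
   READY threads at the same priority.  If no same-priority thread is
   READY, sched_yield() returns immediately and the loop continues. */
long busy_sum(int iterations)
{
    long sum = 0;
    for (int i = 0; i < iterations; i++) {
        sum += i;
        if (i % 1000 == 0)
            sched_yield();  /* give same-priority threads a crack at the CPU */
    }
    return sum;
}
```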

In the following diagram, we see three threads operating in two different processes:

Diagram showing three threads operating in two different processes.

If we assume that threads A and B are READY, and that thread C is blocked (perhaps waiting for a mutex), and that thread D (not shown) is currently executing, then this is what a portion of the READY queue that the QNX Neutrino microkernel maintains looks like:

Diagram showing the READY queue.

This shows the kernel's internal READY queue, which the kernel uses to decide who to schedule next. Note that thread C is not on the READY queue, because it's blocked, and thread D isn't on the READY queue either because it's running.
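Conceptually, the READY queue is a set of FIFO queues, one per priority level, and the scheduler always takes the head of the highest non-empty one. A hypothetical sketch of that structure (this is not the kernel's actual implementation):

```c
#include <stddef.h>

#define NUM_PRIO 256   /* priorities 0 (idle) through 255 */

typedef struct thread {
    int prio;
    struct thread *next;
} thread_t;

static thread_t *head[NUM_PRIO], *tail[NUM_PRIO];

/* Enqueue a thread at the tail of its priority's READY queue. */
void make_ready(thread_t *t)
{
    t->next = NULL;
    if (tail[t->prio])
        tail[t->prio]->next = t;
    else
        head[t->prio] = t;
    tail[t->prio] = t;
}

/* Dequeue the highest-priority READY thread: by the rules, it runs next. */
thread_t *pick_next(void)
{
    for (int p = NUM_PRIO - 1; p >= 0; p--) {
        if (head[p]) {
            thread_t *t = head[p];
            head[p] = t->next;
            if (head[p] == NULL)
                tail[p] = NULL;
            return t;
        }
    }
    return NULL;  /* nothing READY */
}
```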

Round robin

The round robin (RR) scheduling policy is identical to FIFO, except that the thread does not run forever if there's another thread at the same priority. It runs only for a system-defined timeslice whose value you can determine by using the function sched_rr_get_interval(). The timeslice is usually 4 ms, but it's actually 4 times the ticksize, which you can query or set with ClockPeriod().
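For example, you can read the timeslice with the POSIX call mentioned above (the value reported depends on the system; on BlackBerry 10 OS it's derived from the ticksize as described):

```c
#include <sched.h>
#include <stdio.h>
#include <time.h>

/* Fetch the round-robin timeslice for the calling thread/process
   (pid 0 means "myself").  Returns 0 on success. */
int rr_timeslice(struct timespec *ts)
{
    return sched_rr_get_interval(0, ts);
}

void print_timeslice(void)
{
    struct timespec ts;
    if (rr_timeslice(&ts) == 0)
        printf("RR timeslice: %ld.%09ld s\n",
               (long)ts.tv_sec, ts.tv_nsec);
}
```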

What happens is that the kernel starts an RR thread and notes the time. If the RR thread runs for a while, the time allotted to it is used up (the timeslice expires). The kernel then looks to see if there is another thread at the same priority that's ready. If there is, the kernel runs it. If not, the kernel continues running the RR thread (that is, the kernel grants the thread another timeslice).

The rules

Let's summarize the scheduling rules (for a single CPU), in order of importance:

  • Only one thread can run at a time.
  • The highest-priority ready thread runs.
  • A thread runs until it blocks or exits.
  • An RR thread runs for its timeslice, and then the kernel reschedules it (if required).

The following flowchart shows the decisions that the kernel makes:

Diagram showing the scheduling roadmap.

For a multiple-CPU system, the rules are the same, except that multiple CPUs can run multiple threads concurrently. The order that the threads run (that is, which threads get to run on the multiple CPUs) is determined in the exact same way as with a single CPU — the highest-priority READY thread runs on a CPU. For lower-priority or longer-waiting threads, the kernel has some flexibility as to when to schedule them to avoid inefficiency in the use of the cache. For more information about SMP, see Multicore Processing.

Kernel states

We've been talking about running, ready, and blocked loosely—let's now formalize these thread states.

RUNNING
BlackBerry 10 OS's RUNNING state means that the thread is now actively consuming the CPU. On an SMP system, there are multiple threads running; on a single-processor system, there is one thread running.
READY
The READY state means that this thread could run right now—except that it's not, because another thread (at the same or higher priority) is running. If two threads were capable of using the CPU, one thread at priority 10 and one thread at priority 7, the priority 10 thread would be RUNNING, and the priority 7 thread would be READY.
The blocked states
What do we call the blocked state? The problem is, there's not just one blocked state. Under BlackBerry 10 OS, there are in fact over a dozen blocking states.

Why so many? Because the kernel keeps track of why a thread is blocked.

We saw two blocking states already—when a thread is blocked waiting for a mutex, the thread is in the MUTEX state. When a thread is blocked waiting for a semaphore, it's in the SEM state. These states indicate which queue (and which resource) the thread is blocked on.

If a number of threads are blocked on a mutex (in the MUTEX blocked state), they get no attention from the kernel until the thread that owns the mutex releases it. At that point one of the blocked threads is made READY, and the kernel makes a rescheduling decision (if required).

Why if required? The thread that just released the mutex could very well still have other things to do and have a higher priority than that of the waiting threads. In this case, we go to the second rule, which states, The highest-priority ready thread runs, meaning that the scheduling order has not changed—the higher-priority thread continues to run.
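The sequence can be sketched with POSIX threads: the worker blocks in the MUTEX state while main holds the lock, becomes READY when main unlocks, and then runs. (The brief sleep only makes it likely the worker reaches the lock first; it's illustrative, not a synchronization technique.)

```c
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int shared;

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);    /* blocks (MUTEX state) while main owns it */
    shared++;
    pthread_mutex_unlock(&lock);
    return NULL;
}

int run_demo(void)
{
    pthread_t t;

    pthread_mutex_lock(&lock);             /* main owns the mutex */
    pthread_create(&t, NULL, worker, NULL);
    usleep(1000);                          /* worker is now likely MUTEX-blocked */
    pthread_mutex_unlock(&lock);           /* worker becomes READY, then runs */
    pthread_join(t, NULL);
    return shared;                         /* 1 once the worker has run */
}
```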

Kernel states, the complete list

Here's the complete list of kernel blocking states, with brief explanations of each state. This list is available in <sys/neutrino.h>—notice that the states are all prefixed with STATE_ (for example, READY in this table is listed in the header file as STATE_READY):

If the state is: The thread is:
CONDVAR Waiting for a condition variable to be signaled.
DEAD Dead. Kernel is waiting to release the thread's resources.
INTR Waiting for an interrupt.
JOIN Waiting for the completion of another thread.
MUTEX Waiting to acquire a mutex.
NANOSLEEP Sleeping for a period of time.
NET_REPLY Waiting for a reply to be delivered across the network.
NET_SEND Waiting for a pulse or message to be delivered across the network.
READY Not running on a CPU, but ready to run (one or more threads of higher or equal priority are running).
RECEIVE Waiting for a client to send a message.
REPLY Waiting for a server to reply to a message.
RUNNING Actively running on a CPU.
SEM Waiting to acquire a semaphore.
SEND Waiting for a server to receive a message.
SIGSUSPEND Waiting for a signal.
SIGWAITINFO Waiting for a signal.
STACK Waiting for more stack to be allocated.
STOPPED Suspended (SIGSTOP signal).
WAITCTX Waiting for a register context (usually floating point) to become available (only on SMP systems).
WAITPAGE Waiting for process manager to resolve a fault on a page.
WAITTHREAD Waiting for a thread to be created.

The important thing to keep in mind is that when a thread is blocked, regardless of which state it's blocked in, it consumes no CPU. Conversely, the only state in which a thread consumes CPU is the RUNNING state.

We see the SEND, RECEIVE, and REPLY blocked states in Message passing. The NANOSLEEP state is used with functions like sleep(). For more information, see Clocks, timers, and getting a kick every so often. The INTR state is used with InterruptWait().