Embedded Systems/RTOS Implementation

Embedded Systems

The chapters in this section will discuss some of the general concepts involved in writing your own Real-Time Operating System. Readers may be able to read and understand the material in these pages without prior knowledge in operating system design and implementation, but a background knowledge in those subjects would certainly be helpful.

Memory Management

An important point to remember is that some embedded systems are locked away and expected to run for years on end without being rebooted. If we use conventional memory-management schemes to control memory allocation, we can end up with fragmented memory which can take valuable time to defragment and really is a major problem for tasks that are time-sensitive. This page then will talk about how to implement a memory management scheme in an RTOS, and will talk through to a basic implementation of malloc( ) and free( ).

There are a variety of ways to deal with memory:

Some systems never do a malloc() or free() -- all memory is allocated at compile time.
Some systems use malloc() and free() with manual garbage collection.
- With manual garbage collection, it's possible to fragment memory so badly that one day the system locks up because there is no one piece of memory large enough for a reasonably-large malloc() request, although the total size of the many free pieces of memory add up to far more than that the requested malloc(), which would be bad.
- A small, almost unnoticeable bug could slowly leak memory until the system ran out of memory and locked up, which would be bad.
Some early automatic garbage collection schemes did a "stop the world" for several seconds during garbage collection and/or memory defragmentation. Such a system could miss real-time deadlines, which would be bad.
Some later automatic garbage collection schemes do "incremental" garbage collection and memory defragmentation.
Many real-time systems allocate memory one block at a time from a pool of fixed-size memory blocks. This entirely eliminates external fragmentation.

What is a Task

Embedded systems have a microprocessor connected to some piece of hardware (LEDs, buttons, limit switches, motors, serial port(s), battery chargers, etc.).

Each piece of hardware is generally associated with a little bit of software, called a "task". For example, "Check the keyboard and figure out which (if any) key has been pressed since the last check". Or "Check the current position of the spindle, and update the PID".

Often a task has a real-time limits, such as

the motors must be shut off within 1/10 second after hitting the limit switch to avoid permanent damage
the PID loop must be updated at least every 1/100 second to avoid oscillation
the MP3 player must decode a new sample at 44.1 kHz—no faster, or it sounds chipmunk-like—no slower, or it sounds like it's underwater.

Some embedded systems have only one task.

Other embedded systems have a single microcontroller connected to many different pieces of hardware—they need to "multi-task".

task communication and synchronization

Most RTOSes have some way to allow tasks to communicate and synchronize with each other, usually one or more of:

message passing (often highest-priority-message first, rather than first-in first-out)
event flags
mutual exclusion mechanisms: serializing tokens, mutex, semaphore, monitor, etc.

What is the Scheduler

The "task scheduler" (or often "scheduler") is the part of the software that schedules (chooses) which task to run next.

The scheduler is arguably the most difficult component of an RTOS to implement. Schedulers maintain a table of the current state of each task on the system, as well as the current priority of each task. The scheduler needs to manage the timer too.

In general, there are 3 states that a task can be in:

Active. There can be only 1 active thread on a given processor at a time.
Ready. This task is ready to execute, but is not currently executing.
Blocked. This task is currently waiting on a lock or a critical section to become free.

Some systems even allow for other states:

Sleeping. The task has voluntarily given up control for a certain period of time.
Low-Priority. This task only runs when all other tasks are blocked or sleeping.

There are 2 ways the scheduler is called:

the current task voluntarily yield()s to the scheduler, calling the scheduler directly, or
the current task has run "long enough", the timer hardware interrupts it, and the timer interrupt routine calls the scheduler.

The scheduler must save the current status of the current task (save the contents of all registers to memory associated with that task), it must look through the list of tasks to find the highest priority task in the Ready state, and then must switch control back to that task (by restoring all register values from memory associated with the new task).

The scheduler should first check to ensure that it is enabled. If the scheduler is disabled, it shouldn't preempt the current thread. This can be accomplished by checking a current global flag value. Some functions will want to disable the scheduler, so this flag should be accessible by some accessor method. An alternate method to maintaining a global flag is simply to say that any function that wants to disable the scheduler can simply disable the timer. This way the scheduler never gets called, and never has to check any flags.

A few RTOSes disable the process scheduler during the entire duration of system calls—to avoid missing deadlines, this requires that system calls finish very quickly. Other RTOSes have preempt-able system calls; they only disable the process scheduler and other interrupts for extremely short "critical sections".

We will talk more about those functions that need to (briefly) disable the scheduler, inside their critical sections, in the next chapter, Locks and Critical Sections.

Another book Operating System Design/Scheduling Processes/Preemption goes into more detail on various scheduler algorithms.

Interrupts

One main difference between an RTOS and other operating systems is that a RTOS attempts to minimize interrupt latency—the response time to external hardware. This requires minimizing the amount of time that interrupts (including the timer interrupt) are disabled. Some RTOS vendors publish worst-case interrupt disable times.