Embedded Systems/Locks and Critical Sections

Embedded Systems

An important part of an RTOS is the lock mechanisms, and the Critical Section (CS) implementation. This section will talk about some of the problems involved in creating these mechanisms.

Basic Critical Sections

Wikipedia has related information at critical section

Most embedded systems have at least one data structure that is written by one task and read by another task. With a preemptive scheduler, it is all too easy to write software that *seems* to work fine most of the time, but occasionally the writer will be interrupted right in the middle of updating the data structure, the RTOS switches to the reader task, and then the reader chokes on the inconsistent data.

We need some way of arranging things so that a writer's modification appears "atomic"—a reader always sees only the (consistent) old version, or the (consistent) new version, never some partially-modified inconsistent state.

There are a variety of ways to avoid this problem, including:

Design the data structure so that the writer can update it in such a way that it is always in a consistent state. This requires hardware that supports atomic primitives powerful enough to update the data structure from one consistent state to the next consistent state in one atomic operation. Wikipedia:lock-free and wait-free algorithms. For example, the read twice and compare algorithm we discuss elsewhere.
Have the writer turn off the task scheduler while it is updating the data structure. Then the only time the reader could possibly see the data structure, the data structure is in a consistent state.
Have the writer turn off all interrupts (including the timer interrupt that kicks off the task scheduler) while it is updating the data structure. Then the only time the reader could possibly see the data structure, the data structure is in a consistent state. But this makes interrupt latency much worse.
Use a "lock" associated with each data structure. When the reader sees that the writer is updating the data structure, have the reader tell the task scheduler to run some other process until the writer is finished. (There are many kinds of locks).
Use a "monitor" associated with every routine that uses a data structure.

Whenever a lock or CS mechanism is called, it is important that the RTOS disable the scheduler, to prevent the atomic operations from being preempted and executed incorrectly. Remember that embedded systems need to be stable and robust, so we cannot risk having the operating system itself being preempted while it is trying to create a lock or critical section. If we have a function called DisableScheduler( ), we can call that function to disable the scheduler before any atomic operation is attempted, and we can then have a function called EnableScheduler( ) to restore the scheduler, and continue with normal operation.

Let us now create a general function for entering a critical section:

 EnterCS()
 {
    DisableScheduler();
    return;
 }

and one for exiting a critical section:

 ExitCS()
 {
    EnableScheduler();
    return;
 }

By disabling the scheduler during the critical section, we have guaranteed that no preemptive task-switching will occur during the critical section.

This method has a disadvantage that it slows down the system, and prevents other time-sensitive tasks from running. Next, we will show a method by which critical sections can be implemented to allow for preemption.

Critical Section Objects

Critical Sections, like any other term in computing can have a different definition than simply an operation that prevents preemption. For instance, many systems define a CS as being an object that prevents multiple tasks from entering a given section of code. Let us say that we are implementing a version of malloc( ) on our system. We want to ensure that once a memory allocation attempt has started, that no other memory allocation attempts can begin. Only 1 memory allocation attempt can be happening at one time. However, we want to allow for the malloc function to be preempted just like every other function. To implement this, we need a new data object called a CRITICAL_SECTION, or CRIT_X, or something of that nature. Our malloc function will now look like this:

 CRIT_SECT mallocCS; //a global CS variable, for use in all tasks.
  
 int RTOS_main(void) //we register our CS in the beginning of the RTOS main routine
 {
    AllocCS(mallocCS);  //register our critical section with the OS, to prevent duplicates
    ...
  
 void *malloc(size_t size)
 {
    void *ptr;
    EnterCS(mallocCS); //we enter the CS, and no other instance of malloc can enter it.
    ptr = FindFreeMemory(size);
    ExitCS(mallocCS);  //other malloc attempts can now proceed
    return ptr;
 }

If two tasks call malloc at nearly the same time, the first one will enter the critical section, while the second one will wait, or be "blocked" at the EnterCS routine. When the first malloc is finished, the second malloc's EnterCS function will return, and the function will continue.

To allow other processes looking at other data structures to continue even though this data structure has been locked, EnterCS() is typically redefined something like:

  // non-blocking attempt to enter critical section
  int TryEnterCS( CRIT_SECT this )
  {
    int success = 0;
    DisableScheduler();
        if( this->lock == 0 ){
            this->lock = 1; // mark structure as locked
            success = 1;
        };
    EnableScheduler();
    return success;
  }
  
  // blocking attempt to enter critical section
  EnterCS( CRIT_SECT this ){
    int success = 0;
    do{
        success = TryEnterCS( this->lock );
        if( !success ){ Yield(); }// tell scheduler to run some other task for a while.
    }while( !success );
    return;
  }
  
  // release lock
  ExitCS( CRIT_SECT this )
  {
    ASSERT( 1 == this->lock );
    this->lock = 0;
    return;
  }

The value to creating a Critical Section object, and using that to prevent preemption of sensitive areas is that this scheme does not slow down the system, the way the first scheme does (by disabling the scheduler, and preventing other tasks from executing).

Wikipedia has related information at serializing tokens

Some OSes, such as Dragonfly BSD, implement EnterCS() and ExitCS() using "serializing tokens", in such a way that when a process attempts to get a lock on another data structure, the OS briefly releases all locks that process holds, before giving that process a lock on all requested locks.

Mutexes

This page of the Embedded Systems book is a stub. You can help by expanding this section.