Embedded Systems/Microprocessor Architectures

< Embedded Systems

The chapters in this section will discuss some of the basics in microprocessor architecture. They will discuss how many features of a microprocessor are implemented, and will attempt to point out some of the pitfalls (speed decreases and bottlenecks, specifically) that each feature represents to the system.


Memory BusEdit

In a computer, a processor is connected to the RAM by a data bus. The data bus is a series of wires running in parallel to each other that can send data to the memory, and read data back from the memory. In addition, the processor must send the address of the memory to be accessed to the RAM module, so that the correct information can be manipulated.

Multiplexed Address/Data BusEdit

In old microprocessors, and in some low-end versions today, the memory bus is a single bus that will carry both the address of the data to be accessed, and then will carry the value of the data. Putting both signals on the same bus, at different times is a technique known as "time division multiplexing", or just multiplexing for short. The effect of a multiplexed memory bus is that reading or writing to memory actually takes twice as long: half the time to send the address to the RAM module, and half the time to access the data at that address. This means that on a multiplexed bus, moving data to and from the memory is a very expensive (in terms of time) process, and therefore memory read/write operations should be minimized. It also makes it important to ensure algorithms which work on large datasets are cache efficient.

Demultiplexed BusEdit

The opposite of a multiplexed bus is a demultiplexed bus. A demultiplexed bus has the address on one set of wires, and the data on another set. This scheme is twice as fast as a multiplexed system, and therefore memory read/write operations can occur much faster.

Bus SpeedEdit

In modern high speed microprocessors, the internal CPU clock may move much faster than the clock that synchronizes the rest of the microprocessor system. This means that operations that need to access resources outside the processor (the RAM for instance) are restricted to the speed of the bus, and cannot go as fast as possible. In these situations, microprocessors have 2 options: They can wait for the memory access to complete (slow), or they can perform other tasks while they are waiting for the memory access to complete (faster). Old microprocessors and low-end microprocessors will always take the first option (so again, limit the number of memory access operations), while newer, and high-end microprocessors will often take the second option.

I/O BusEdit

Any computer, be it a large PC or a small embedded computer, is useless if it has no means to interact with the outside world. I/O communications for an embedded computer frequently happen over a bus called the I/O Bus. Like the memory bus, the I/O bus frequently multiplexes the input and output signals over the same bus. Also, the I/O bus is moving at a slower speed than the processor is, so a large numbers of I/O operations can cause a severe performance bottleneck. It is very important for measurable purposes for the entire system. It is not uncommon for different IO methods to have separate buses. Unfortunately, it is also not uncommon for the electrical engineers designing the hardware to cheat and use a bus for more than 1 purpose. Doing so can save the need for extra transistors in the layout, and save cost. For example, a project may use the USB bus to talk to some LEDs that are physically close by. These different devices may have very different speeds of communication. When programming IO bus control, make sure to take this into account.

In some systems, memory mapped IO is used. In this scheme, the hardware reads its IO from predefined memory addresses instead of over a special bus. This means you'll have simpler software, but it also means main memory will get more access requests.

Programming the IO BusEdit

When programming IO bus controls, there are 5 major variations on how to handle it- the main thread poll, the multithread poll, the interrupt method, the interrupt+thread method, and using a DMA controller.

Main thread pollEdit

In this method, whenever you have output ready to be sent, you check if the bus is free and send it. Depending on how the bus works, sending it can take a large amount of time, during which you may not be able to do anything else. Input works similarly- every so often you check the bus to see if input exists.


  • Simple to understand


  • Very inefficient, especially if you need to push the data manually over the bus (instead of via DMA)
  • If you need to push data manually, you are not doing anything else, which may lead to problem with real time hardware
  • Depending on polling frequency and input frequency, you could lose data by not handling it fast enough

In general, this system should only be used if IO only occurs at infrequent intervals, or if you can put it off when there are more important things to do. If your system supports multithreading or interrupts, you should use other techniques instead.

Multithread pollingEdit

In this method, we spawn off a special thread to poll. If there is no IO when it polls, it puts itself back to sleep for a predefined amount of time. If there is IO, it deals with it on the IO thread, allowing the main thread to do whatever is needed.


  • Does not put off the main thread
  • Allows you to define the importance of IO by changing the priority of the thread


  • Still somewhat inefficient
  • If IO occurs frequently, your polling interval may be too small for you to sleep sufficiently, starving other threads
  • If your thread is too low in priority or there are too many threads for the OS to wake the thread in a timely fashion, data can be lost.
  • Requires an OS capable of threading

This technique is good if your system supports threading, but does not support interrupts or has run out of interrupts. It does not work well when frequent IO is expected- the OS may not properly sleep the thread if the interval is too small, and you will be adding the overhead of 2 context switches per poll.

Interrupt architectureEdit

(The interrupt architecture uses interrupts, which we discuss in more detail in chapter Embedded Systems/Interrupts).

In this method, the bus fires off an interrupt to the processor whenever IO is ready. The processor then jumps to a special function, dropping whatever else it was doing. The special function (called an interrupt handler, or interrupt service routine) takes care of all IO, then goes back to whatever it was doing.


  • Very efficient
  • Very simple, requires only 1 function


  • If dealing with IO takes a long time, you can starve other things. This is especially dangerous if your handler masks interrupts, which can cause you to miss hardware interrupts from real time hardware.
  • If your handler takes so long that more input is ready before you handle existing input, data can be lost.

This technique is great as long as dealing with the IO is a short process, such as when you just need to set up DMA. If its a long process, use multithreaded polling or interrupts with threads.

Interrupts and threadsEdit

We discuss this technique in more detail in Embedded Systems/Interrupts

In this technique, you use an interrupt to detect when IO is ready. Instead of dealing with the IO directly, the interrupt signals a thread that IO is ready and lets that thread deal with it. Signalling the thread is usually done via semaphore- the semaphore is initialized to the taken state. The IO thread tries to take the semaphore, which fails and the OS puts it to sleep. When IO is ready, the interrupt is fired and releases the semaphore. The thread then wakes up, and handles the IO before trying to take the semaphore and being put back to sleep.

The routine the interrupt vector points at is the "first level interrupt handler". The thread that the OS later wakes up to handle the rest of the work is the "second level interrupt handler".


  • minimum latency -- instead of all other interrupts being disabled until that interrupt is completely handled, interrupts are turned back on (at the end of the first level interrupt handler) as soon as possible.
  • Does not put off the main thread
  • Allows you to define the importance of IO by changing the priority of the thread
  • Very efficient- only makes context changes when needed and does not poll.
  • Very clean solution architecturally, allows you to be very flexible in how you handle IO.
  • The second level interrupt handler can wait for a lock to be released (Embedded Systems/Locks and Critical Sections).


  • Requires an OS capable of threading
  • Most complex solution

This solution is the most flexible, and one of the most efficient. It also minimizes the risk of starving more important tasks. Its probably the most common method used today.

DMA (Direct Memory Access) ControllerEdit

In some specialised situations, such as where a set of data must be transferred to a communications IO device, a DMA controller may be present that can automatically detect when the IO device is ready for more data, and transfer that data. This technique may be used in conjunction with many of the other techniques, for instance an interrupt may be used when the data transfer is complete.


  • This provides the best performance, since the I/O can happen in parallel with other code execution


  • Only applicable to a limited range of problems
  • Not all systems have DMA controllers. This is especially true of the more basic 8-bit microcontrollers.
  • Parallel nature may complicate a system