Optimizing Software in C++ by Agner Fog - HTML preview


17 Optimization in embedded systems

 

Microcontrollers used in small embedded applications have far fewer computing resources than standard PCs. The clock frequency may be a hundred or even a thousand times lower, and the amount of RAM may be as much as a million times smaller. Nevertheless, it is possible to make software that runs quite fast on such small devices if you avoid the large graphics frameworks, interpreters, just-in-time compilers, system databases, and other extra software layers typically used on bigger systems.

The smaller the system, the more important it is to choose a software framework that uses few resources. On the smallest devices, you don't even have an operating system.

The best performance is obtained by choosing a programming language that can be cross-compiled on a PC and then transferred as machine code to the device. Any language that requires compilation or interpretation on the device itself wastes resources that the device cannot spare. For these reasons, the preferred language will often be C or C++. Critical device drivers may need assembly language.

C++ takes only slightly more resources than C if you follow the guidelines below. You may choose either C or C++ based on what is most appropriate for the desired program structure.

It is important to economize on the use of RAM. Big arrays that are needed only temporarily can be declared inside the function that uses them, so that the memory is released when the function returns. This requires that the stack is large enough to hold them. Alternatively, you may reuse the same array for multiple purposes.

All dynamic memory allocation using new/delete or malloc/free should be avoided because of the large overhead of managing a memory heap. The heap manager has to search for free blocks and merge blocks that have been released, and the heap can become fragmented over time. This housekeeping takes an unpredictable amount of time, which may interfere with real-time applications.

Remember that most container classes in the STL (Standard Template Library) and other container class libraries use dynamic memory allocation with new and delete, and often excessively so. These containers should be avoided unless you have ample resources. For example, a FIFO queue should be implemented as a circular buffer with a fixed size to avoid dynamic memory allocation. Do not use a linked list (see page 93).
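The fixed-size circular buffer mentioned above might look like the following minimal sketch. The class name FifoQueue and its member names are illustrative assumptions, not taken from any library. The storage is a plain array inside the object, so nothing is allocated on the heap.

```cpp
#include <cstddef>

// Fixed-size FIFO queue implemented as a circular buffer.
// N is the capacity, chosen at compile time. All names are illustrative.
template <typename T, std::size_t N>
class FifoQueue {
    static_assert(N > 0, "capacity must be at least 1");
public:
    // Append an item. Returns false and stores nothing if the queue is full.
    bool put(const T& item) {
        if (count_ == N) return false;
        data_[tail_] = item;
        tail_ = (tail_ + 1 == N) ? 0 : tail_ + 1;   // wrap around without division
        ++count_;
        return true;
    }
    // Remove the oldest item into 'out'. Returns false if the queue is empty.
    bool get(T& out) {
        if (count_ == 0) return false;
        out = data_[head_];
        head_ = (head_ + 1 == N) ? 0 : head_ + 1;
        --count_;
        return true;
    }
    std::size_t size() const { return count_; }
private:
    T data_[N];               // storage lives inside the object, not on the heap
    std::size_t head_ = 0;    // index of the oldest item
    std::size_t tail_ = 0;    // index where the next item will be stored
    std::size_t count_ = 0;   // number of items currently stored
};
```

A queue declared as FifoQueue<int, 3> accepts three items and rejects a fourth until one has been removed. The caller decides what to do when put returns false, for example dropping the item or reporting an error.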

All common implementations of string classes use dynamic memory allocation. You should avoid these and handle text strings in the old-fashioned C style as character arrays. Note that C-style string functions such as strcpy and strcat do not check for overflow of the arrays. It is the responsibility of the programmer to make sure the arrays are sufficiently large to hold the strings, including the terminating zero, and to make overflow checks where necessary (see page 96).
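An overflow check of the kind described above can be wrapped in a small helper. The following is a sketch under the assumption that truncating an oversized string is acceptable in the application. The function name copyString is hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// Copy the zero-terminated string 'src' into the fixed-size array 'dest',
// which has room for 'destSize' characters including the terminating zero.
// The result is always zero-terminated.
// Returns true if the whole string fit, false if it was truncated.
bool copyString(char* dest, std::size_t destSize, const char* src) {
    if (destSize == 0) return false;            // no room even for the terminator
    std::size_t len = std::strlen(src);
    bool fits = len < destSize;                 // need len + 1 characters in total
    std::size_t n = fits ? len : destSize - 1;  // number of characters to copy
    std::memcpy(dest, src, n);
    dest[n] = '\0';
    return fits;
}
```

With a buffer declared as char buf[8], copying "abc" succeeds, while copying a ten-character string stores only the first seven characters and returns false so that the caller can react.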

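Virtual functions in C++ take more resources than non-virtual functions: each class with virtual functions needs a table of function pointers, each object needs a pointer to that table, and each call goes through an indirect jump that the compiler usually cannot inline. Avoid virtual functions if possible.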
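One possible way to avoid a virtual function, when the concrete type is known at compile time, is to pass the type as a template parameter. This is only one alternative among several and is offered as a sketch. The types UartOut and NullOut and the function writeString are hypothetical, and the counter in UartOut merely stands in for a write to a hardware register.

```cpp
// Two hypothetical output drivers with the same interface.
// They share no base class and have no virtual functions.
struct UartOut {
    int sent = 0;                          // stands in for a hardware write
    void write(char c) { (void)c; ++sent; }
};

struct NullOut {
    void write(char) {}                    // discards everything
};

// The driver type is a template parameter, so the call to write() is
// resolved at compile time and can be inlined.
template <typename Out>
void writeString(Out& out, const char* s) {
    while (*s != '\0') {
        out.write(*s);
        ++s;
    }
}
```

The cost of this design is that the compiler generates one copy of writeString for each driver type it is used with, which may matter if code memory is very tight.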

Smaller microprocessors have no native floating point execution units. Any floating point operation on such processors is emulated by a floating point library, which is large and very time consuming. Therefore, you should avoid any use of floating point expressions. For example, a = b * 2.5 may be changed to a = b * 5 / 2 (be aware of possible overflow in the intermediate expression b * 5, and note that the integer division truncates the result). Even a single floating point operation in your program may cause a large part of the floating point library to be linked in. If you want a number to be calculated with two decimals, for example, you should multiply it by 100 so that it can be represented as an integer.
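The two integer techniques above can be sketched as follows. The function and type names are illustrative assumptions. The sketch assumes that b fits in 16 bits, so that doing the multiplication in 32 bits rules out overflow in the intermediate product.

```cpp
#include <cstdint>

// Integer replacement for a = b * 2.5.
// b is widened to 32 bits first, so b * 5 cannot overflow for any 16-bit b.
// The division truncates toward zero, so 3 gives 7 rather than 7.5.
std::int32_t timesTwoPointFive(std::int16_t b) {
    return static_cast<std::int32_t>(b) * 5 / 2;
}

// A value with two decimals stored as an integer scaled by 100.
// For example, 12.34 is stored as 1234.
using Centi = std::int32_t;   // hypothetical type name

// Values with the same scale can be added and subtracted directly.
Centi addCenti(Centi x, Centi y) { return x + y; }

// Split a non-negative scaled value into the parts before and after the
// decimal point, for example when printing it.
std::int32_t wholePart(Centi x) { return x / 100; }
std::int32_t fracPart(Centi x)  { return x % 100; }
```

Multiplying two scaled values gives a result scaled by 100 * 100, so such a product has to be divided by 100 once, with the same attention to intermediate overflow as in the first function.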

Integer variables can be 8, 16 or 32 bits (rarely 64). You may save RAM space, if necessary, by using the smallest integer size that doesn't cause overflow in the particular application. The sizes of the built-in integer types such as int and long are not standardized across platforms. See the compiler documentation for the size of each integer type.
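Where the compiler provides the standard header <cstdint>, the fixed-width types it defines are one way to make the chosen sizes explicit. The sketch below assumes such a compiler. The record SensorSample and its fields are hypothetical.

```cpp
#include <climits>
#include <cstdint>

// Check platform assumptions at compile time instead of at run time.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
static_assert(sizeof(std::int16_t) == 2, "int16_t must occupy two bytes");

// Hypothetical record where each field uses the smallest type that can
// hold its range of values in this application.
struct SensorSample {
    std::uint8_t  channel;      // 0 to 255 is enough for a channel number
    std::int16_t  temperature;  // tenths of a degree, -3276.8 to +3276.7
    std::uint32_t timestamp;    // tick counter that needs the full 32 bits
};
```

The compiler may insert padding between the fields to satisfy alignment requirements, so the size of the whole record can be larger than the sum of the field sizes.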

Interrupt service routines and device drivers are particularly critical because they can block the execution of everything else. This normally belongs to the area of system programming, but in applications without an operating system it is the job of the application programmer. When there is no operating system, the system code is not separated from the application code, and there is a higher risk that the programmer forgets that the system code is critical. An interrupt service routine should do as little work as possible. Typically it should save one unit of received data in a static buffer or send data from a buffer. It should never respond to a command or perform any input/output other than what is needed for the specific event it is servicing. A command received by an interrupt should preferably be responded to at a lower priority level, typically in a message loop in the main program. See page 150 for further discussion of system code.
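The division of work between the interrupt service routine and the main program might be sketched as below. All names are hypothetical. How the routine is registered as an interrupt handler, and how the received byte is read from the hardware, depends on the device and compiler and is left out. The sketch also assumes a single interrupt source and that reading or writing one byte is atomic on the target.

```cpp
#include <cstdint>

constexpr std::uint8_t kBufSize = 16;   // hypothetical size, must be a power of 2
static_assert((kBufSize & (kBufSize - 1)) == 0, "kBufSize must be a power of 2");

volatile std::uint8_t rxBuf[kBufSize];  // static receive buffer
volatile std::uint8_t rxHead = 0;       // written only by the interrupt routine
volatile std::uint8_t rxTail = 0;       // written only by the main program

// To be called from the receive interrupt with the byte read from the device.
// It only stores the byte; it does not interpret or respond to it.
void onByteReceived(std::uint8_t byte) {
    std::uint8_t next =
        static_cast<std::uint8_t>((rxHead + 1u) & (kBufSize - 1u));
    if (next != rxTail) {               // buffer full: the byte is dropped
        rxBuf[rxHead] = byte;
        rxHead = next;
    }
}

// To be called from the message loop in the main program.
// Returns true and stores one byte in 'out' if any data is waiting.
bool pollByte(std::uint8_t& out) {
    if (rxTail == rxHead) return false; // nothing received
    out = rxBuf[rxTail];
    rxTail = static_cast<std::uint8_t>((rxTail + 1u) & (kBufSize - 1u));
    return true;
}
```

Because one slot is kept free to tell a full buffer from an empty one, this buffer holds at most kBufSize - 1 bytes. The main program calls pollByte in its loop and does the actual command handling there, at the lower priority level.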

In this chapter, I have described some of the considerations that are particularly important on small devices with limited resources. Most of the advice in the rest of the present manual is also relevant to small devices, but there are some differences due to the design of small microcontrollers:

  • Smaller microcontrollers have no branch prediction (see p. 44). There is no need to take branch prediction into account in the software.
  • Smaller microcontrollers have no cache (see p. 87). There is no need to organize data to optimize caching.
  • Smaller microcontrollers have no out-of-order execution. There is no need to break down dependency chains (see p. 22).