Software applications are designed to deliver a specified set of functions. Performance considerations are often added to the design later.
Various techniques can help software run faster. However, an improvement in speed can also increase program size, so the goal is to write code in a way that optimizes both memory use and speed. There are different ways to collect performance data, but the best is to use a good performance profiler, which shows the time spent in each function and provides analysis of the data. In this article, we use an image recognition program as the example; the same techniques apply to any other signal processing code.
Once we have the data, the next step is to analyze it and identify the routines that take longer than desired to execute. These areas are known as “hotspots.” Before starting optimization, review the code and check whether a better algorithm is available; choosing one is preferable to tuning the existing code for performance.
Goals of Code Optimization
- Remove redundant code without changing the meaning of the program.
- Reduce execution time.
- Reduce memory consumption.
A few points to keep in mind before optimizing the code:
- Time-based optimization gives faster output, but it can lead to a bigger code base. Optimizing for execution time may therefore conflict with memory and code size goals. You have to find the balance between time and memory that suits your requirements.
- Performance optimization is a never-ending process. There is always room to improve and make your code run faster.
- Sometimes we are tempted to use programming methods that run faster at the expense of best practices such as coding standards. Try to avoid such methods.
Categories of Optimization
You can optimize your code for either space or time. The two categories are related, and a gain in one often affects the other.
Space Optimization Techniques
1. Data Types Usage
Using the correct data type is important in recursive calculations and large array processing. A smaller data type uses less memory and occupies less CPU cache, which can improve speed by reducing the number of CPU cycles spent on memory access. On some processors, operations on smaller data types also require fewer CPU cycles.
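As a rough illustration of the memory effect, the sketch below stores the same grayscale image with two element types. The 640x480 image size and the function names are assumptions made for this example, and whether the narrower type is also faster depends on the target CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical image dimensions, chosen only for illustration. */
#define IMG_WIDTH  640
#define IMG_HEIGHT 480
#define IMG_PIXELS ((size_t)IMG_WIDTH * IMG_HEIGHT)

/* Bytes needed when each 8-bit pixel is stored in a 32-bit integer. */
size_t image_bytes_wide(void)
{
    return IMG_PIXELS * sizeof(int32_t);
}

/* Bytes needed when each pixel uses the smallest type that fits 0..255. */
size_t image_bytes_narrow(void)
{
    return IMG_PIXELS * sizeof(uint8_t);
}

/* The buffer stays narrow, but the accumulator is wide enough
   that the sum of many 8-bit pixels does not overflow. */
uint32_t sum_pixels(const uint8_t *pixels, size_t count)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += pixels[i];
    }
    return sum;
}
```

The narrow buffer needs a quarter of the memory of the wide one, so more of the image fits in cache at once. Only the storage type is narrowed here; intermediate results still use a type wide enough for the calculation.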
2. Data Alignment: Arrangement and Packing
The order in which members are declared in a structure determines how they are stored. Because of memory alignment, the compiler may insert unused padding bytes inside the structure. To keep padding to a minimum, it is recommended to group members of similar size together.
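The effect of member order can be sketched with two structures that hold the same fields. The structure and member names are hypothetical, and the exact sizes depend on the compiler and target ABI; on many common targets the first layout occupies 12 bytes and the second 8.

```c
#include <stddef.h>
#include <stdint.h>

/* Mixed-size order: the compiler typically pads after 'flag' and
   after 'mode' so that the wider members stay aligned. */
struct sensor_mixed {
    uint8_t  flag;
    uint32_t timestamp;
    uint8_t  mode;
    uint16_t reading;
};

/* Same members grouped by size, largest first: usually no padding. */
struct sensor_grouped {
    uint32_t timestamp;
    uint16_t reading;
    uint8_t  flag;
    uint8_t  mode;
};

size_t mixed_size(void)
{
    return sizeof(struct sensor_mixed);
}

size_t grouped_size(void)
{
    return sizeof(struct sensor_grouped);
}
```

Comparing `sizeof` for both layouts on the actual target is the reliable way to confirm the saving, since padding rules are implementation-defined.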
3. Pass by Reference
When we pass a large structure or class to a function by value, a copy of the argument is made. In most cases, this copy is not required and it may create performance issues. Also, when we pass arguments by value, the only way to return a value to the caller is via the function’s return value. It is often more suitable to pass by reference and modify the argument to give the value back to the calling function.
With pass by value, a large number of arguments may increase processing time and RAM usage because of the pushing and popping performed on each function call. Passing a large structure or class in this way is especially slow.
Pass by reference allows us to pass large structures and classes with a minimal performance penalty and minimal RAM usage.
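In C, pass by reference is expressed by passing a pointer. The following minimal sketch contrasts the two calling styles; the `frame_t` type, its size, and the function names are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAME_PIXELS 1024  /* hypothetical frame size */

typedef struct {
    uint8_t  pixels[FRAME_PIXELS];
    uint32_t width;
    uint32_t height;
} frame_t;

/* Pass by value: the whole frame_t is copied on every call. */
uint32_t brightness_by_value(frame_t frame)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < FRAME_PIXELS; i++) {
        sum += frame.pixels[i];
    }
    return sum;
}

/* Pass by reference: only a pointer is passed. 'const' documents
   that the function does not modify the caller's frame. */
uint32_t brightness_by_ref(const frame_t *frame)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < FRAME_PIXELS; i++) {
        sum += frame->pixels[i];
    }
    return sum;
}

/* A non-const pointer lets the function hand its result back
   by modifying the caller's object in place. */
void invert_frame(frame_t *frame)
{
    for (size_t i = 0; i < FRAME_PIXELS; i++) {
        frame->pixels[i] = (uint8_t)(255u - frame->pixels[i]);
    }
}
```

Both brightness functions return the same sum; the difference is that the by-value version copies more than 1 KB of data per call, while the pointer version copies only an address.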
4. Return Value
The return value of a function is stored in a register. If the returned data is never used, time and space are wasted in storing it. In that case, the programmer should declare the function as “void” to avoid the extra handling.
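A minimal sketch of this point follows, using a hypothetical event counter. The function names are assumptions, and an optimizing compiler may remove the unused return value on its own, so the saving should be confirmed on the target.

```c
#include <stdint.h>

static uint32_t event_count = 0;

/* Returns a status that callers never inspect: the value is
   still produced and placed in the return register. */
int log_event_with_status(void)
{
    event_count++;
    return 0;
}

/* Declared void: there is no return value to produce or discard. */
void log_event(void)
{
    event_count++;
}

/* Accessor used to observe the counter. */
uint32_t get_event_count(void)
{
    return event_count;
}
```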
Time Optimization Techniques
1. Optimize Program Algorithm
For any code, always set aside some time to think about the right algorithm to use. The first task is to select and improve the algorithms that are used most frequently in the code.
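As one illustration of how much the choice of algorithm matters, the sketch below compares a linear search with a binary search over a sorted array. This example is ours rather than taken from the white paper, and the function names are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Linear search: up to 'count' comparisons. Returns index or -1. */
int linear_search(const int32_t *data, size_t count, int32_t key)
{
    for (size_t i = 0; i < count; i++) {
        if (data[i] == key) {
            return (int)i;
        }
    }
    return -1;
}

/* Binary search: about log2(count) comparisons, but the input
   must be sorted in ascending order. Returns index or -1. */
int binary_search(const int32_t *sorted, size_t count, int32_t key)
{
    size_t lo = 0;
    size_t hi = count;              /* search the range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (sorted[mid] == key) {
            return (int)mid;
        }
        if (sorted[mid] < key) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return -1;
}
```

For a table of 1,000 sorted entries, the linear version may need up to 1,000 comparisons while the binary version needs about 10. No amount of tuning inside the linear loop closes that gap.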
2. Avoid Type Conversion
Whenever possible, use variables of the same type in a calculation. Avoid type conversion where you can; otherwise, extra machine cycles are spent converting from one type to another.
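The sketch below scales pixel values to 75% in two ways: once with a floating-point factor, which forces an integer-to-float and float-to-integer conversion per pixel, and once with integer arithmetic only. The scale factor and function names are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Mixed types: each pixel is converted to float, multiplied,
   then converted back to an integer. */
void scale_mixed(uint8_t *pixels, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        pixels[i] = (uint8_t)(pixels[i] * 0.75f);
    }
}

/* Integer only: 0.75 is expressed as 3/4, so no conversion to or
   from floating point takes place. */
void scale_integer(uint8_t *pixels, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        pixels[i] = (uint8_t)((pixels[i] * 3u) / 4u);
    }
}
```

Both versions truncate toward zero and give the same result for every 8-bit input. The integer version is especially attractive on processors without a floating-point unit, where each conversion is a library call.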
3. Loop Related Optimization
If you find that a loop executes thousands of times and takes most of the execution time, the best approach is to redesign the code so that the loop runs fewer times. This is more effective than making the loop itself run faster. (The full white paper gives examples of various techniques that can be used to optimize loops.)
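One way to reduce the loop execution count is to merge separate passes over the same data into a single pass. The sketch below finds the minimum and maximum pixel value first with two loops and then with one; it is our own illustration, and the `range_t` type and function names are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t min;
    uint8_t max;
} range_t;

/* Two passes: the data is traversed twice (2 * count iterations). */
range_t range_two_passes(const uint8_t *pixels, size_t count)
{
    range_t r = { UINT8_MAX, 0 };
    for (size_t i = 0; i < count; i++) {
        if (pixels[i] < r.min) {
            r.min = pixels[i];
        }
    }
    for (size_t i = 0; i < count; i++) {
        if (pixels[i] > r.max) {
            r.max = pixels[i];
        }
    }
    return r;
}

/* One pass: both results come from a single traversal
   (count iterations). */
range_t range_one_pass(const uint8_t *pixels, size_t count)
{
    range_t r = { UINT8_MAX, 0 };
    for (size_t i = 0; i < count; i++) {
        if (pixels[i] < r.min) {
            r.min = pixels[i];
        }
        if (pixels[i] > r.max) {
            r.max = pixels[i];
        }
    }
    return r;
}
```

The merged version halves the number of iterations and reads each pixel from memory only once, which matters most when the buffer is larger than the cache.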
4. Use Lookup Table
A lookup table (LUT) is an array that holds a set of precomputed results for a given operation, so that a result can be read faster than it can be recomputed each time. LUTs can accelerate operations that can be expressed as functions of an integer argument: the program obtains the result immediately instead of repeating the same time-consuming operation.
However, we have to be careful before deciding to use LUTs to solve a particular problem. We should carefully evaluate the cost associated with their use, in particular the memory space required to store the precomputed results.
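A minimal LUT sketch for 8-bit image data follows. The contrast curve used here (`v * v / 255`) is only a stand-in for a more expensive per-pixel operation, and all names are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* 256 precomputed results, one for every possible 8-bit input. */
static uint8_t curve_lut[256];

/* Stand-in for a costly per-pixel operation. */
uint8_t contrast_curve(uint8_t v)
{
    return (uint8_t)(((uint32_t)v * v) / 255u);
}

/* Fill the table once, for example at start-up. */
void curve_lut_init(void)
{
    for (unsigned i = 0; i < 256u; i++) {
        curve_lut[i] = contrast_curve((uint8_t)i);
    }
}

/* Apply the curve with one array read per pixel instead of a
   multiply and a divide. curve_lut_init() must be called first. */
void apply_curve_lut(uint8_t *pixels, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        pixels[i] = curve_lut[pixels[i]];
    }
}
```

The table costs 256 bytes of RAM here. For a 16-bit input the same approach would need 65,536 entries, which is the kind of memory cost that has to be weighed against the time saved.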
Detailed examples of code optimization, with execution time comparisons between the optimized code and the original code, are available in our complete white paper.
Rudrik Upadhyay is a Senior Embedded Engineer, Julsi Nagarbandhara is an Embedded Engineer, and Samir Bhatt is a Senior Technical Lead at eInfochips.