What is the best way of making my program efficient?

Q

✍: Guest

A

By picking good algorithms, implementing them carefully, and making sure that your program isn't doing any extra work. For example, the most microoptimized character-copying loop in the world will be beat by code which avoids having to copy characters at all.
When worrying about efficiency, it's important to keep several things in perspective. First of all, although efficiency is an enormously popular topic, it is not always as important as people tend to think it is. Most of the code in most programs is not time-critical. When code is not time-critical, it is usually more important that it be written clearly and portably than that it be written maximally efficiently. (Remember that computers are very, very fast, and that seemingly ``inefficient'' code may be quite efficiently compilable, and run without apparent delay.)
It is notoriously difficult to predict what the ``hot spots'' in a program will be. When efficiency is a concern, it is important to use profiling software to determine which parts of the program deserve attention. Often, actual computation time is swamped by peripheral tasks such as I/O and memory allocation, which can be sped up by using buffering and caching techniques.
Even for code that is time-critical, one of the least effective optimization techniques is to fuss with the coding details. Many of the ``efficient coding tricks'' which are frequently suggested are performed automatically by even simpleminded compilers. Heavyhanded optimization attempts can make code so bulky that performance is actually degraded, by increasing the number of page faults or by overflowing instruction caches or pipelines. Furthermore, optimization tricks are rarely portable (i.e. they may speed things up on one machine but slow them down on another). In any case, tweaking the coding usually results in at best linear performance improvements; the big payoffs are in better algorithms.
If the performance of your code is so important that you are willing to invest programming time in source-level optimizations, make sure that you are using the best optimizing compiler you can afford. (Compilers, even mediocre ones, can perform optimizations that are impossible at the source level).
When efficiency is truly important, the best algorithm has been chosen, and even the coding details matter, the following suggestions may be useful. (These are mentioned merely to keep followups down; appearance here does not necessarily constitute endorsement by the author. Note that several of these techniques cut both ways, and may make things worse.)
1. Sprinkle the code liberally with register declarations for oft-used variables; place them in inner blocks, if applicable. (On the other hand, most modern compilers ignore register declarations, on the assumption that they can perform register analysis and assignment better than the programmer can.)
2. Check the algorithm carefully. Exploit symmetries where possible to reduce the number of explicit cases.
3. Examine the control flow: make sure that common cases are checked for first, and handled more easily. If one side of an expression involving && or || will usually determine the outcome, make it the left-hand side, if possible. (
4. Use memcpy instead of memmove, if appropriate
5. Use machine- and vendor-specific routines and #pragmas.
6. Manually place common subexpressions in temporary variables. (Good compilers do this for you.)
7. Move critical, inner-loop code out of functions and into macros or in-line functions (and out of the loop, if invariant). If the termination condition of a loop is a complex but loop-invariant expression, precompute it and place it in a temporary variable. (Good compilers do these for you.)
8. Change recursion to iteration, if possible.
9. Unroll small loops.
10. Discover whether while, for, or do/while loops produce the best code under your compiler, and whether incrementing or decrementing the loop control variable works best.
11. Remove goto statements--some compilers can't optimize as well in their presence.
12. Use pointers rather than array subscripts to step through arrays .
13. Reduce precision. (Using float instead of double may result in faster, single-precision arithmetic under an ANSI compiler, though older compilers convert everything to double, so using float can also be slower.) Replace time-consuming trigonometric and logarithmic functions with your own, tailored to the range and precision you need, and perhaps using table lookup. (Be sure to give your versions different names;
14. Cache or precompute tables of frequently-needed values.
15. Use standard library functions in preference to your own. (Sometimes the compiler inlines or specially optimizes its own functions.) On the other hand, if your program's calling patterns are particularly regular, your own special-purpose implementation may be able to beat the library's general-purpose version. (Again, if you do write your own version, give it a different name.)
16. As a last, last resort, hand-code critical routines in assembly language (or hand-tune the compiler's assembly language output). Use asm directives, if possible.
Here are some things not to worry about:

1. 17x. whether i++ is faster than i = i + 1
2. 18x. whether i << 1 (or i > > 1, or i & 1) is faster than i * 2 (respectively i / 2, i % 2).

It is not the intent here to suggest that efficiency can be completely ignored. Most of the time, however, by simply paying attention to good algorithm choices, implementing them cleanly, and avoiding obviously inefficient blunders (i.e. make sure you don't end up with an O(n**3) implementation of an O(n**2) algorithm), perfectly acceptable results can be achieved.

2015-02-11, 1898👍, 0💬