Inner Loop

In computer programs, an important form of control flow is the loop. For example, this small pseudo-code program uses two nested loops to iterate over all the entries of an n×n matrix, changing their values so that the matrix becomes an identity matrix:

    for a in 1..n
        for b in 1..n
            if a = b
                M[a,b] := 1
            else
                M[a,b] := 0

However, note that the comparison a = b is made n² times, which is a lot if n is large. We can do better:

    for a in 1..n
        for b in 1..n
            M[a,b] := 0
        M[a,a] := 1

At first glance, one might think that the second variant of the algorithm is slower than the first, since it changes the value of some of the entries twice. But the number of extra changes is only n, and the number of comparisons that don't have to be done is n²; clearly, for large enough values of n, the second algorithm will be faster no matter the relative cost of comparisons and assignments, since we do less work in the innermost loop.
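The operation counts can be checked with a small Python sketch. The function names and the counters are illustrative assumptions, not part of the pseudo-code above; indices run from 0 rather than 1, and only the a = b comparisons are counted, not the loop-bound checks.

```python
def identity_with_branch(n):
    """First variant: compare a == b for every entry."""
    m = [[None] * n for _ in range(n)]
    comparisons = assignments = 0
    for a in range(n):
        for b in range(n):
            comparisons += 1          # the a = b test in the innermost loop
            if a == b:
                m[a][b] = 1
            else:
                m[a][b] = 0
            assignments += 1
    return m, comparisons, assignments


def identity_without_branch(n):
    """Second variant: zero each row, then overwrite the diagonal entry."""
    m = [[None] * n for _ in range(n)]
    comparisons = assignments = 0
    for a in range(n):
        for b in range(n):
            m[a][b] = 0
            assignments += 1
        m[a][a] = 1                   # one extra assignment per row
        assignments += 1
    return m, comparisons, assignments


if __name__ == "__main__":
    n = 100
    m1, c1, s1 = identity_with_branch(n)
    m2, c2, s2 = identity_without_branch(n)
    assert m1 == m2                   # both variants build the same matrix
    print(c1, s1)  # 10000 10000
    print(c2, s2)  # 0 10100
```

For n = 100 the second variant performs 100 extra assignments but avoids 10000 comparisons, which matches the n versus n² argument.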

Here's a second example:

    for a in 1..10000
        do_something_A
        for b in 1..10000
            do_something_B

Assume that do_something_A takes 100 μs to run, and do_something_B takes 1 μs. The entire program then takes 10000 × 100 μs + 10000² × 1 μs = 101 s. We will spend one day optimizing this program, and during that day we can either make do_something_A 50 times faster, or do_something_B 10% faster. Which should we choose? Well, the first option will bring down the total execution time to 10000 × 2 μs + 10000² × 1 μs = 100.02 s, and the second option (taking "10% faster" to mean 0.9 μs per call) will make it 10000 × 100 μs + 10000² × 0.9 μs = 91 s. Clearly, optimizing the innermost loop is the better choice. But what if we could make do_something_A 500 times faster? Or 5000? The answer is still the same, because of those initial 101 seconds, 100 seconds are spent in do_something_B, and just one second in do_something_A. Even if we could make do_something_A take no time at all, the program would still take 100 s, so making do_something_B 10% faster would still be the better choice!
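The arithmetic can be reproduced with a short Python sketch. The helper total_ns and the scenario names are hypothetical, times are kept in integer nanoseconds to avoid rounding, and "10% faster" is assumed to mean that each call takes 0.9 μs.

```python
N = 10_000  # iterations of each loop, as in the example


def total_ns(a_ns, b_ns, n=N):
    """Total run time in nanoseconds: A runs n times, B runs n * n times."""
    return n * a_ns + n * n * b_ns


A_NS = 100_000  # do_something_A: 100 μs per call
B_NS = 1_000    # do_something_B: 1 μs per call

scenarios = {
    "baseline": total_ns(A_NS, B_NS),
    "A 50 times faster": total_ns(A_NS // 50, B_NS),
    "B 10% faster": total_ns(A_NS, B_NS * 9 // 10),  # assumed: 0.9 μs per call
    "A takes no time": total_ns(0, B_NS),
}

for name, ns in scenarios.items():
    print(f"{name}: {ns / 1e9:.2f} s")
# baseline: 101.00 s
# A 50 times faster: 100.02 s
# B 10% faster: 91.00 s
# A takes no time: 100.00 s
```

Because do_something_B runs 10000 times as often as do_something_A, even eliminating do_something_A entirely saves only 1 s out of 101 s.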

So: since almost all of the program's time is spent in the innermost loops, optimizations there will have a big effect on the total time it takes to run the program. In contrast, optimizing anything but the innermost loops is often a waste of the programmer's time, since it speeds up a part of the program that never took much time to begin with.