Computer Performance - Performance Equation

Performance Equation

The total amount of time (t) required to execute a particular benchmark program is

    $t = \frac{N \cdot C}{f}$,

or equivalently

    $P = \frac{I \cdot f}{N}$,

where

  • P = 1/t is "the performance" in terms of time-to-execute
  • N is the number of instructions actually executed (the instruction path length). The code density of the instruction set strongly affects N. The value of N can be determined exactly with an instruction set simulator (if available), or it can be estimated, partly from an estimated or actual frequency distribution of input variables and partly by examining the machine code generated by a high-level language (HLL) compiler. It cannot be determined from the number of lines of HLL source code. N is not affected by other processes running on the same processor. Hardware normally does not keep track of N for executed programs, or at least does not make it easily available, so an accurate value can only be obtained by instruction set simulation, which is rarely practiced.
  • f is the clock frequency in cycles per second.
  • C = 1/I is the average cycles per instruction (CPI) for this benchmark.
  • I = 1/C is the average instructions per cycle (IPC) for this benchmark.
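
As a minimal sketch of how these quantities combine, the following Python snippet evaluates both forms of the equation. The function names and the figures for N, C, and f are hypothetical, chosen only to make the arithmetic easy to follow; they do not describe any real processor or benchmark.

```python
def execution_time(n_instructions: float, cpi: float, clock_hz: float) -> float:
    """Time to execute, t = N * C / f, in seconds."""
    return n_instructions * cpi / clock_hz


def performance(n_instructions: float, cpi: float, clock_hz: float) -> float:
    """Performance, P = I * f / N, where I = 1 / C is the IPC."""
    ipc = 1.0 / cpi
    return ipc * clock_hz / n_instructions


# Hypothetical benchmark figures (assumptions for illustration only).
N = 2_000_000_000  # instructions executed
C = 1.5            # average cycles per instruction
f = 3_000_000_000  # clock frequency in Hz (3 GHz)

t = execution_time(N, C, f)
P = performance(N, C, f)
print(f"t = {t:.3f} s, P = {P:.3f} runs per second")
```

Both functions describe the same relationship, so P multiplied by t should come out as 1 up to floating-point rounding.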

Even on one machine, a different compiler, or the same compiler with different optimization switches, can change both N and C. The benchmark executes faster if the new compiler can improve N or C without making the other worse, but often there is a trade-off between them. Is it better, for example, to use a few complicated instructions that each take a long time to execute, or to use instructions that execute very quickly, although it takes more of them to execute the benchmark? At a fixed clock frequency, the answer depends only on which choice gives the smaller product N·C.
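
The trade-off can be illustrated with a small Python sketch that compares two builds of the same benchmark at one clock frequency. The two builds and all of their numbers are invented for illustration; real values of N and C would have to come from simulation or measurement.

```python
CLOCK_HZ = 2_000_000_000  # hypothetical 2 GHz clock, the same for both builds

# Hypothetical builds of one benchmark: (instructions executed N, average CPI C).
builds = {
    "few complex instructions": (1_000_000_000, 2.4),
    "many simple instructions": (1_500_000_000, 1.2),
}

# t = N * C / f for each build.
times = {name: n * c / CLOCK_HZ for name, (n, c) in builds.items()}

for name, seconds in times.items():
    print(f"{name}: {seconds:.2f} s")

fastest = min(times, key=times.get)
print("faster build:", fastest)
```

With these assumed figures the build that executes 50% more instructions still finishes first, because its CPI is half as large and the product N·C is smaller.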

A CPU designer is often required to implement a particular instruction set, and so cannot change N. Sometimes a designer focuses on improving performance by making significant improvements in f (with techniques such as deeper pipelines and faster caches), while (hopefully) not increasing C too much. This leads to a speed-demon CPU design. Sometimes a designer focuses on improving performance by making significant improvements in C (with techniques such as out-of-order execution, superscalar CPUs, larger caches, caches with improved hit rates, improved branch prediction, speculative execution, etc.), while (hopefully) not sacrificing too much clock frequency. This leads to a brainiac CPU design. For a given instruction set (and therefore fixed N) and semiconductor process, the maximum single-thread performance (1/t) requires a balance between brainiac techniques and speed-demon techniques.
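
Because N is fixed in this setting, the speedup of a new design over an old one reduces to the ratio of clock frequencies divided by the ratio of CPIs. The sketch below applies that to two hypothetical design changes; the helper name and the percentages are assumptions made up for illustration, not measurements of any real CPU.

```python
def speedup(freq_ratio: float, cpi_ratio: float) -> float:
    """Return t_old / t_new for a fixed instruction count N.

    freq_ratio is f_new / f_old and cpi_ratio is C_new / C_old.
    From t = N * C / f, the N terms cancel, leaving freq_ratio / cpi_ratio.
    """
    return freq_ratio / cpi_ratio


# Hypothetical speed-demon change: clock up 30%, but CPI worsens by 15%.
speed_demon = speedup(freq_ratio=1.30, cpi_ratio=1.15)

# Hypothetical brainiac change: CPI improves by 25%, but clock drops 5%.
brainiac = speedup(freq_ratio=0.95, cpi_ratio=0.75)

print(f"speed-demon speedup: {speed_demon:.2f}x")
print(f"brainiac speedup:    {brainiac:.2f}x")
```

Either approach helps only to the extent that its gain in one factor outweighs its loss in the other, which is why the best single-thread result comes from balancing the two.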
