Performance Equation
The total amount of time (t) required to execute a particular benchmark program is
- $t = \frac{N \cdot C}{f}$, or equivalently
- $P = \frac{I \cdot f}{N}$
where
- P = 1/t is "the performance" in terms of time-to-execute
- N is the number of instructions actually executed (the instruction path length). The code density of the instruction set strongly affects N. N can be determined exactly with an instruction set simulator, if one is available, or estimated by examining the machine code generated by an HLL compiler together with an estimated or measured frequency distribution of the input variables. It cannot be determined from the number of lines of HLL source code, and it is not affected by other processes running on the same processor. The significant point is that hardware normally does not keep track of N for executed programs, or at least does not make it easily available, so an accurate value generally requires instruction set simulation, which is rarely practiced.
- f is the clock frequency in cycles per second.
- C = 1/I is the average cycles per instruction (CPI) for this benchmark.
- I = 1/C is the average instructions per cycle (IPC) for this benchmark.
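As a minimal sketch of how these quantities combine, the following Python computes t = N·C/f and P = I·f/N. The function names and all numeric values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def execution_time(n_instructions: float, cpi: float, clock_hz: float) -> float:
    """Return t = N * C / f, in seconds."""
    return n_instructions * cpi / clock_hz


def performance(n_instructions: float, ipc: float, clock_hz: float) -> float:
    """Return P = I * f / N, which equals 1 / t."""
    return ipc * clock_hz / n_instructions


# Hypothetical benchmark run (illustrative values, not measurements).
N = 2_000_000_000   # instructions executed
C = 1.5             # average cycles per instruction (CPI)
F = 3_000_000_000   # clock frequency in cycles per second (3 GHz)

t = execution_time(N, C, F)     # 2e9 * 1.5 / 3e9 = 1.0 second
P = performance(N, 1.0 / C, F)  # I = 1/C, so P is approximately 1 / t

print(f"t = {t} s, P = {P:.3f} runs per second")
```

Both functions express the same relation; which one is more convenient depends on whether CPI or IPC is the figure at hand.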
Even on one machine, a different compiler, or the same compiler with different optimization switches, can change both N and C. The benchmark executes faster if the new compiler improves N or C without making the other worse, but there is often a trade-off between them: is it better, for example, to use a few complicated instructions that each take a long time to execute, or many simple instructions that each execute very quickly?
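The trade-off can be illustrated with a small comparison of two builds of the same benchmark on the same machine. This is only a sketch: the instruction counts, CPI values, and clock frequency are made-up figures, and `execution_time` is a hypothetical helper that applies t = N·C/f.

```python
# All figures below are hypothetical, chosen only to illustrate the trade-off.
CLOCK_HZ = 2_000_000_000  # same machine for both builds, so f is fixed (2 GHz)


def execution_time(n_instructions: float, cpi: float, clock_hz: float = CLOCK_HZ) -> float:
    """Return t = N * C / f, in seconds."""
    return n_instructions * cpi / clock_hz


# Build A: fewer, more complex instructions (smaller N, larger C).
t_a = execution_time(n_instructions=1_000_000_000, cpi=2.0)

# Build B: more, simpler instructions (larger N, smaller C).
t_b = execution_time(n_instructions=1_400_000_000, cpi=1.2)

faster = "A" if t_a < t_b else "B"
print(f"build A: {t_a:.2f} s, build B: {t_b:.2f} s, faster: {faster}")
```

With these assumed numbers build B finishes sooner (about 0.84 s against 1.00 s) even though it executes 40% more instructions, because what matters is the product N·C. Different assumed values could just as easily favor build A.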
A CPU designer is often required to implement a particular instruction set, and so cannot change N. Sometimes a designer focuses on improving performance by making significant improvements in f (with techniques such as deeper pipelines and faster caches), while (hopefully) not increasing C too much, leading to a speed-demon CPU design. Sometimes a designer focuses on improving performance by making significant improvements in C (with techniques such as out-of-order execution, superscalar CPUs, larger caches, caches with improved hit rates, improved branch prediction, speculative execution, etc.), while (hopefully) not sacrificing too much clock frequency, leading to a brainiac CPU design. For a given instruction set (and therefore fixed N) and semiconductor process, the maximum single-thread performance (1/t) requires a balance between brainiac techniques and speed-demon techniques.
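Because N is fixed for a given instruction set, the speedup of a new design over a baseline depends only on how f and C change: from t = N·C/f, the ratio t_base/t_new equals (f_new/f_base) divided by (C_new/C_base). The sketch below applies that ratio to two hypothetical designs; the percentages are invented for illustration and do not describe any real processor.

```python
def relative_speedup(f_ratio: float, c_ratio: float) -> float:
    """Speedup over a baseline design running the same N instructions.

    f_ratio is f_new / f_base and c_ratio is C_new / C_base.
    Since t = N * C / f, t_base / t_new = f_ratio / c_ratio.
    """
    return f_ratio / c_ratio


# Hypothetical speed-demon: clock raised by 50%, but CPI worsens by 20%.
speed_demon = relative_speedup(f_ratio=1.5, c_ratio=1.2)

# Hypothetical brainiac: CPI improved by 30%, but clock drops by 10%.
brainiac = relative_speedup(f_ratio=0.9, c_ratio=0.7)

print(f"speed-demon speedup: {speed_demon:.2f}x")
print(f"brainiac speedup:    {brainiac:.2f}x")
```

Under these assumptions the two approaches land close together (roughly 1.25x and 1.29x), which is consistent with the point that neither clock frequency nor CPI can be pushed in isolation.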