Asynchronous Circuit - Asynchronous CPU

Asynchronous CPU

Asynchronous CPUs are one of several ideas for radically changing CPU design.

Unlike a conventional processor, a clockless processor (asynchronous CPU) has no central clock to coordinate the progress of data through the pipeline. Instead, the stages of the CPU are coordinated by logic devices called "pipeline controls" or "FIFO sequencers". The pipeline controller triggers the next stage of logic when the current stage is complete, so a central clock is unnecessary. It may even be easier to implement high-performance devices in asynchronous logic than in clocked logic:

  • components can run at different speeds on an asynchronous CPU, whereas all major components of a clocked CPU must remain synchronized with the central clock;
  • a traditional CPU cannot "go faster" than the expected worst-case performance of the slowest stage, instruction, or component. When an asynchronous CPU completes an operation more quickly than anticipated, the next stage can immediately begin processing the results rather than waiting for synchronization with a central clock. An operation might finish faster than usual because of attributes of the data being processed (e.g., multiplication can be very fast when multiplying by 0 or 1, even when running code produced by a naive compiler), or because of a higher voltage or bus speed setting, or a lower ambient temperature, than expected.
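
The data-dependent case above can be illustrated with a small software sketch. It is not taken from any of the processors discussed here: the function name `shift_add_multiply` and the step count it reports are hypothetical, and the loop only models the idea of a multiplier that signals completion as soon as no multiplier bits remain, instead of always taking a fixed worst-case number of steps.

```python
from typing import Tuple


def shift_add_multiply(a: int, b: int) -> Tuple[int, int]:
    """Multiply two non-negative integers by shift-and-add.

    Returns (product, steps). The step count stands in for the time a
    self-timed multiplier would take before raising its "done" signal.
    """
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative operands only")
    product = 0
    steps = 0
    while b:              # finish as soon as no multiplier bits remain
        if b & 1:
            product += a  # add the shifted multiplicand for a set bit
        a <<= 1
        b >>= 1
        steps += 1
    return product, steps


if __name__ == "__main__":
    # A clocked 16-bit design would budget 16 steps for every operand.
    for multiplier in (0, 1, 6, 0xFFFF):
        product, steps = shift_add_multiply(1234, multiplier)
        print(f"1234 * {multiplier} = {product} after {steps} step(s)")
```

In this model, multiplying by 0 takes no steps and multiplying by 1 takes one, while a 16-bit multiplier with its top bit set takes all 16. A clocked design would have to allow for the 16-step case every time.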

Proponents of asynchronous logic believe these capabilities would bring the following benefits:

  • lower power dissipation for a given performance level, and
  • highest possible execution speeds.
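
The speed argument can be made concrete with a toy timing model. The sketch below is an illustration under stated assumptions, not a description of any real design: the function names are hypothetical, handshake overhead is ignored, and each stage is assumed to hold its result until the following stage starts working on it. The clocked pipeline advances every item once per clock period, which must cover the worst-case stage delay; the self-timed pipeline lets each stage start as soon as its input is ready and its previous result has been taken.

```python
from typing import List


def clocked_time(delays: List[List[float]], worst_case: float) -> float:
    """Total time for a clocked pipeline: every step lasts one worst-case period."""
    n_items = len(delays)
    n_stages = len(delays[0])
    return (n_items + n_stages - 1) * worst_case


def self_timed_time(delays: List[List[float]]) -> float:
    """Total time for a self-timed pipeline.

    delays[i][s] is the actual time stage s needs for item i.
    """
    n_stages = len(delays[0])
    prev_start = [0.0] * n_stages  # when each stage started the previous item
    prev_done = [0.0] * n_stages   # when each stage finished the previous item
    for item in delays:
        start: List[float] = []
        done: List[float] = []
        for s, d in enumerate(item):
            # Input is ready once the preceding stage has finished this item.
            data_ready = done[s - 1] if s > 0 else 0.0
            # The stage is free once the next stage has taken its last result
            # (the final stage is free as soon as it has finished).
            if s + 1 < n_stages:
                stage_free = prev_start[s + 1]
            else:
                stage_free = prev_done[s]
            t = max(data_ready, stage_free)
            start.append(t)
            done.append(t + d)
        prev_start, prev_done = start, done
    return prev_done[-1]


if __name__ == "__main__":
    # Hypothetical delays: most operations finish well under the worst case of 2.0.
    delays = [[1.0, 2.0], [1.0, 1.0], [0.5, 1.0]]
    print("clocked:   ", clocked_time(delays, worst_case=2.0))
    print("self-timed:", self_timed_time(delays))
```

When every operation takes exactly the worst-case time, the two totals coincide in this model; the self-timed pipeline only gains when some operations finish early.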

The biggest disadvantage of the clockless CPU is that most CPU design tools assume a clocked CPU (i.e., a synchronous circuit). Many tools "enforce synchronous design practices". Designing a clockless CPU (an asynchronous circuit) therefore involves modifying the design tools to handle clockless logic and doing extra testing to ensure that the design avoids metastability problems. The group that designed the AMULET, for example, developed a tool called LARD to cope with the complex design of AMULET3.

Despite this difficulty, numerous asynchronous CPUs have been built, including:

  • the ORDVAC and the (identical) ILLIAC I (1951);
  • the Johnniac (1953);
  • the WEIZAC (1955);
  • the ILLIAC II (1962);
  • the Atlas, built at the Victoria University of Manchester;
  • the Honeywell 6180 (1972) and Series 60 Level 68 (1981) CPUs, on which Multics ran asynchronously;
  • the Caltech Asynchronous Microprocessor, the world's first asynchronous microprocessor (1988);
  • the ARM-implementing AMULET (1993 and 2000);
  • the asynchronous implementation of the MIPS R3000, dubbed MiniMIPS (1998);
  • several versions of the XAP processor, which experimented with different asynchronous design styles: a bundled-data XAP, a 1-of-4 XAP, and a 1-of-2 (dual-rail) XAP (2003?);
  • an ARM-compatible processor (2003?) designed by Z. C. Yu, S. B. Furber, and L. A. Plana, "designed specifically to explore the benefits of asynchronous design for security sensitive applications";
  • the "Network-based Asynchronous Architecture" processor (2005), which executes a subset of the MIPS architecture instruction set;
  • the ARM996HS processor (2006) from Handshake Solutions;
  • the HT80C51 processor (2007?) from Handshake Solutions;
  • the SEAforth multi-core processor (2008) from Charles H. Moore;
  • the GA144 multi-core processor (2010) from Charles H. Moore.
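
The 1-of-2 (dual-rail) and 1-of-4 styles named in the XAP entry above are ways of encoding data so that the arrival of a valid value can be detected from the wires themselves. The following sketch shows the general idea only and does not describe the XAP designs: the rail ordering, the function names, and the use of an all-zero "spacer" between values are assumptions chosen for illustration. A bundled-data design, by contrast, would use ordinary binary wires accompanied by a separate request signal.

```python
from typing import List, Optional, Tuple

SPACER = (0, 0)  # assumed "no data yet" state of a dual-rail pair


def dual_rail_encode(bit: int) -> Tuple[int, int]:
    """Encode one bit as (false_rail, true_rail); exactly one rail is high."""
    return (0, 1) if bit else (1, 0)


def dual_rail_decode(pair: Tuple[int, int]) -> Optional[int]:
    """Return 0 or 1 for a valid pair, or None while the pair is a spacer."""
    if pair == SPACER:
        return None
    if pair == (1, 0):
        return 0
    if pair == (0, 1):
        return 1
    raise ValueError("both rails high is not a legal dual-rail state")


def one_of_four_encode(two_bits: int) -> Tuple[int, ...]:
    """Encode a 2-bit value (0..3) on four wires with exactly one wire high."""
    if not 0 <= two_bits <= 3:
        raise ValueError("a 1-of-4 group carries exactly two bits")
    return tuple(1 if i == two_bits else 0 for i in range(4))


def word_complete(word: List[Tuple[int, int]]) -> bool:
    """Completion detection: every dual-rail pair has exactly one rail high."""
    return all(sum(pair) == 1 for pair in word)


if __name__ == "__main__":
    word = [dual_rail_encode(b) for b in (1, 0, 1)]
    print(word, "complete:", word_complete(word))
    print("1-of-4 code for 2:", one_of_four_encode(2))
```

Both encodings in this sketch use two wires per bit, but a 1-of-4 group raises only one wire for every two bits of data, whereas two dual-rail pairs raise two.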

The ILLIAC II was the first completely asynchronous, speed-independent processor design ever built; it was the most powerful computer of its time.

DEC PDP-16 Register Transfer Modules (ca. 1973) allowed the experimenter to construct asynchronous, 16-bit processing elements. Delays for each module were fixed and based on the module's worst-case timing.

The Caltech Asynchronous Microprocessor (1988) was the first asynchronous microprocessor and the world's first fully quasi-delay-insensitive processor, designed and manufactured at Caltech. During demonstrations, the researchers amazed viewers by loading a simple program that ran in a tight loop, pulsing one of the output lines after each instruction. This output line was connected to an oscilloscope. When a cup of hot coffee was placed on the chip, the pulse rate (the effective "clock rate") naturally slowed down to adapt to the worsening performance of the heated transistors. When liquid nitrogen was poured on the chip, the instruction rate shot up with no additional intervention. Additionally, at lower temperatures, the voltage supplied to the chip could be safely increased, which also improved the instruction rate, again with no additional configuration.

In 2004, Epson manufactured the world's first bendable microprocessor, the ACT11, an 8-bit asynchronous chip. Synchronous flexible processors are slower because bending the material on which a chip is fabricated causes wild and unpredictable variations in transistor delays; a clocked design must assume the worst case everywhere and clock everything at worst-case speed. The processor is intended for use in smart cards, whose chips are currently limited in size to those small enough to remain perfectly rigid.
