Asynchronous I/O - Implementation

Implementation

The vast majority of general-purpose computing hardware relies entirely upon two methods of implementing asynchronous I/O: polling and interrupts. Usually both methods are used together, the balance depends heavily upon the design of the hardware and its required performance characteristics. (DMA is not itself another independent method, it is merely a means by which more work can be done per poll or interrupt.)

Pure polling systems are entirely possible, small microcontrollers (such as systems using the PIC) are often built this way. CP/M systems could also be built this way (though rarely were), with or without DMA. Also, when the utmost performance is necessary for only a few tasks, at the expense of any other potential tasks, polling may also be appropriate as the overhead of taking interrupts may be unwelcome. (Servicing an interrupt requires time to save at least part of the processor state, along with the time required to resume the interrupted task.)

Most general-purpose computing systems rely heavily upon interrupts. A pure interrupt system may be possible, though usually some component of polling is also required, as it is very common for multiple potential sources of interrupts to share a common interrupt signal line, in which case polling is used within the device driver to resolve the actual source. (This resolution time also contributes to an interrupt system's performance penalty. Over the years a great deal of work has been done to try to minimize the overhead associated with servicing an interrupt. Current interrupt systems are rather lackadaisical when compared to some highly-tuned earlier ones, but the general increase in hardware performance has greatly mitigated this.)

Hybrid approaches are also possible, wherein an interrupt can trigger the beginning of some burst of asynchronous I/O, and polling is used within the burst itself. This technique is common in high-speed device drivers, such as network or disk, where the time lost in returning to the pre-interrupt task is greater than the time until the next required servicing. (Common I/O hardware in use these days relies heavily upon DMA and large data buffers to make up for a relatively poorly-performing interrupt system. These characteristically use polling inside the driver loops, and can exhibit tremendous throughput. Ideally the per-datum polls are always successful, or at most repeated a small number of times.)

At one time this sort of hybrid approach was common in disk and network drivers where there was not DMA or significant buffering available. Because the desired transfer speeds were faster even than could tolerate the minimum four-operation per-datum loop (bit-test, conditional-branch-to-self, fetch, and store), the hardware would often be built with automatic wait state generation on the I/O device, pushing the data ready poll out of software and onto the processor's fetch or store hardware and reducing the programmed loop to two operations. (In effect using the processor itself as a DMA engine.) The 6502 processor offered an unusual means to provide a three-element per-datum loop, as it had a hardware pin that, when asserted, would cause the processor's Overflow bit to be set directly. (Obviously one would have to take great care in the hardware design to avoid overriding the Overflow bit outside of the device driver!)

Read more about this topic: Asynchronous I/O