Cache-oblivious Algorithm

Examples

For example, it is possible to design a variant of unrolled linked lists which is cache-oblivious and allows list traversal of elements in time, where is the cache size in elements. For a fixed, this is time. However, the advantage of the algorithm is that it can scale to take advantage of larger cache line sizes (larger values of ).

The simplest cache-oblivious algorithm presented in Frigo et al. is an out-of-place matrix transpose operation (in-place algorithms have also been devised for transposition, but are much more complicated for non-square matrices). Given m×n array A and n×m array B, we would like to store the transpose of in . The naive solution traverses one array in row-major order and another in column-major. The result is that when the matrices are large, we get a cache miss on every step of the column-wise traversal. The total number of cache misses is .

The cache-oblivious algorithm has optimal work complexity and optimal cache complexity . The basic idea is to reduce the transpose of two large matrices into the transpose of small (sub)matrices. We do this by dividing the matrices in half along their larger dimension until we just have to perform the transpose of a matrix that will fit into the cache. Because the cache size is not known to the algorithm, the matrices will continue to be divided recursively even after this point, but these further subdivisions will be in cache. Once the dimensions and are small enough so an input array of size and an output array of size fit into the cache, both row-major and column-major traversals result in work and cache misses. By using this divide and conquer approach we can achieve the same level of complexity for the overall matrix.

(In principle, one could continue dividing the matrices until a base case of size 1×1 is reached, but in practice one uses a larger base case (e.g. 16×16) in order to amortize the overhead of the recursive subroutine calls.)

Most cache-oblivious algorithms rely on a divide-and-conquer approach. They reduce the problem, so that it eventually fits in cache no matter how small the cache is, and end the recursion at some small size determined by the function-call overhead and similar cache-unrelated optimizations, and then use some cache-efficient access pattern to merge the results of these small, solved problems.

Read more about this topic: Cache-oblivious Algorithm

Famous quotes containing the word examples:

“There are many examples of women that have excelled in learning, and even in war, but this is no reason we should bring ‘em all up to Latin and Greek or else military discipline, instead of needle-work and housewifry.”
—Bernard Mandeville (1670–1733)

“Histories are more full of examples of the fidelity of dogs than of friends.”
—Alexander Pope (1688–1744)

“It is hardly to be believed how spiritual reflections when mixed with a little physics can hold people’s attention and give them a livelier idea of God than do the often ill-applied examples of his wrath.”
—G.C. (Georg Christoph)

Cache-oblivious Algorithm - Examples

Famous quotes containing the word examples: