2nd International Symposium on High Performance Computer Architecture,
pages 244-253, February 1996
Traditionally, set-associative caches are implemented by comparing all blocks in a cache set in parallel for each reference and then selecting the desired block from the set. By providing more than one location for holding the data for a particular memory address, set associativity reduces the cache miss rate for most programs. The traditional solution is, however, not without cost. As contrasted with direct-mapped caches, values from a set-associative cache can not be used by the processor, even speculatively, until all the comparisons complete. This results in a greater access time than comparably sized direct-mapped caches. Fortunately, however, in most programs, cache references are to the most recently used items. A number of researchers have proposed cache designs that exploit this locality to reduce access time and miss rate.
In this paper, we propose a cache design that provides the same miss rate as a two-way set associative cache, but with a access time closer to a direct-mapped cache. As with other designs, a traditional direct-mapped cache is conceptually partitioned into multiple banks, and the blocks in each set are probed, or examined, sequentially. Other designs either probe the set in a fixed order or add extra delay in the access path for all accesses. We use prediction sources to guide the cache examination, reducing the amount of searching and thus the average access latency. A variety of accurate prediction sources are considered, with some being available in early pipeline stages. We feel that our design offers the same or better performance and is easier to implement that previous designs.