Journal of Systems Architecture, pages 651-679, Vol. 45, 1999.
Accurate instruction fetch and branch prediction is increasingly important on today's superscalar architectures. Fetch prediction is the process of determining the next instruction to request from the memory subsystem. Branch prediction is the process of predicting the likely out-come of branch instructions. A branch target buffer (BTB) is often used to provide target addresses for taken branches and to predict the destination of indirect jumps. Using a BTB avoids the delay needed to recalculate the destination address and reduces the misfetch penalty. However, an effective branch target buffer can be large and can possibly increase the cycle time of a processor.
We propose that a design used in older computers, such as the PDP-8, be used in modern architectures instead of a BTB design. The compiler would pre-compute the branch destination for most branch instructions, allowing the branch information to be stored with the instruction. We consider computing branch destinations at link time and as instructions are fetched into the instruction cache; both alternatives offer similar performance with different advantages. A very small branch target buffer is still useful to predict indirect branches, which can not be pre-computed. Our results show that the Precomputed-Branch architecture performs better than an architecture using only a branch target buffer, and has significant hardware savings. This is particularly true for larger programs more representative of modern applications.