-
* Replace the last sentence of A.3 with just "How much faster would the
machine be without any branch hazards?" Please disregard the answer
in the book -- I don't like the way they solve it, anyway.
-
** Assume the new format only applies to source operands, not destination
operands.
-
P2: Consider the hyper-pipelined processor, with a 30-stage, in-order,
scalar pipeline. Instructions are fetched in the 3rd stage, executed
in the 22nd stage, and memory is accessed in the 25th stage. Branch
target and condition are computed in the 20th stage, in a special branch-execute
stage.
What is the steady-state CPI of the following loop, assuming the loop
is taken many times, perfect branch prediction (i.e., the correct instruction
is always fetched the cycle after the branch), and full forwarding support?
Loop: LW   R5, 100(R2)
      SUB  R1, R3, R5
      ADDI R7, R5, #12
      LW   R8, 0(R7)
      ADD  R2, R2, #1
      ADD  R8, R8, #8
      BNEZ R2, Loop
-
W1: Use the web, or other resources, to find the integer pipeline
length for the following processors: Pentium 4, Pentium III, Itanium,
Alpha 21264.