April 24 | We found that comparisons with a constant will end up using the slti and/or sltiu instructions. Therefore, we are adding them to the list of instructions below. Since this mistake was detected early, we hope it's not too much trouble to require you to implement it, but if you feel otherwise, let us know and we'll try to work something out. |
May 2 | There was a bug with the ALU that's fixed now. Please download it again. |
Due: May 06
At the end of the lab, you will be asked to compile two different versions of a program: one without compiler optimizations and one with compiler optimizations. Unfortunately, we found out that the compiler uses shifts, which were not part of the original ALU design provided. We have thus augmented the ALU to include shift functionality, the source code for which can be found here. The new documentation for the ALU is shown below.
module alu( input [5:0] Func_in, input [31:0] A_in, input [31:0] B_in, output [31:0] O_out, output Branch_out, output Jump_out ); |
Func_in | Operation | O_out value | Branch_out | Jump_out |
000X00 | Shift Left Logical | (B << A) | 0 | 0 |
000X10 | Shift Right Logical | (B >> A) | 0 | 0 |
000X11 | Shift Right Arithmetic | (B >>> A) | 0 | 0 |
100000 | ADD | A+B | 0 | 0 |
100001 | ADD | A+B | 0 | 0 |
100010 | SUB | A-B | 0 | 0 |
100011 | SUB | A-B | 0 | 0 |
100100 | AND | A AND B | 0 | 0 |
100101 | OR | A OR B | 0 | 0 |
100110 | XOR | A XOR B | 0 | 0 |
100111 | NOR | A NOR B | 0 | 0 |
101000 | Set-Less-Than Signed | (signed(A) < signed(B)) | 0 | 0 |
101001 | Set-Less-Than Unsigned | (A < B) | 0 | 0 |
111000 | Branch Less Than Zero | A | (A < 0) | 0 |
111001 | Branch Greater Than or Equal to Zero | A | (A >= 0) | 0 |
111010 | Jump | A | 0 | 1 |
111011 | Jump | A | 0 | 1 |
111100 | Branch Equal | A | (A == B) | 0 |
111101 | Branch Not Equal | A | (A != B) | 0 |
111110 | Branch Less Than or Equal to Zero | A | (A <= 0) | 0 |
111111 | Branch Greater Than Zero | A | (A > 0) | 0 |
The programs that you will generate for Lab 4 will be too large to fit in a 1K instruction ROM. Therefore, please modify your instruction ROM to have an address width of 10 bits. For example, in my processor.v module, I instantiated the instruction ROM like this:
// Instruction ROM InstantiationIf you get errors about a memh file being too large, this is probably your problem. Let us know if you have any problems with this.
For those of you who have not quite gotten Lab 3 to work, we will post an example version by Wednesday after 3pm (after the late penalty deadline).
In Lab 4, you will finalize the design of your single-cycle MIPS processor. Specifically, you will be adding support for branch and jump instructions. In addition, you will add support for a sufficient subset of the MIPS ISA to enable you to run many common applications. At the end of this lab, you will be able to write your own applications, compile them using the MIPS cross-compiler toolchain, and simulate your design in Modelsim.
This lab (and the next two) will require much more work than the previous labs and will probably take longer than you expect, so start early.
In Lab 3, you added a control unit that supported a few basic instructions: LW, SW, ADD, ADDI, SUB, AND, OR, NOR, XOR. However, we will need to add a few more instructions to run a typical application.
This is the subset of the MIPS ISA that you will need to implement by the end of this lab. You will need to modify many parts of your datapath, as well as expanding your control unit. Depending on your implementation, you may need to modify your ALU. You will need to create new modules. We will not be providing a schematic for you for this lab. Before testing your project with the example applications below, you will be required to implement the remaining instructions. Use page B-49 of your textbook (either edition) to see how each instruction is encoded.
Branches & Jumps | Loads & Stores | Arithmetic & Logic | Shifting |
There are many types of the branch instruction, but their basic function is to change the program counter (PC) based on the evaluation of some condition. This instruction enables the if-else, while loop, and for loop contructs in typical programming languages. The jump instruction is similar in that it is capable of changing the PC, but does not require evaluating any condition. The jump-and-link instructions are used for function calls.
Your design will not include a branch delay slot.
As a reference, you should use Figure 4.24 on page 329 in the textbook (either edition). The top third of the figure includes most of the logic that you will need to implement. Be advised that the figure does not include the logic necessary to implement JR, JAL, and JALR.
The SB instruction stores the low-byte of a register to memory. The SH instruction is similar, except it stores the low two bytes of a register to memory. We have already included support in the async_memory module for the SB and SH instruction. The data_memory module has a port called size_in. You will have to modify your control unit to set this signal appropriately. It is a 2-bit input, but only accepts three possible values:
The LB instruction returns a sign-extended byte. The instructions lets you specify any byte within the four-byte word (using the last 2 bits of the address). Therefore, you must include logic for shifting the output of the data memory to be in the least-significant position and sign-extending it. In other words, if I request the byte at address 0x03 and the word at that location is 0x80000000, I should return 0xFFFFFF80. The LH word is similar except it returns a shifted, sign-extended two-byte half instead of a single byte.
The LBU instruction is similar to the LB instruction, except the byte is zero-extended. Therefore, in the previous example, if I request the byte at address 0x03 and the word at that location is 0x80000000, I should return 0x00000080. The LHU instruction acts similarly except it returns a half-word (two bytes) instead of a byte.
Your design will not include a load delay slot.
By and large, the shifting functions are similar to the standard R-type. The only difference is what you multiplex into the "A" ALU operand. Depending on whether you're doing a variable or constant shift, you will need to multiplex in either: the value read from the register specified by bits 21-25 of your instruction, or the value pertaining to bits 6-10 of your instruction.
In order to simulate the programs you will create for this lab, you should follow the same instructions you used for Lab 3. Namely, you should remember to set the INIT_PROGRAM parameter for the instruction ROM and data memory.
A cross-compiler has been created for you and is located on ieng6.ucsd.edu. Many of you already have accounts to this machine, so you can use your regular username and password. If you do not have an account, please contact us on the staff mailing list.
After you have logged into the ieng6.ucsd.edu, run the following commands:
$ prep cs141sThis will download and extract the files you will need for this lab. Here is a listing of the important files. Take a look around at what is going on in each file.
You can compile the programs by running 'make' in that directory. If you look at the contents of the directory, you will see that each of the programs has produced the appropriate memh files needed for simulation. It will also produce a *.dis file, which is the assembly output of the compiler.
If you wish to create your own application, feel free to copy fib.c or helloworld.c as a starting point. Use the nbhelloworld.s as a starting point for creating a new program in assembly. You will have to edit the top few lines of the makefile to make sure your program gets compiled. The instructions are located at the top of Makefile, but they are repeated here for completeness.
#For each application you want to build, add a new line such as:If I wanted to add a new program "sort" that is located in sort.c and uses the lib141 library, I would make the following changes to the top of the Makefile:
#For each application you want to build, add a new line such as:Now that we have described the cross-compiler, you can use it to create your own test programs. In this part of the lab, you will be required to test each of your new instructions. You will want to write your test programs in assembly in this stage to guarantee that the compiler produces the correct commands.
You may write programs that test a single instruction or that test multiple instructions, but be sure to test every instruction (including the instructions from the previous lab). Don't forget to add your programs to the Makefile as described in the previous section. If you do not see the memh files that are prefixed with your program name after running 'make', it is probably because you forgot to add the correct targets to the Makefile.
When you are satisfied that your control unit is behaving correctly, its time to move on to testing larger programs.
In Lab 3, you ran a simplified version of "Hello World" that avoided the use of any branch or jump instructions. We were able to do this by not testing the serial interface to see if it was ready to transmit data. Now that we have branches, however, we can test the serial interface by reading the serial output status register.
The files in lab4-file.zip configure the compiler to not place useful instructions in the branch delay slot or load delay slots. Instead, it always put a nop in the slot. We have configured the compiler this way because the processor you will build during this lab does not have a branch delay or load delay slot. Instead, branches occur immediately, and the results of loads are available immediately.
In this section, you will experiment with benchmarking two different version of a program that prints out the first 10 Fibonacci numbers (0,1,1,2,3,5,8,13,21,34): one that computes the result iteratively and one that uses recursion. We have already provided an initial implementation of an iterative version of this program called fib.c in the zip file you extracted. Feel free to modify this to make it faster. Write a new version of the program that uses recursion to calculate the sequence. Compile and simulate both versions and record your results in the table for Question 6.
Now we will investigate compiler optimizations. Open the Makefile and find/edit the following line:
CCFLAGS=-fno-builtin -nodefaultlibs -nostdlib -c -mips1 -mno-abicalls -static -O4This will turn on some compiler optimizations. Run a 'make clean' to remove previous versions of your programs and re-compile and simulate. Record your new results in the table for Question 6.
Iterative Fib No Opts | Recursive Fib No Opts | Iterative Fib w/Opts | Recursive Fib w/Opts | |
Number of Instructions (excluding NOPs) in *.dis file |
||||
Number of Cycles | ||||
Cycles Per Instruction (CPI) |
Deliverable |
|
Due: May 06 |