Lab 3b: Materialize Your Processor - Control

CSE 141L, Spring 2007, Donghwan Jeon

Due 5/29(Tu)

You should work on this lab with your team from lab 2.


Updates

Overview

Finally, it is time to make your processor work. In lab 3b, you will implement the control logic of the processor on top of the datapath you built in lab 3a, and verify it with behavioral and post-route simulations. First, you will write your own testbench, which tests all the instructions your ISA supports. Then your processor will run our official benchmark - SuperGarbage. You will be provided with an automated test environment to easily load applications into your processor and evaluate their performance.

Deliverables

Control Logic

Q1. Complete your design by implementing the control logic. Keep your Verilog as clean as possible. Naming conventions, proper indentation, and style will be graded. Include all the source files in your hardcopy report.

Instruction Test

The first test your processor has to pass is a simple instruction test. Write a simple assembly program that uses every instruction in your ISA except in and out; those two will be tested in the SuperGarbage benchmark with the provided testing environment. Generate a COE file either manually or with your assembler, and verify in a behavioral simulation that your processor handles all the instructions correctly. Keep your test program as concise as possible while still covering all the instructions.

Q2. Include your test assembly file in the hardcopy report, and explain your testing strategy. Make sure that it covers all the instructions in your ISA. Also include the behavioral simulation result in the report, and explain the simulation result.
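One way to double-check the coverage requirement in Q2 is a small host-side script that lists the mnemonics your test program never uses. The sketch below is only an illustration: the mnemonic names, the comment markers ('#' and '//'), and the 'label:' syntax are assumptions, so adapt them to your own assembly language.

```python
import re

def missing_mnemonics(asm_text, isa_mnemonics):
    """Return the ISA mnemonics that never appear in the assembly text."""
    used = set()
    for line in asm_text.splitlines():
        # Assumed syntax: '#' or '//' starts a comment, 'label:' may prefix a line.
        line = re.split(r"#|//", line, maxsplit=1)[0]
        line = re.sub(r"^\s*\w+:", "", line).strip()
        if line:
            used.add(line.split()[0].lower())
    return sorted(set(isa_mnemonics) - used)

# Hypothetical ISA and test program, for illustration only.
isa = {"add", "sub", "load", "store", "beq", "in", "out"}
program = """
start:  add  r1, r2     # arithmetic
        sub  r1, r3
        load r4, 0(r1)  // memory
        beq  r1, start
"""
# in and out are excluded here, as in the Q2 instruction test.
print(missing_mnemonics(program, isa - {"in", "out"}))  # ['store']
```

An empty list means every mnemonic appears at least once. It does not show that each instruction was executed or produced the right result, which still has to be checked in the behavioral simulation.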

SuperGarbage

In lab 3b, the main test your processor has to pass is SuperGarbage. As you might have already noticed, SuperGarbage is a simple virtual machine. As long as your SuperGarbage implementation is correct, you will be able to run any SuperGarbage application regardless of your ISA. This allows us to provide the class with standard applications that run on everybody's ISA, saving us all a significant amount of effort. You are provided with the io_devices.v module, which supplies SuperGarbage applications to your processor. It also provides a clock counter for performance evaluation.

io_devices.v

Download the following two files, and add them to your Xilinx project.

You might need to change the top.v file if you have modified the core interface. As you can see in top.v, the device module communicates with your core module via the I/O interface. The interface of the io_devices module is as follows:

module io_devices#(parameter D_WIDTH = 34, PA_WIDTH = 4)
(
	input 	reset,
	input 	clk,
	
	input	read_req,
	input	write_req,
	input	[PA_WIDTH-1 : 0] read_addr,
	input	[PA_WIDTH-1 : 0] write_addr,
	input	[D_WIDTH-1 : 0] din,
	output	[D_WIDTH-1 : 0] dout,
	output	read_ack,
	output	write_ack
);

The io_devices module provides the following services via I/O channels:

Input Channels

Output Channels

Q3. Now you can test the in and out instructions with the given io_devices.v module. Modify your assembly test program from Q2 so that it also tests in and out. The simplest way to test those instructions is to manipulate the clock counter: set a counter value using output channel #2, then request the clock counter value to see whether it was successfully updated. Include the behavioral simulation result in the report, and explain it.

Write a simple loader

To run a SuperGarbage application, your processor must first load the application into the data memory. The figure below shows the format of the SuperGarbage binary file and how a loader works. After selecting the application to load, the loader requests the selected SuperGarbage binary file via input channel #1, word by word. For every two words, the loader stores the second word at the address indicated by the first word. In the figure below, the loader first reads address 3 and data 0, so it stores data 0 into address 3. The loader continues loading data until the address it reads is -1 (0x3FFFFFFFF), which marks the end of the file. The data word paired with address -1 is the entry point of the SuperGarbage program. After loading an application, the loader should call the SuperGarbage VM with the starting address of the SuperGarbage memory and the entry point of the program.
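The end-of-file marker 0x3FFFFFFFF is simply -1 written as a 34-bit two's-complement word, which matches the default D_WIDTH = 34 of the io_devices module. The following sketch shows the conversion in both directions; the helper names to_word and to_signed are made up for this illustration, and it assumes your data words are 34 bits wide.

```python
D_WIDTH = 34                 # data width, taken from the io_devices parameter list
MASK = (1 << D_WIDTH) - 1    # 34 one-bits

def to_word(value):
    """Encode a signed integer as a D_WIDTH-bit two's-complement word."""
    return value & MASK

def to_signed(word):
    """Decode a D_WIDTH-bit word back into a signed integer."""
    return word - (1 << D_WIDTH) if word >> (D_WIDTH - 1) else word

print(hex(to_word(-1)))        # 0x3ffffffff
print(to_signed(0x3FFFFFFFF))  # -1
```

In the loader itself, this means the end-of-file check can be written either as a comparison against the constant 0x3FFFFFFFF or as a signed comparison against -1, whichever is cheaper in your ISA.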

The following Java-style pseudo-code shows how to load a program from the io_devices module and start the SuperGarbage VM. You can choose among the three provided SuperGarbage applications by changing the num_app variable in the pseudo-code. You should write a loader in your assembly language, build the COE files, and generate the memory module. If possible, verify your loader in the simulator you wrote in lab 2.

// SuperGarbage Loader
// num_app selects the application:
// 0: app0.bin
// 1: app1.bin
// 2: app2.bin

final int num_app = 0;

word mem[4096];
word startPC;

// basic code stub
set $SP;
set $GP;

// select an app
out(num_app, 1);

// load (address, data) pairs from the external device until addr == -1
do {
    word addr, data;
    addr = in(0x1);
    data = in(0x1);
    
    if (addr == 0x3FFFFFFFFL) {
        startPC = data;
        break;
        
    } else {
        mem[addr] = data;
    }
} while(true);

// Finally, call SuperGarbage VM
SuperGarbage(startPC, mem);
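Before writing the loader in assembly, it can help to model the loop on a host machine. The following Python sketch mirrors the pseudo-code above under a few assumptions: the word stream is passed in as a plain list that stands in for repeated in(0x1) requests, and the example stream is made up (only its first pair, address 3 and data 0, comes from the description above).

```python
END_OF_FILE = 0x3FFFFFFFF  # -1 as a 34-bit word, the end-of-file address
MEM_WORDS = 4096           # size of mem[] in the pseudo-code

def load(words):
    """Model the loader: consume (address, data) pairs until the end marker."""
    mem = [0] * MEM_WORDS
    stream = iter(words)          # stands in for repeated in(0x1) requests
    while True:
        addr = next(stream)
        data = next(stream)
        if addr == END_OF_FILE:
            return data, mem      # data paired with -1 is the entry point
        mem[addr] = data

# Hypothetical binary: two stores, then the end marker with entry point 3.
start_pc, mem = load([3, 0, 4, 7, END_OF_FILE, 3])
print(start_pc, mem[3], mem[4])  # 3 0 7
```

Note that both words of a pair are read before the address is checked, exactly as in the pseudo-code, so the entry point is always consumed together with the end marker.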

SuperGarbage Applications

In lab 3b, you will use the following three SuperGarbage applications to test your design. Put them in your Xilinx project directory. Applications get their inputs from input channel #3, which is fed by the input file. If you want to change the input data, modify the input file for the corresponding application.

Benchmark     Assembly File   Binary File   Input File   Counter Input File
Compare       app0.s          app0.bin      app0.in      app0.cnt
Fibonacci     app1.s          app1.bin      app1.in      app1.cnt
Bubble Sort   app2.s          app2.bin      app2.in      app2.cnt

Q4. Run your loader with the first application, and make sure that your loader is working. Include your loader assembly source file in the hardcopy report.

SuperGarbageSim

Although the SuperGarbage VM is very simple, SuperGarbage code is not easy to understand. To help you understand the behavior of SuperGarbage applications, you are provided with SuperGarbageSim, a reference simulator for SuperGarbage applications. SuperGarbageSim can load and execute any SuperGarbage application binary file. You can also monitor or modify the contents of memory and the program counter. SuperGarbageSim supports the following commands:

Download the following Java files and compile them.

You can start SuperGarbageSim with the following command:

 prompt> java SuperGarbageSim 

Here are a few examples of SuperGarbageSim commands:

 prompt> load test1.bin      // load 'test1.bin' file
 prompt> disasm              // disassembles 10 instructions from PC
 prompt> disasm 0x10 15      // disassembles 15 instructions from 0x10 
 prompt> set_mem 14 0x0      // set the memory location 14(0xe) to 0x0 

SuperGarbage Performance

All three provided SuperGarbage applications measure their execution times by using the counter service implemented in io_devices.v. They request the counter value at the beginning and at the end of the execution and take the difference. Then, they send the resulting number of cycles to the debug output channel (output #3) so that you can easily see it.

Q5. Run all three benchmark applications and get the number of execution cycles for each application. Also get the dynamic instruction count for each benchmark, either from the 'count' command of your ISA simulator or by adding an instruction counter to your processor that counts the number of executed instructions. What is the CPI (cycles per instruction) for each benchmark? Fill out the following table. Why does CPI vary across applications?

 
                                Benchmark 0   Benchmark 1   Benchmark 2
# of Cycles for the Execution
Dynamic Instruction Count
CPI (Cycles per Instruction)
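The CPI row is the cycle count divided by the dynamic instruction count. A minimal sketch of the arithmetic follows; the numbers are made up for illustration and are not measured results.

```python
def cpi(cycles, instructions):
    """Cycles per instruction: execution cycles / dynamic instruction count."""
    return cycles / instructions

# Made-up numbers: 12,000 cycles spent executing 8,000 instructions.
print(round(cpi(12_000, 8_000), 2))  # 1.5
```

For the ratio to be meaningful, both numbers should cover the same stretch of execution. If your instruction count also includes the loader, for example, it will not match a cycle count that the application measures only between its two counter reads.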

Evaluation

Now that you have passed two important tests, let's evaluate the performance of your processor. Perform a post-route simulation with the testbench you wrote in Q2.

Q6. What is the maximum achievable frequency of your processor? Include a screen capture of the post-route simulation result at that frequency in your hardcopy report. Calculate the execution time for all three benchmarks.
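The execution time asked for in Q6 follows from the cycle counts in Q5: time = cycles / frequency, or equivalently cycles × clock period. A minimal sketch, with a made-up cycle count and clock frequency:

```python
def execution_time_s(cycles, f_max_hz):
    """Execution time in seconds for a cycle count at a given clock frequency."""
    return cycles / f_max_hz

# Made-up numbers: 12,000 cycles at a 50 MHz clock.
t = execution_time_s(12_000, 50e6)
print(f"{t * 1e6:.1f} us")  # 240.0 us
```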

Q7. What is the critical path of your processor? Draw the critical path on the datapath schematic you made in lab 3a.

Q8. Propose a way to improve the performance of your processor. Why do you think it would be effective?

Tips