The figure below shows write operations in dmem.v. All memory locations are initialized to 0x0 before these operations. The write to address 0x5 was refused. Also note that dout stays at 0x3FFFFFFFF because no read request was made. In your processor implementation, you must repeat a refused request until it is accepted.
The figure below shows read operations in dmem.v. As you might expect, a read from 0x5 returns the initial value 0x0, because the earlier write to that address was refused. The read request to 0x6 is refused, so dout is driven to 0x3FFFFFFFF.
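The retry rule above can be illustrated with a small software model. This is only a sketch in Java, not part of dmem.v: the class ToyMemory, its refusalsLeft counter, and the method writeUntilAccepted are hypothetical names invented for the illustration, and the way a refusal is reported (a boolean here) is an assumption about the real interface.

```java
// Toy model of a memory port that can refuse requests (hypothetical, for illustration only).
class RetryDemo {
    static class ToyMemory {
        final long[] cells = new long[16]; // all locations start at 0x0
        int refusalsLeft;                  // assumption: the next N requests are refused

        ToyMemory(int refusalsLeft) {
            this.refusalsLeft = refusalsLeft;
        }

        // Returns true if the write was accepted, false if it was refused.
        boolean write(int addr, long data) {
            if (refusalsLeft > 0) {
                refusalsLeft--;
                return false;              // refused: memory contents are unchanged
            }
            cells[addr] = data;
            return true;
        }
    }

    // Re-issues the same write until the memory accepts it; returns the number of attempts.
    static int writeUntilAccepted(ToyMemory mem, int addr, long data) {
        int attempts = 1;
        while (!mem.write(addr, data)) {
            attempts++;
        }
        return attempts;
    }

    public static void main(String[] args) {
        ToyMemory mem = new ToyMemory(2);
        int attempts = writeUntilAccepted(mem, 5, 0x2AL);
        System.out.println("accepted after " + attempts + " attempts");
    }
}
```

In hardware the same idea usually means holding the request and its address and data stable, and stalling the pipeline or state machine, until the request is accepted.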
Q1. Complete your design by implementing the control logic. Keep your Verilog as clean as possible. Naming conventions, proper indentation, and style will be graded. Include all the source files in your hardcopy report.
The first test your processor has to pass is a simple instruction test. Write a simple assembly program that uses every instruction in your ISA except in and out, which will be tested in the SuperGarbage benchmarks with a provided testing environment. Generate a COE file either manually or with your assembler, and verify in a behavioral simulation that your processor handles all the instructions correctly. Try to keep your test program as concise as possible while still covering all the instructions.
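If your assembler does not emit COE files yet, a small helper can do the formatting. The sketch below writes the standard two-keyword COE layout (memory_initialization_radix and memory_initialization_vector) in hexadecimal. The class name CoeWriter, the method toCoe, and the choice of 9 hex digits per word are assumptions for illustration; use the width that matches your own instruction memory.

```java
// Formats machine words as the text of a Xilinx COE memory-initialization file.
class CoeWriter {
    // words: non-negative values that fit in hexDigits hex digits (assumption).
    // hexDigits: digits printed per word, e.g. 9 for words of up to 36 bits.
    static String toCoe(long[] words, int hexDigits) {
        if (words.length == 0) {
            throw new IllegalArgumentException("at least one word is required");
        }
        StringBuilder sb = new StringBuilder();
        sb.append("memory_initialization_radix=16;\n");
        sb.append("memory_initialization_vector=\n");
        for (int i = 0; i < words.length; i++) {
            sb.append(String.format("%0" + hexDigits + "x", words[i]));
            // Entries are comma-separated; the last one ends with a semicolon.
            sb.append(i == words.length - 1 ? ";\n" : ",\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder words, not a real program.
        System.out.print(toCoe(new long[] {0x1L, 0x3FFFFFFFFL}, 9));
    }
}
```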
Q2. Include your test assembly file in the hardcopy report, and explain your testing strategy. Make sure that it covers all the instructions in your ISA. Also include the behavioral simulation result in the report, and explain the simulation result.
Download the following two files, and add them to your Xilinx project.
You might need to change the top.v file if you have modified the core interface. As you can see in top.v, the io_devices module communicates with your core module via the I/O interface. The interface of the io_devices module is as follows:
module io_devices #(parameter D_WIDTH = 34, PA_WIDTH = 4) (
    input reset,
    input clk,
    input read_req,
    input write_req,
    input [PA_WIDTH-1 : 0] read_addr,
    input [PA_WIDTH-1 : 0] write_addr,
    input [D_WIDTH-1 : 0] din,
    output [D_WIDTH-1 : 0] dout,
    output read_ack,
    output write_ack
);
The io_devices module provides the following services via I/O channels:
Input Channels
Output Channels
Q3. Now you can test the in and out instructions with the given io_devices.v module. Modify your assembly test program from Q2 so that it tests in and out. The simplest way to test these instructions is to manipulate the clock counter: set a counter value using output channel #2, then request the clock counter value to see whether it was successfully updated. Include the behavioral simulation result in the report, and explain it.
To run a SuperGarbage application, your processor must first load the application into the data memory. The figure below shows the format of the SuperGarbage binary file and how a loader works. After selecting the application to load, the loader requests the selected SuperGarbage binary file via input channel #1, word by word. The file is a sequence of word pairs: for each pair, the loader stores the second word at the address given by the first word. In the figure below, the loader first reads address 3 and data 0, so it stores data 0 at address 3. The loader keeps loading until it reads the address -1 (0x3FFFFFFFF), which marks the end of the file. The data word paired with address -1 is the entry point of the SuperGarbage program. After loading an application, the loader should call the SuperGarbage VM with the starting address of the SuperGarbage memory and the entry point of the program.
The following Java-style pseudo-code shows how to load a program from the io_devices module and start the SuperGarbage VM. You can choose among the three provided SuperGarbage applications by changing the num_app variable in the pseudo-code. You should write a loader in your assembly language, build the COE files, and generate the memory module. If possible, verify your loader in the simulator you wrote in lab 2.
// SuperGarbage Loader
// num_app
// 0: app0.bin
// 1: app1.bin
// 2: app2.bin
final int num_app = 0;
word mem[4096];
word startPC;
// basic code stub
set $SP;
set $GP;
// select an app
out(num_app, 1);
// load data from an external device until addr == -1
do {
    word addr, data;
    addr = in(0x1);
    data = in(0x1);
    if (addr == 0x3FFFFFFFFL) {
        // end of file: data holds the entry point
        startPC = data;
        break;
    } else {
        mem[addr] = data;
    }
} while(true);
// Finally, call SuperGarbage VM
SuperGarbage(startPC, mem);
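The pair-by-pair loading loop can also be checked on a host machine before it is translated to assembly. The following is a minimal runnable Java sketch of that loop only. The class LoaderSketch, the method load, and the use of an Iterator<Long> to stand in for input channel #1 are assumptions made for this illustration; the stack setup, the application select, and the call into the VM are omitted.

```java
import java.util.Iterator;
import java.util.List;

// Host-side model of the loader loop: consumes (address, data) word pairs.
class LoaderSketch {
    static final long END_MARKER = 0x3FFFFFFFFL; // -1 as a 34-bit word

    // Stores each data word at its address until the end marker is read,
    // then returns the entry point that is paired with the end marker.
    static long load(Iterator<Long> channel, long[] mem) {
        while (true) {
            long addr = channel.next();
            long data = channel.next();
            if (addr == END_MARKER) {
                return data;
            }
            mem[(int) addr] = data;
        }
    }

    public static void main(String[] args) {
        long[] mem = new long[4096];
        // Made-up stream: store 0 at address 3, store 9 at address 4, entry point 4.
        List<Long> words = List.of(3L, 0L, 4L, 9L, END_MARKER, 4L);
        long startPC = load(words.iterator(), mem);
        System.out.println("entry point = " + startPC);
    }
}
```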
In lab 3b, you will use the following three SuperGarbage applications to test your design. Put them in your Xilinx project directory. The applications get their inputs from input channel #3, which is fed by the input file. If you want to change the input data, modify the input file for the corresponding application.
| Benchmark | Assembly File | Binary File | Input File | Counter Input File |
|---|---|---|---|---|
Q4. Run your loader with the first application, and make sure that your loader is working. Include your loader assembly source file in the hardcopy report.
Download the following Java files and compile them.
You can start SuperGarbageSim with the following command:
prompt> java SuperGarbageSim
Here are a few examples of SuperGarbageSim commands:
prompt> load test1.bin // load 'test1.bin' file
prompt> disasm // disassembles 10 instructions from PC
prompt> disasm 0x10 15 // disassembles 15 instructions from 0x10
prompt> set_mem 14 0x0 // set the memory location 14(0xe) to 0x0
Q5. Run all three benchmark applications and get the number of execution cycles for each application. Also get the dynamic instruction count for each benchmark, either from the 'count' command of your ISA simulator or by adding a counter to your processor that counts executed instructions. What is the CPI (cycles per instruction) for each benchmark? Fill out the following table. Why does CPI vary across applications?
| | Benchmark 0 | Benchmark 1 | Benchmark 2 |
|---|---|---|---|
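For Q5, CPI is the total cycle count divided by the dynamic instruction count. The short Java sketch below shows the arithmetic only. The class name CpiSketch is hypothetical, and the numbers in main are placeholders, not measurements from any benchmark.

```java
// Cycles per instruction from two measured counts.
class CpiSketch {
    static double cpi(long cycles, long instructions) {
        if (instructions <= 0) {
            throw new IllegalArgumentException("instruction count must be positive");
        }
        return (double) cycles / instructions;
    }

    public static void main(String[] args) {
        // Placeholder counts: 1500 cycles for 1000 executed instructions.
        System.out.println(cpi(1500, 1000));
    }
}
```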
Q6. What is the maximum achievable frequency of your processor? Include a screen capture of the post-route simulation result at that frequency in your hardcopy report. Calculate the execution time for all three benchmarks.
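Execution time follows from the cycle count and the clock frequency: time = cycles / frequency, or equivalently cycles times the clock period. The Java sketch below shows this calculation. The class name ExecTimeSketch and the values in main are placeholders for illustration, not results for any real design.

```java
// Execution time in seconds from a cycle count and a clock frequency in Hz.
class ExecTimeSketch {
    static double seconds(long cycles, double frequencyHz) {
        if (frequencyHz <= 0) {
            throw new IllegalArgumentException("frequency must be positive");
        }
        return cycles / frequencyHz;
    }

    public static void main(String[] args) {
        // Placeholder values: 50,000,000 cycles at 50 MHz.
        System.out.println(seconds(50_000_000L, 50e6) + " s");
    }
}
```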
Q7. What is the critical path of your processor? Draw the critical path on the datapath schematic you made in lab 3a.
Q8. Propose a way that will improve the performance of your processor. Why do you think it is an effective way to boost the performance?