CSE 143 - Resources Page
CSE building 3219 (the embedded lab) will have some DE2 boards and
Quartus2 + Modelsim + NIOS II installed. Ask Robyn to have your card programmed to access the room.
Lab hours will vary each week, so keep checking this page for the latest hours.
Contact tweng at ucsd for additional information.
Lab hours for week of 5/17-23: wed 4-6pm, thurs 12-7, fri 1-7
Part 3 will be really easy. Here we have an encoder function that we run many times throughout our program. It takes two values (A and B), multiplies A and B together to get C, then multiplies C by A. (Yes, this is really trivial.) Because we do this many times, it actually takes up a lot of time in our software, so it is a good idea to make it a custom instruction. This will be a combinational custom instruction: the module has two inputs, A and B, and returns a result. It takes only one clock cycle, so we do not have to worry about the clock.
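As a rough illustration of the encoder described above, here is a minimal software sketch. The function name encoder_sw and the 32-bit unsigned types are assumptions for this example; use whatever names and types the template file gives you.

```c
#include <stdint.h>

/* Software model of the encoder: C = A * B, then result = C * A.
 * The name and the 32-bit unsigned types are assumptions for this sketch. */
uint32_t encoder_sw(uint32_t a, uint32_t b)
{
    uint32_t c = a * b;   /* first multiply: C = A * B */
    return c * a;         /* second multiply: result = C * A */
}
```

For example, encoder_sw(3, 4) computes C = 12 and returns 36. The hardware version should return the same value for the same inputs.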
Step 1: View this video, specifically the part under custom instructions, so 157+.
Step 2: Here is the software template file. Set the system up similar to part 1. Make the software function work first. Then take this VHDL template and implement the same thing in hardware.
Step 3: Attach it to the CPU as a custom instruction. The video demo does the same, and there are also step-by-step PDF instructions. Look at lab 5. Theirs is a CRC, which is more complicated than ours; ours is really easy to attach. Note that the video points out that sometimes you have to add the instruction twice because of a bug where it doesn't display. So if you don't see it after the first time, add the custom instruction again.
Custom instruction manual. There is a weird bug with this: after you add the custom instruction to your NIOS II processor core, it doesn't show on your screen. To get around this, close the SOPC builder and reopen it, go back to the processor core, and repeat until you see it on the SOPC builder screen. If you still don't see your custom instruction, right click there and go to show all; sometimes that works.
Step 4: After regenerating your processor, close and reopen your NIOS II IDE. You now want to edit your C file to include the custom instruction. Open up system.h (on the right side of the IDE you should see the headers listed) and scroll down until you see it defined as a macro. It will look something like:
#define ALT_CI_ENCODER_INST_N 0x00000000
#define ALT_CI_ENCODER_INST(A,B) __builtin_custom_inii(ALT_CI_ENCODER_INST_N,(A),(B))
You can use the instruction like that in your code. Complete the code, run it again, and compare the time differences. Copy and paste the results, and that will be the result file you submit!
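One way to keep the same C file working both with and without the custom instruction is to pick the implementation at compile time. This is only a sketch: BUILD_FOR_NIOS2 is a hypothetical flag you would define yourself for the board build, encoder and encoder_soft are made-up names, and the macro name in your system.h depends on what you called the component in SOPC builder.

```c
#include <stdint.h>

/* system.h is generated for your design; it only exists in the board build.
 * BUILD_FOR_NIOS2 is a hypothetical flag, not something the tools define. */
#ifdef BUILD_FOR_NIOS2
#include "system.h"
#endif

/* Plain software version: (A * B) * A. */
uint32_t encoder_soft(uint32_t a, uint32_t b)
{
    return (a * b) * a;
}

/* Uses the custom instruction when its macro is visible, else software. */
uint32_t encoder(uint32_t a, uint32_t b)
{
#ifdef ALT_CI_ENCODER_INST
    return (uint32_t)ALT_CI_ENCODER_INST(a, b);
#else
    return encoder_soft(a, b);
#endif
}
```

Calling encoder(3, 4) returns 36 in the software build, and should return the same value on the board if the hardware is correct. That makes it easy to check the custom instruction against the software result before you compare the timings.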
Here is a general gist for part 2 and some files I made. Basically, you want to first design your matrix operator in VHDL. You want to design this as RTL, so you want both a control path and a datapath. You are free to design this however you want, but this is what I did: The control path determines when to enable the matrix modder, and also keeps track of how many iterations we have done so far. The datapath consists of the matrix multiplier, the data registers (which store the data in between iterations), the modder unit, and the output register (which will output the data when the data is ready). The top level "super matrix" VHDL code will then connect these parts together. The overall connections will generally look like:
[ data registers ] -> [ matrix mult ] -> [ matrix modder ] -> (back to data register) and [output register]
--- control line telling matrix modder to mod
--- control line telling output register to load the output of the matrix modder
The supermatrix will have as inputs clk (to pass to the registers and control unit), the matrix (which is represented by a linear array of 9 elements, with each element being an 8-bit std_logic_vector), and a start bit, which tells our system to start.
This will initialize the system by loading the data registers with the initial 9 values, and start up the control unit too by sending it the clock signal. The outputs are a done bit and the 32-bit result.
We need to make this into a component that SOPC can understand and interface with the CPU, so we essentially wrap it in the proper format. This consists of the Avalon bus, which is described in the Avalon Interface Spec. Basically, though, we transfer data over the bus via specified pins that the component must conform to. We will cheat a little and pretend that the 9th value of the matrix we pass in is 0. We do this because then we can pass in 8 values of 8 bits each and need a width of only 64 bits. The real way to do this is to clock in the data, or we could have set a data width of 128 bits, but we will leave it at 64 for simplicity's sake. The input to our component is called "write_data" and the output back out is called "read_data". This is kind of confusing, but these names are based on the master: the master writes data to the component and reads it from the component. Another line that we use is the begintransfer line, which is 1 when we start transferring data.
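To make the 64-bit trick concrete, here is a small sketch of packing the first 8 matrix elements into one 64-bit word and unpacking them again with the 9th element forced to 0. The function names are made up, and the byte order (element 0 in the least significant byte) is an assumption; your VHDL wrapper has to slice write_data the same way your software packs it.

```c
#include <stdint.h>

#define MATRIX_ELEMS 9   /* 3x3 matrix stored as a linear array */
#define BUS_ELEMS    8   /* only 8 elements fit in the 64-bit write_data */

/* Pack elements 0..7 into one 64-bit word. Element 0 goes in the least
 * significant byte (an assumption). Element 8 is not sent at all. */
uint64_t pack_matrix(const uint8_t m[MATRIX_ELEMS])
{
    uint64_t word = 0;
    for (int i = 0; i < BUS_ELEMS; i++)
        word |= (uint64_t)m[i] << (8 * i);
    return word;
}

/* Reverse of pack_matrix: the 9th element is pretended to be 0. */
void unpack_matrix(uint64_t word, uint8_t m[MATRIX_ELEMS])
{
    for (int i = 0; i < BUS_ELEMS; i++)
        m[i] = (uint8_t)(word >> (8 * i));
    m[MATRIX_ELEMS - 1] = 0;
}
```

With this layout, the matrix {1,2,3,4,5,6,7,8,0} packs to 0x0807060504030201.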
How this generally will work is that when the component gets a "begintransfer" from the Avalon bus, we send the "Start" signal to the supermatrix, which will then start that. We will also on the same clock cycle send in the matrix (which consists of 9 8-bit std_logic vectors). We wait until the supermatrix is finished, which we get from the "done" output from the supermatrix. When that goes high, we know we are done, and can retrieve the final result. We then can send that result back through the Avalon bus.
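The handshake above can be modeled one clock at a time in C, which is sometimes easier to reason about before writing the VHDL. Everything here is a hypothetical stand-in: the type and function names are made up, the 4-cycle latency is arbitrary, and the "result" is just the low byte of the input as a placeholder, not the real matrix function. Ignoring begintransfer while busy is also my assumption.

```c
#include <stdbool.h>
#include <stdint.h>

enum { LATENCY = 4 };   /* assumed number of clocks the core needs */

/* Placeholder for the supermatrix: start, count down, then raise done. */
typedef struct {
    int      cycles_left;
    uint64_t matrix;
    bool     done;
    uint32_t result;
} supermatrix_t;

void supermatrix_clock(supermatrix_t *s, bool start, uint64_t matrix)
{
    s->done = false;                        /* done is high for one clock only */
    if (start) {
        s->matrix = matrix;                 /* load the data registers */
        s->cycles_left = LATENCY;
    } else if (s->cycles_left > 0 && --s->cycles_left == 0) {
        s->result = (uint32_t)(s->matrix & 0xFF);   /* placeholder result */
        s->done = true;
    }
}

/* Wrapper: begintransfer starts the core, done latches read_data. */
typedef struct {
    supermatrix_t core;
    bool          busy;
    bool          valid;       /* read_data holds a finished result */
    uint32_t      read_data;
} wrapper_t;

void wrapper_clock(wrapper_t *w, bool begintransfer, uint64_t write_data)
{
    bool start = begintransfer && !w->busy;  /* ignore requests while busy */
    supermatrix_clock(&w->core, start, write_data);
    if (start) {
        w->busy = true;
        w->valid = false;
    }
    if (w->core.done) {
        w->read_data = w->core.result;
        w->busy = false;
        w->valid = true;
    }
}
```

Clock the wrapper once with begintransfer high and the packed matrix on write_data, then keep clocking with begintransfer low until valid goes high. At that point read_data holds the result to send back over the bus.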
- The only thing relevant is chapter 3, specifically the bus interface and the bus timings.
Here are some files I was working on. They aren't complete at all and I haven't managed to test them. I'll try to get back to them later, but it is a start. Files
5/30 - Added instructions and template files for part 1. A slight change to the function - we mod it by 200 every 4 iterations instead of 10.
5/28 - Get started on learning the overall flow of this lab. Take a look at the tutorials I linked to. There are only 5 of you in this class, so there will probably be some change up as this lab progresses. I'll try to make sure everyone knows what to do, and might change the parameters as needed, so check this page often. I'm hoping to make each step pretty clear, but you guys are all taking this class (and have already taken 140 and 141) which means you guys should be pretty decent with using EDA tools and following the tutorials. If you have any issues, let me know though! I also TA the CS140 class so it will be nice to talk to people that aren't confused by an ALU.
5/28 - The DE2 boards are in the lab. There are 5-6 of them, so all of you should be able to use them. Make sure you get your card key programmed so you can access the lab anytime. You can also do much of this from home, just install NIOS II on your computer. After next Friday, we might be able to let you take home the DE2 boards too.
5/28 - IMPORTANT! Please email me (tweng) and I'll add you to an email list for this project.
5/28 - Lab 2 has been assigned. Look at this page for the latest updates to it. It's a pretty involved lab exercise, but a lot of it will be guided. The CSE 140 class will be in the lab for the next week but after that will be gone. I'll update this as I get more material up (I'm essentially doing this lab with you guys, and want to make sure all the templates are good). For now, get started on some of the video tutorials. So far I've added a tutorial section and a design tips section. A lot of the time spent in this lab will be learning the overall flow and using the EDA tools. This is a pretty advanced lab, but I thought it was more interesting than another contrived and difficult VHDL problem, as you'll learn to really use a high level EDA design tool and gain exposure to SOC design, HW/SW codesign, embedded programming, rapid prototyping and emulation, and ASIP design.
Lab 2 Tutorials:
To get a start on this lab, it would be good to go through some of the tutorials that are related to it. Altera has a very good video tutorial series that I will link to. They also have a lot of PDF tutorials and instruction manuals. You probably don't have to go through the instruction manuals, but their guided tutorials are really useful for this lab.
To start with, the video series is here: Embedded HW curriculum. The one you want to target is Designing with the NIOS II Processor and SOPC Builder (OEMB1116). It goes through the entire sequence of what you will be doing in this lab. It says "8 hours" but it's really about 3 hours. You can go through some of the other videos in this sequence too.
Here are some of the useful PDFs and tutorials too:
Altera SOPC Tutorial - This is a very good basic tutorial on using SOPC with the DE2 board. You can try this as a first project to make sure you can get that working. Note that this tutorial isn't correct for the IDE portion, so don't follow its steps when it starts running the NIOS II IDE. Instead, follow the video tutorial for using the NIOS II IDE.
- The DE2 manual will be useful to review.
DE2 Digital Logic tutorials - a collection of the Altera tutorials on Quartus, simulation, etc.
DE2 SOPC tutorials - a collection of Altera tutorials on SOPC builder, NIOS2 introduction, etc.
Video tutorial PDF - This is the PDF that comes with the 8 hour course. It has step-by-step instructions for the major steps in this lab. Definitely review this along with the accompanying OEMB1116 video. The overall steps will be the same.
NEW! This has some good tutorials on implementing this on a DE2 board.
Lab 2 Part 1 Instructions:
Template C File for Part 1
For part 1, you will generally follow the steps laid out in the Altera SOPC tutorial. You will want to increase the on-chip memory size, because 4k is kind of small. When you go to make your NIOS II project, pick "small hello world." Having printf is really nice because you can see the output on the IDE console (even in run mode!), but the normal stdio is way too large to program into the FPGA. Also, you want to run your matrix multiplication code. The program itself is quite easy, and shouldn't take you more than a few hours to write. One thing you can do is write it on your Linux account first, because it is a lot faster to compile and run on a Unix machine than on the FPGA.
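If you want something to start from on your Linux account, a 3x3 multiply on linear 9-element arrays can be sketched like this. It is not the template's function: the name mat3_mul, the row-major layout, and the 32-bit element type are assumptions, so adjust them to match the template file.

```c
#include <stdint.h>

#define MAT_DIM 3   /* 3x3 matrices stored as linear arrays of 9 elements */

/* out = a * b, with element (row, col) stored at index row * MAT_DIM + col.
 * out must not point at the same storage as a or b. */
void mat3_mul(const uint32_t a[MAT_DIM * MAT_DIM],
              const uint32_t b[MAT_DIM * MAT_DIM],
              uint32_t out[MAT_DIM * MAT_DIM])
{
    for (int r = 0; r < MAT_DIM; r++) {
        for (int c = 0; c < MAT_DIM; c++) {
            uint32_t sum = 0;
            for (int k = 0; k < MAT_DIM; k++)
                sum += a[r * MAT_DIM + k] * b[k * MAT_DIM + c];
            out[r * MAT_DIM + c] = sum;
        }
    }
}
```

As a quick sanity check, multiplying {1,2,3,4,5,6,7,8,9} by itself gives {30,36,42,66,81,96,102,126,150}.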
After going through that simple tutorial, there are a few changes we will make to it. Remember to select the weakest processor core. The reason is that this makes multiply especially slow, which is kind of what we want. Also, you want the on-chip memory to be 45000 bytes instead of 4 kbytes, which is too small. Also, you want to add an interval timer (under peripherals -> microcontroller peripherals). Go ahead and leave it as default. Regarding the LED/switch stuff, you don't really have to do it if you don't want to, but it's cool to see it working with your DE2 board. This is up to you.
After generating this chip, make a project with a top level schematic, bring out your new SOC, and connect the clock to it (and whatever other pins you have). Set the clock to PIN_N2 in pin planner. After you do this, compile your project and program it onto the board (tools -> programmer). Make sure you have the Quartus II USB blaster driver installed. Check out the CSE 140L lab 1 tutorials for this if you need help.
Now you want to open up the NIOS II IDE. You can get to it through the SOPC builder. Click on the System Generation tab and you should see the NIOS II IDE button. Click on it and it will launch. Make sure that your PTF file matches the PTF file that was generated for your project. As a project template, feel free to select Blank or Hello World Small. Now copy and paste the contents of the template file into your newly created C file. Right click on your project, go to system library properties, and set the timestamp timer to the interval timer (e.g. timer_0) that you created in the SOPC. This is important to get the time stamp code working.
Now that you have things set up, you can go through the code and edit as necessary. Right click on your project in the project window, and go to build. It should compile. When you are ready to run it, right click again and go to run on hardware. This will program your code onto the FPGA (assuming that you are still connected to it through Quartus II) and will display the results of the printf commands on the console. You will be able to see your values and the time it took to run your function there. Note that it is important that you have programmed the FPGA and that you are currently connected to it (the "Click Cancel to stop using OpenCore Plus IP" window is up in Quartus II). If this isn't the case, then you will get errors if you even try to build your program.
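The template file already has time stamp code, but in case it helps to see the idea in isolation, here is a rough sketch of timing a function with the HAL timestamp driver. BUILD_FOR_NIOS2 is a hypothetical flag you would define yourself for the board build; without it the sketch falls back to clock() so it also runs on a Linux machine. The names time_ticks, sample_work, and sample_sink are made up for this example.

```c
#include <stdint.h>

#ifdef BUILD_FOR_NIOS2                  /* hypothetical flag, define it yourself */
#include "sys/alt_timestamp.h"          /* HAL timestamp driver */
static int      ticks_start(void) { return alt_timestamp_start(); }
static uint32_t ticks_now(void)   { return (uint32_t)alt_timestamp(); }
#else
#include <time.h>                       /* host fallback for testing on Linux */
static int      ticks_start(void) { return 0; }
static uint32_t ticks_now(void)   { return (uint32_t)clock(); }
#endif

volatile uint32_t sample_sink;          /* keeps the loop from being optimized out */

/* Stand-in workload: sums 0..999 (replace with your matrix function). */
void sample_work(void)
{
    uint32_t sum = 0;
    for (uint32_t i = 0; i < 1000; i++)
        sum += i;
    sample_sink = sum;
}

/* Returns elapsed timer ticks for one call of fn. Also returns 0 if the
 * timestamp timer could not be started (e.g. no timer was selected). */
uint32_t time_ticks(void (*fn)(void))
{
    if (ticks_start() < 0)
        return 0;
    uint32_t t0 = ticks_now();
    fn();
    return ticks_now() - t0;
}
```

On the board, the tick count only means something if the timestamp timer is set in the system library properties as described above.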
You can also go into debug mode and set breakpoints and watchlists if you are having trouble with your program. Normally we could go into co-simulating this with both software and hardware on Modelsim, but because there is nothing interesting, we will just skip this step. Once you are here, you are finished with part 1.
Deliverables for part 1 will then be a screenshot of your SOPC builder showing the SOC you have designed, your code files, and the console output. Also convert the time stamp values and the result values from hex to decimal (just use a calculator).
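If you would rather not punch the hex values into a calculator, a few lines of C on your Linux account will do the conversion. This is just a convenience sketch; the function name hex_to_ulong is made up.

```c
#include <stdlib.h>

/* Convert a hex string such as "0x1F4" or "1f4" to its numeric value.
 * strtoul with base 16 accepts an optional 0x prefix. */
unsigned long hex_to_ulong(const char *text)
{
    return strtoul(text, NULL, 16);
}
```

For example, hex_to_ulong("0x1F4") returns 500, which you can print in decimal with printf("%lu\n", value).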
Lab 2 Design Notes:
- Sometimes generating the SOPC, compiling the Quartus project, or building the C code in NIOS IDE will be really slow. Actually this is pretty much every time. It's a good idea to work the examples while you are going through the tutorials and video tutorials for this reason, since you will be waiting between compiles.
- If you want to include the printf libraries, make sure you use printf small (alt_putstr from alt_stdio.h), because the normal stdio library is way too big to store on the FPGA.
- The memory addresses for your devices are under a file called system.h. You then can use them in your code by defining eg #define LEDs (volatile char *) 0x00021010.
- There are essentially three different tools you will be using: SOPC, Quartus, and NIOS II IDE. If during the course of design you change something in SOPC (for example, you increased the amount of on-board memory), you must then recompile it in Quartus, and then make sure you update the memory address of your devices, eg #define LEDs (volatile char *) 0x00021010. The actual address comes from the system.h file under your syslib. Only after you do all these steps can you reprogram the FPGA with your new Quartus II project and run your software off your FPGA.
- Make sure you only have one programmer window open. If you open multiple ones, they won't program the board and will sit there and stay at 0%. Also make sure you aren't currently running your software on the FPGA while you try to reprogram the board through Quartus, otherwise Quartus will complain and say nios2-terminal is using it.
- If you get errors while building your software in the IDE relating to "global pointer" it's probably because you have too little memory, or your code is too big. The FPGA can't store above 50k to 60kish and if you include something like the stdio library (which is a whopping 64k) then you can't possibly fit the program on the FPGA to run. (Well you could if you stored the program in the flash chip instead of the onboard memory, but we are trying to simplify this for you).
- The IP is time limited so you might have to reprogram your FPGA if your IDE complains about not finding anything there.
- Use the volatile keyword in front of your variables if they are values from memory mapped I/O!
- If you are in debug mode, you can switch back to IDE mode in the upper right hand corner. This is useful if you want to reprogram your software.
- You should clean and rebuild your project every time you add a new system header or function.
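To tie together the notes on device addresses and the volatile keyword, here is a small sketch of reading and writing a memory mapped register through a volatile pointer. The helper names are made up, and 0x00021010 is only the example address from the notes above; take the real address from your own system.h.

```c
#include <stdint.h>

/* Example address only, copied from the notes above. Your LED address
 * comes from your own system.h and changes when you regenerate the SOPC. */
#define LEDS_ADDR_EXAMPLE 0x00021010u

/* volatile tells the compiler that every access must really happen,
 * so it cannot cache the value in a register or drop the write. */
void write_reg8(volatile uint8_t *reg, uint8_t value)
{
    *reg = value;
}

uint8_t read_reg8(volatile uint8_t *reg)
{
    return *reg;
}
```

On the board you would call something like write_reg8((volatile uint8_t *)LEDS_ADDR_EXAMPLE, 0x55) to light alternating LEDs. On a Linux machine you can test the same helpers by passing the address of an ordinary variable instead.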
Lab 2 Useful links:
A tutorial on matrix multiplication
A page on matrix multiplication.
Matrix multiplication calculator.
Lab 1 Materials:
- Lab #1 Materials - VHDL Template File, VHDL TestBench File
- Here is a more advanced TestBench file which will test your design for correctness and give you the number of cycles it took for your design to complete processing. Please use this TestBench to record the transcript file needed for the lab documentation - Advanced Test Bench File
- ALSO KEEP IN MIND THAT YOU DO NOT WANT ANY LATCHES, OR COMBINATIONAL LOOPS
- Also, you do not need to include all of the output files in your final report, just the relevant data requested.
Lab 1 Resources:
Altera Quartus II webpage - Main page for the Quartus II software. You can download the program from here (Quartus II Web Edition).
- Download ModelSim-Altera Starter Edition.
Quick tutorial on Quartus II - This tutorial is for the CSE140 class, but provides a quick introduction to the design environment in Quartus. You will be doing this in VHDL and not schematic entry of course.
Quartus II simulation guide - a good guide on running simulation in the native Quartus II simulator. Read through it to see how to properly implement input values for your simulation.
Quartus design lab tutorial - this tutorial is from Columbia Univ. and nicely summarizes some of the major steps you will do. Note - you will not utilize the DE2 board for lab 1.
Altera Quartus II tutorial - tutorial from Altera's website.
VHDL Reference from Synopsys - a decent but long reference to VHDL from Synopsys.
Lab 1 Instructions:
For this lab, you will use Quartus II and Modelsim-Altera to simulate your results. Install Quartus II first, then Modelsim. Make a new project and name it FSMAccumulator. Make sure your top level entity for this project is also FSMAccumulator. The top level entity declared here must match your desired top level entity that you have coded. You can change this in assignments->settings->general later if you have the incorrect name somehow.
Make a new VHDL file and copy and paste the code from the template file above into it. Save it as FSMAccumulator.vhd. You can now compile it (processing->start compilation) and it should pass, albeit with many warnings. Work on it from there by filling out the rest of the VHDL code.
You will need to simulate your accumulator twice, once using vector waveforms with Quartus II's native simulator, and once using a testbench with Modelsim. Under Assignments->Settings->EDA Tool Settings -> Simulation, you can choose which simulation tool to use. When you are simulating the output with waveforms, use "none" which will default you to the native Quartus simulator. For waveform simulations, follow the steps in the above linked tutorial. Specify some initial values to start your accumulator, and set the clock properly.
When you want to use your test benches, you must follow these steps: first, under tool name, select ModelSim-Altera. Next, under "NativeLink settings" click on Test Benches and make a new test bench. "Test bench name" is whatever you want to call it, e.g. "tester." Top level module needs to match the name of your test bench entity, e.g. "adder_testbench." Design instance name needs to match the name of the design under test in your test bench file, e.g. "uut" or "u_adder." Finally, under Test bench files, add the test bench VHDL file you have, e.g. "adder_testbench.vhd." To run your test bench, go to Tools -> Run EDA Simulation Tool -> Run RTL Simulation.
After compilation, you will also see a summary of the output, such as how many gates and flip flops are required to implement your design. Additionally, Quartus II comes with a netlist/RTL viewer that you can use to see what your VHDL implementation synthesizes down to. Go to tools -> netlist viewers to take a look.
VHDL Reference Links :