cse240a: Assignments


Homework Policy

Integrity Policy

Assignments

Assignment 1: Secret username

Due: by September 30, 2011

We would like to make all grading transparent and available as soon as possible. Because of this, we are requesting students to choose a secret username which will appear in grade sheets.

You will submit your secret username via forms of Google documents. Form can be found here Please do not make it obvious to identify you. In rare case of name conflicts, you will be notified to resubmit.

Deliverable

Your secret username submitted via the web form.

Due: by September 30, 2011

Assignment 2: Paper summaries

Due: 1 hour before every class meeting

This course is discussion-based, so it is essential that you read the papers and come to class prepared to discuss. The purpose of these summaries to ensure that you do so and provide some structure to how your approach the papers.

Reading research papers critically is an essential skill that you need to develop in graduate school. It will serve you will regardless of what kind of job you end up with. There have been several articles written about how to about this. As your first excercise (in this course, anyway) in reading and synthesizing several articles on a subject, take a look at these three: one, two, three.

You will submit your paper reviews via forms of Google documents. Form can be found here Please answer all questions thoroughly and thoughtfully.

You do not need to submit summaries for papers that you present as part of your presentation. This is the case even if you are presenting with a partner.

Deliverable

Your paper review submitted via the web form.

Due: 1 hour before every class meeting

Assignment 3: Class presentations

In place of the prefetcher project/contest you can prepare a presentation for one of the subjects we will be covering in class. There are several motivations for this:

  1. Preparing and presenting presentations is a very important skill. Ideas do not propogate themselves. If you want your ideas to have an impact, you must be able to convey them clearly to others.
  2. Preparing the presentation will give you in-depth exposure to one of the topics we cover. You should choose the topics you present to match your own interests
  3. By replacing the exams with presentations, you can decide where your work for this class occurs during the quarter. For instance, you can work around conference deadlines and exams in other classes.

In order to keep class on schedule, you will need to restrict your presentation to a total of 40 minutes. This includes some time taken for questions, so you should aim to prepare between 30 and 35 minutes of content. Staying within the time limit factors into the "presentation" portion of your grade for this assignment and it may impact other parts of it as well, if you do not have time to finish your presentation.

To help guide you in preparing your presentation to fit within the alotted time, please use the following breakdown as a guide in preparing your slides:

  1. Description of the paper's motivation (1-2 slides)
  2. Description of the papers' goals (1-2 slide)
  3. Discussion of their approach and results (5-10 slides)
  4. Discussion of related work (2-3 slides)
  5. Your thoughts on the paper (1-2 slides)
  6. Graphs and tables from the paper (as many slides as you need, but they are just "backup" in case someone has questions about them during class. You will not present these. You may, of course, also include these as needed in other sections as well, as needed.).

This will give a total of between 10 and 19 slides, in total, for both papers combined. That is not much space! This means that you will need to think carefully about what key points of the paper are. What are the main ideas they are trying to get across (i.e., the things you should present)? and what are the implementation details (i.e., which you should probably not present)? YOur ability to pick out the important parts is one of the things I'll be looking at when I review your slides.

Please keep in mind that following the above guidance for slide distribution does not ensure that you will present enough information or that you will fit within the time alotted. The only way to ensure that is to practice your presentation. I recommend very strongly that you do so.

I will also be looking at how well you provide background on your topic and incorporate information from the background reading I give you. Remember, your audience has already read the paper you are presenting. Your goal is to provide them with additional information about the paper, additional context for the paper, and to answer questions that they have. This means that you must become an expert on the paper you are presenting and the background material, even though you might not present all that material directly.

During your presentation, students will ask you questions. If you don't know the answer, you'll need to write it down a prepare an answer for the following class. Please limit the answers to at most one or two slides per question.

Put slide numbers on your slides!

Preparing this presentation is a big job and it's important that you do a good job so your presentation is useful to the other students. To help ensure that this is the case, the presentation is broken down into several parts spread out over several weeks. Making these deadline is part of your overall presentation grade.

A note about the deadlines. Once you have been assigned a day, it is your job to track the deadlines for the milestones that precede your presentation. I am not going to track you down asking to see your slides. If you do not get them to me by the deadline, your grade will reflect it.

Attendence during classes with a student presenter is extremely important (very nearly mandatory).

Important: The goal here is to provide you experience in becoming an expert on a topic. This is a difficult task, and this is a learning experience. If you get stuck looking for some key piece of information, come talk to me. I can probably point you in the right direction.

Part 1: Topic selection

Due: First week of class

Look at the course schedule and select a day/topic you would like to present. Assignments are first-come, first serve.

During class, I will let you know when you can send email (to the instructor and the TA) requesting a particular paper. Emails received before this time will not be accepted.

Deliverable

none.

Due: First week of class

Part 2: Meet with me to discuss your topic.

Due: Two weeks prior to your presentation

Make an appointment with me no more less than two weeks before your presenation day to discuss the topic. I will point you in the direction of extra readings, etc. that you should look at.

To ensure that we meet early enough, you should drop me a line about the meeting at least several days in advance.

Deliverable

none.

Due: Two weeks prior to your presentation

Part 3: Draft of your slides

Due: One week before your presentation

To prepare your slides, you should become an expert on the subject you are presenting. This means you should read papers in addition to the ones listed on the course web site and in addition to the ones I give to you during our meeting.

A goal to strive for is to be able to answer any question that someone might ask after reading the assigned papers. At the very least, this means you should have tried to find answers to all the questions that arose in your mind while reading those papers. You should also have good answers to most of the questions suggested in the guides to reading papers that you read at the beginning of the quarter.

Once you have become an expert on the subject, you need to condense all the information about the papers into about 45 minutes worth of slides. Typically, slides take about 1-2 minutes each. Use that value as a guide, but also practice presenting the slides to see if that guideline holds for you.

Spend time thinking about the slides and pay attention to the following issues:

Deliverable

Your slides in either PowerPoint (OpenOffice is fine), Keynote, or pdf format.

Due: One week before your presentation

Part 4: The Presentation

Due: The day of your presentation

You will present your presentation during the second half of class. As usual, we will collect a set of questions about the material that you will look into and present answers to during the next class meeting.

Deliverable

A copy of your final slides emailed to me.

Due: The day of your presentation

Part 5: Questions answered

Due: The class period following your presentation

Find answers to as many of the questions we raise in class as possible. This may mean reading some extra papers.

Deliverable

A copy of the slides covering the answers you present.

Due: The class period following your presentation

Assignment 4: Performance, Memory, ISA, VM, Pipelining, and Branch Prediction

Changelog

October 17 Updated equation in Question 5.
October 20 Additional questions posted (8-11)
October 22 Updated question 9 and 11.
October 31 Solution added

Problems

Due: October 26

You are strongly encouraged to work on this assignment in groups of two. If you do work in a group of 2, submit only 1 report (but make sure both names are on it). While you may have your partner do all the work, this will only hurt you when the midterm and final come around so don't do it.

  1. Anish, a graduate student, is designing a computer system that is going to process a set of 1000 30GB data sets from a near real-time climate simulation project that runs each day at the San Diego Supercomputing Center (SDSC). The job will start at 10pm and it must finish by 8am. Anish has two advisors who co-advise him: Helena and Rajesh. Helena explains that in designing the system to perform this computation, latency is critical. Rajesh, in a separate meeting, explains that Anish will need to pay close attention to system bandwidth. Explain why Helena and Roger are both right.
  2. For the following code, would a stream buffer or a victim buffer be more effective? Why?
    		      void foo(List *head) {
    		         List * cur = head;
    		         while(cur->next) {
    		            cur = cur->next;
    		         }
    		      }
    		      
  3. Assume my cache has 16 byte lines, 1024 lines, and is 2-way set associative. Integers are 4 bytes. Give the C code for a loop that has a very poor hit rate in this cache but whose hit rate raises to almost 100% if you double the number of lines to 2048.
  4. Consider the following code. Integers and pointers are both 4 bytes.
    		      struct List {
    		         List * next;
    		         int data;
    		      }
    
    		      void foo(List *head) {
    		         List * cur = head;
    		         while(cur->next) {
    		            cur = cur->next;
    		         }
    		      }
    		      
    For a given total cache size, what cache line size will provide the best performance for this code? (hint: Your answer should not depend on the number of lines or the associativity of the cache.)
  5. I like to evaluate computers using the following metric:
    ======= Awesomensess = (MB * CPUs * (MB/s of memory bandwidth)2)/(Watts * Dollars0.5)
    What aspects of a computer system do I value most highly? Which do I dislike the most?
    If system A costs twice as much as system B, how much additional memory bandwidth does A need to provide in order for A and B to be equally awesome?
  6. One goal of a single-address space operating system (SAOS) is to reduce the cost of sharing large in-memory objects between processes. In a conventional, multi-address space system sharing requires a copy from one address space to another. In a SAOS, no copy is required, but the system will need to do additional work to manage protection domains.

    If my application spends 45% of its time copying data to support sharing, what's the maximum possible speedup that the SAOS can provide for my application? and why?

  7. In the text book you read about the benefits of a virtually indexed, physically tagged cache. The SAOS papers adds a PLB to the system as well.

    In order to minimize the cost of PLB misses, you would like to have a very, very large PLB. The access to such a PLB will be slower than the L1 access time. How can you add this large PLB to the system without impacting L1 access time while still ensuring that the currently running process cannot access any data it does not have permissions for?

    Draw a block diagram including the processor, the TLB, the PLB, and the L1 cache to illustrate your solution. Including any other changes required to the system to make it work.

  8. Why is branch prediction accuracy so much more important in deeply pipelined processors?
  9. The processor design that you are working on currently has 97% instruction cache hit ratio, 90% data cache hit ratio, and 80% branch prediction accuracy. Given a workload of 1 branch in every 5 instruction and 1 memory access in every 3 instructions, which portion of the hardware should you focus to improve: instruction cache's hit ratio, data cache's hit ratio, or branch predictor's accuracy? Say that the penalty for L1 cache misses and branch mispredictions are both 5 cycles.
  10. J. E. Smith's study on branch prediction introduces saturating counter based branch prediction. Experimental results show that 2-bit counters outperform 3-bit counters due to program inertia. Provide a code sample for which a 3-bit counter performs better and another for which a 2-bit counter is better.
  11. For each labeled branch in the code below (except the backward branch on the outer loop) which of the following branch predictors will perform best.
    		for (j = 0; j < 100; j++) {
    			for (i = 0; i < 100;i++) {
    
    		                // rand3() returns 1,2, or 3 with equal probabilty
    				if(rand3() <= 1) { // A
    				}
    
    				if(i % 3 == 0) {// B
    				}
    
    				if (i % 2 == 0) {// C
    				}
    
    				if(i % 6 ==0) {// D
    				}
    
    				if (i < 4) { // E
    				}
    
    			}  // F
    		}
    			

Deliverable

Place your typed (or well written) solutions to the problems in Bryan's campus mailbox (room 2237 of the CSE building) or hand-in to Bryan before 4pm.

Due: October 26

Assignment 5: Multi-threading, CMPs, Heterogeneity, and Storage (AKA: compulsory studying for the final)

Problems

Due: December 2

You are strongly encouraged to work on this assignment in groups of two. If you do work in a group of 2, submit only 1 report (but make sure both names are on it). While you may have your partner do all the work, this will only hurt you when the midterm and final come around so don't do it.

  1. All the threads in an SMT processor share a single L1 data cache and a single L1 instruction cache. How could this potential hurt performance of the threads running on the processor? How could it help? Suggest a way that the architecture could limit the damage and maximize the gain. Would you expect the benefits to larger or smaller for the instruction cache or the data cache? Why?
  2. Consider the following code, and assume that integers are 8 bytes and that ThreadA() and ThreadB() are run on different processors. You notice that when ThreadA() runs alone, it can update A 1billion times per second. When ThreadB() runs alone, it increments B 1Billion times per second. However, when they simultaneously on different processors, they each only manage to increment their variable 100 million times per second. Why? How would you fix it?
    	int A;
    	int B;
    	
    	void ThreadA() {while (1) {A++;}}
    	void ThreadB() {while (1) {B++;}}
    	
  3. Imagine a web serving data center that serves billions of web requests per day. What kind of processor architecture would be the best fit for such an application? why? (This question is intentionally opened ended. State your assumptions and your reasoning.)
  4. Both TXflash and FlashStore provide hardware support for particular applications, much as cryptomaniac and c-cores do. How is the motivation for TXFlash and FlashStore different than the motivation for those other projects? How is it the same? Do you think that TXFlash or FlashStore has a better or worse chance of being widely adopted than c-cores? Why?
  5. The single-ISA heterogeneous multiprocessor paper and the conservation cores paper take very different approaches to specialization. Give 3 advantages of each relative to the other.

Deliverable

Place your typed (or well written) solutions to the problems in Bryan's campus mailbox (room 2237 of the CSE building) or hand-in to Bryan before 4pm.

Due: December 2