cse.240b Parallel Computer Architecture - Spring 2014


Course Goals

This class is designed to enable students to follow the latest developments in computer architecture, especially those related to parallel computer architecture. Although this is clearly useful for those who wish to do research in computer architecture, it is also useful for those who work in related areas or who have general interests. The class strives for these goals through four aspects:
  1. Covering a broad swath of material related to parallel computer architecture that is not covered in core-level grad classes.

  2. Providing students with the opportunity to
    { find, analyze, communicate, discuss } advanced material.

  3. Examining fundamental ideas that are at the "frontier" of the field and have not yet made it into industry.

  4. Preparing students to lead discussion of topics and to filter/assimilate information quickly.

Prof. Michael Taylor


April 2: The course forum is up! Make sure to sign up in order to receive important course details. Click here to join. You must give your name as your nickname and enter your UCSD email address in the information box. It may take a day or two for you to be confirmed.
April 2: Yes, one analysis per paper! No analysis for textbook items UNLESS specified below.

Course Materials

The class will consist of readings generally found in the following locations:
  1. IEEE Xplore (free access from UCSD network)
    For IEEE publications.

  2. ACM Portal (free access from UCSD network)
    For ACM publications.

  3. Computer Architecture: A Quantitative Approach, Hennessy & Patterson.
    Hopefully, you already have this.

  4. The Synthesis Lectures on Computer Architecture. (Free, when accessed from UCSD campus.)

    I will not post links to the articles, because I want locating these papers to become second nature to you.

    If you do not have access to the UCSD network because you are an open enrollment student, then make a friend in the class to help.


I expect a high level of work quality and independence in this class, since it is an advanced graduate class. Participation in in-class discussions is an integral part of the class, and comprises a significant component of the class participation grade.

In true computer architecture form, your Spec240B number (also known as your grade!) includes the multiplication function:
Final Grade = Proof of Reading * (sum of the components below)

Class Participation          20%
Mini Research Exam (Oral)    25%
Assignment(s)                15%
Final Paper (Paper)          20%
Final Exam                   20%
(subject to change as class unfolds)
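
As a rough illustration of the multiplicative formula above, here is a minimal Python sketch. It assumes, hypothetically, that Proof of Reading is a fraction between 0 and 1 and that every component is scored on a 0-100 scale; the syllabus states neither scale, and the names `WEIGHTS`, `final_grade`, and `example_scores` are invented for this example.

```python
# Component weights taken from the grading table (they sum to 1.0).
WEIGHTS = {
    "class_participation": 0.20,
    "mini_research_exam": 0.25,
    "assignments": 0.15,
    "final_paper": 0.20,
    "final_exam": 0.20,
}


def final_grade(proof_of_reading, scores):
    """Return proof_of_reading multiplied by the weighted sum of scores.

    Assumption (not stated in the syllabus): proof_of_reading is a
    fraction in [0, 1] and each score is on a 0-100 scale.
    """
    if not 0.0 <= proof_of_reading <= 1.0:
        raise ValueError("proof_of_reading must be between 0 and 1")
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores: {sorted(missing)}")
    weighted_sum = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return proof_of_reading * weighted_sum


# Hypothetical student with 90 on every component.
example_scores = {name: 90.0 for name in WEIGHTS}
print(final_grade(1.0, example_scores))  # all reading responses submitted
print(final_grade(0.5, example_scores))  # half of the reading responses submitted
```

Under these assumptions the reading factor scales every other component, so a low Proof of Reading value cannot be offset by strong exam or paper scores.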

Proof of Reading

Since much of the class will consist of discussions, it is absolutely essential that you do the reading BEFORE class. There is no greater waste of everybody's time than a discussion class where nobody has read the paper. To help keep the quality of class high, we will collect responses to a set of questions for each paper via two links.

Final Paper

Each student will pick one of the topics that they chose for their oral exam. They will write a research exam, in the style of the CSE department, on a sub-topic of this paper. This topic should be a researchy topic with recent work in the area; i.e., there should be some recent papers in the last three years that are "opener" papers rather than "closer" papers.

The research exam will have the attributes of a "creative survey". A study list will be defined by the student and the research exam committee. The student is expected to survey the area, including recent developments, identify key themes, and observe open/future directions.
I expect that you will cite at least 20-30 relevant papers for this paper, and in meaningful ways.

The paper will be due via email (.pdf only) by the time of the UCSD scheduled final.


The mini research exam and the final exam will test different things. The mini research exams will test your ability to really understand a subject in depth. The final exam will cover the breadth of your knowledge of the material in the class. This material will include both the reading and topics that come up in class discussion. Some of these topics will almost certainly not be in the reading. Generally speaking, the final exam will be fairly easy if you have done the reading carefully, participated in the in-class discussion, and jotted down a few keywords to remind yourself what to study (e.g., via an internet source) later.

Mini Research Exams

Each student's "mini research exam" consists of presenting (possibly in conjunction with other students) the material that was assigned for a particular day.

This will mirror, to some degree, the research exam that PhD students have to do after their second year in the PhD program. Each day, two students take responsibility for being the "class experts" for a particular set of material that we read. They will give a 2x40 minute presentation on the material (roughly 40 minutes each). This presentation will motivate the problem the reading is trying to solve, and present the key ideas, using the IMD (ideas, mechanisms, dinosaurs) framework, and propose future directions or questions. Be sure to go through the key architecture mechanisms proposed in the reading in detail.
email: mbtaylor at you see ess dee dot ee dee you
web:   Michael Taylor's Website.

NOTE: Schedule is highly subject to change.
Wed, April 02
Overview, Administrivia
(Epoch 2)
Wed, April 02
Wed, April 09
Tech Trends
A Landscape of the New Dark Silicon Regime, Taylor, IEEE Micro 2013.

Asanovic et al., "The Landscape of Parallel Computing Research: A View from Berkeley", Tech Report UCB/EECS-2006-183.

Presenter: MBT
(Epoch 2)
Wed, April 09
Case Study: Raw, a Simple Parallel Machine
The Raw Microprocessor: A Computational Fabric for Software Circuits and General Purpose Programs,
Taylor et al., IEEE Micro March/April 2002.

The Raw Specification, v 5.0.

Presenter: MBT
Wed, April 16
Cache Coherence I
Read "A Primer on Memory Consistency and Cache Coherence", Chapters 1-4.

Presenter: TBA
(Epoch 2)
Wed, April 16
Cache Coherence, Part II
Read "A Primer on Memory Consistency and Cache Coherence", Chapters 6-8.

Presenter: TBA
Wed, April 23
No class
Work on your paper.
(Epoch 2)
Wed, April 23
No class
Work on your paper.
Wed, April 30
Data Parallel
H & P Chapter 4 (Data-Level Parallelism in Vector, SIMD and GPU).

Presenter: TBA
(Epoch 2)
Wed, April 30
GPUs/CUDA & Tera
NVIDIA Tesla: A Unified Graphics and Computing Architecture, Lindholm et al., IEEE Micro, Mar 2008.

Scalable Parallel Programming with CUDA. Nickolls et al. ACM Queue. 2008.

TBD: AMD architecture reference and comparison
TBD: Something on Fermi

Presenter: TBA
Wed, May 07
Data Center
Read H&P Chapter 6.

Presenter: TBA
(Epoch 2)
Wed, May 07
Data Center II
Read Synthesis Lecture: "The Datacenter as a Computer", Second Ed.

Presenter: TBA
Wed, May 14
Heterogeneity
Conservation Cores, ASPLOS 2010, Venkatesh.

The GreenDroid Mobile Application Processor: An Architecture for Silicon's Dark Future, IEEE Micro 2011, Goulding-Hotta.

Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction, MICRO 2003, Tullsen.

Presenter: TBA
(Epoch 2)
Wed, May 14
Heterogeneity II
Understanding Sources of Inefficiency in General-Purpose Chips, Horowitz, ISCA 2010.

The Convolution Engine, Horowitz, ISCA 2013.

DySER: Unifying Functionality and Parallelism Specialization for Energy-Efficient Computing, IEEE Micro 2012.

Presenter: TBA
Wed, May 21
Parallelizing Compilers
Read Synthesis Lecture on "Automatic Parallelization: An Overview of Fundamental Compiler Techniques", Chapters 1-3.

Presenter: TBA
(Epoch 2)
Wed, May 21
Parallelizing Compilers II
(Chapters 4-7)

Presenter: TBA
Wed, May 28
On-Chip Networks
Reading: Hennessy and Patterson, Appendix F
Springer chapter; access from campus

Presenter: TBA
(Epoch 2)
Wed, May 28
FPGAs
Xilinx Series 7
Virtex-7 User Guide
Hotchips Presentation
Virtex-7 Memory
Virtex-7 memory

Presenter: TBA
Wed, June 04
No Class
Work on your paper.
(Epoch 2)
Wed, June 04
No Class
Work on your paper.