cse240a: Graduate Computer Architecture

Pepper Canyon Hall 122
Lectures Tuesday & Thursday, 8:00am-9:20am (Pepper Canyon Hall 122)
Spring, 2014


Steven Swanson
Email: swanson @ cs.ucsd.edu
IM (not email): professorswanson@{AIM, Yahoo!, google talk, MS Messenger}
Office: EBU3B 3212
Office Hours: Wed 12-1; Thurs 3-4; By appointment
UCSD homepage

Teaching Assistants

Sriskanda Shamasunder
Email: sshamasu @ cs.ucsd.edu
Office: EBU3B B275A
Office Hours: Wed 1pm-3pm

Usha Subburaj
Email: usubbura @ cs.ucsd.edu
Office: CSE B250A
Office Hours: Mon 4:30-6:30

Course discussion board: Piazza. Required reading. Get signed up.

Course Description

This course will describe the basics of modern processor operation. Topics include computer system performance, power issues, memory, multiprocessors, pipelining, instruction-level parallelism, storage systems, GPUs, and virtual machines.

Text books

Required: Patterson & Hennessy, Computer Architecture: A Quantitative Approach, 5th Edition, Morgan Kaufmann. I refer to it as "P+H" below.
Required: Synthesis Lectures on Computer Architecture. You'll need to be on campus to access these, but they are free. I refer to them as SLOCAs below.
Required: Other assigned readings throughout the quarter.
Optional: The History of Computing. This is a great set of lectures from a course taught at UCSD/UW/Berkeley several years ago. Most of them are by the folks who actually made the history (Steve Wozniak, Ray Ozzie, Gordon Bell, etc.).


Class participation 15%
Assignments 20%
Prefetching contest 15%
Midterm 25% The midterm is on 6th May, 2014.
Final 25% The final will be cumulative.

Additional notes about grades in this course:


I will post the slides for most lectures.

Readings should be done before class on the day they are listed. It is essential that you do the readings.

Date Topic Readings Slides Due Notes
Tuesday, April 1 Introduction and Administrivia 00_Introduction_plus_logistics.pdf
Thursday, April 3 Silicon Scaling Appendix A (if your architecture is rusty); P+H 1.1-1.12
(this is the original paper about Moore's Law): Cramming More Components Onto Integrated Circuits, G.E. Moore, Proceedings of the IEEE 86(1):82-85, Jan 1998 link.
Tuesday, April 8 Measuring and Thinking About Performance; Power SLOCA: "COMPUTER ARCHITECTURE TECHNIQUES FOR POWER-EFFICIENCY" 1-1.4; 2-2.2.2, 3-3.2.1, 4-4.2.1, 5.0-5.1.1;
Conservation cores: reducing the energy of mature computations, Ganesh Venkatesh, Jack Sampson, Nathan Goulding, Saturnino Garcia, Vladyslav Bryksin, Jose Lugo-Martinez, Steven Swanson, and Michael Bedford Taylor, Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems, New York, NY, USA, 2010, pages 205-218
Thursday, April 10 More on power Same as last time 02_Amdahls_Law.pdf
Tuesday, April 15 Instruction Sets Review Appendix A, if you need a refresher on ISAs.
The case for the reduced instruction set computer, David A. Patterson and David R. Ditzel, SIGARCH Comput. Archit. News 8(6):25-33, 1980.
CryptoManiac: a fast flexible architecture for secure communication, Lisa Wu, Chris Weaver, and Todd Austin, ISCA '01: Proceedings of the 28th annual international symposium on Computer architecture, New York, NY, USA, 2001, pages 110-119 link.
Thursday, April 17 Memory Hierarchy P+H: B.1-B.3, 2.1-2.2
Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers, Norman P. Jouppi, SIGARCH Comput. Archit. News 18(3a):364-373, 1990.
Retrospective: improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers, Norman P. Jouppi, ISCA '98: 25 years of the international symposia on Computer architecture (selected papers), New York, NY, USA, 1998, pages 71-73.
Tuesday, April 22 Processor Pipelines and Out-of-Order Execution Appendix C: pages C-1 to C-34, C-51 to C-58. I strongly recommend reviewing this if you are not familiar with pipelining.
Chapters 3.1, 3.4-3.6, 3.8
An efficient algorithm for exploiting multiple arithmetic units, R.M. Tomasulo, IBM J. Res. Dev. 11(1):25-33, 1967.
Thursday, April 24 No class TBA
Tuesday, April 29 Processor Pipelines and Out-of-Order Execution Same as last time 04_Pipelining_annotated.pdf
Thursday, May 1 Midterm Review
Tuesday, May 6 Midterm Exam 8:00am - 9:20am
Thursday, May 8 Chip Multiprocessors P+H Chapters 5.1-5.4
SLOCA: Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency 1.1-1.4
Tuesday, May 13 Chip Multiprocessors Same as last time
Thursday, May 15 Multiprocessor memories P+H Chapters 5.5-5.10
SLOCA: A Primer on Memory Consistency and Cache Coherence. Chapters 1-3.7; 5.1-5.2.2
Tuesday, May 20 Multithreading SLOCA: Multithreading Architecture Chapters 1-5
Thursday, May 22 Branch Prediction Appendix C: pages C-21 to C-30 (if you didn't read it earlier)
Combining Branch Predictors, Scott McFarling, Tech. Rep. TN-36m, Digital Western Research Laboratory, June 1993
Tuesday, May 27 Storage TBD
00_Academic Honesty.pdf
Thursday, May 29 Storage + potpourri TBD 20_Storage_2.pdf
Tuesday, June 3 GPU P+H 4.1, 4.4-4.7
18_GPUs.pdf Project 1
Thursday, June 5 Final Review
Thursday, June 12 Final Exam 8:00am - 10:59am

Integrity Policy


Assignment 1: In the words of Jeremy Clarkson - "POWERRR!"
Assignment 2: Cache me if you can
Assignment 3: To branch or not to branch


Project 1: Prefetching competition