Instructor: Steven Swanson
Office: EBU3B 3212
Office Hours: Monday 2-3; Wed 10:30-11:30; by appointment
UCSD homepage
Teaching Assistant: Bryan S. Kim
Office: EBU3B B240A
Office Hours: Tuesday 1pm-2pm; Thursday 3pm-4pm
UCSD homepage
Course discussion board: Google Groups. Required reading. Get signed up.
This course will describe the basics of modern processor operation. Topics include computer system performance, instruction set architectures, pipelining, branch prediction, memory-hierarchy design, and a brief introduction to multiprocessor architecture issues.
Reading assignments/paper summaries | 40% |
Prefetching contest/in class presentation | 10% |
Homework | 10% |
Midterm | 20% |
Final | 20% |
Additional notes about grades in this course:
Calculating grades: I compute grades using an Excel spreadsheet. In the interests of transparency, the current grade sheet (with identifying information removed) is available in XLS format. The grade sheet contains all the information about curves and how the grades are computed. It is somewhat sophisticated; if you find bugs, please bring them to my attention. Please note that some versions of OpenOffice do not perform the calculations properly and will give incorrect results.
The grading system is based on a 13 point (F through A+) scale. For each assignment/test/etc., the sheet computes the letter grade (rounding up, when needed) according to a curve for each assignment (specified at the bottom of each assignment's column). Your final grade is the weighted average of these grades.
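The calculation described above can be sketched roughly as follows. This is only an illustration, not the actual grade sheet: the numeric values 0 through 12 for the letters, the names `letter_points` and `weighted_grade`, the short category keys, and the shape of the curve (a list of 13 minimum scores) are all assumptions. The weights come from the grading table above. The sheet's rounding rule and the mapping from the final average back to a letter are not modeled.

```python
# 13-point scale, F through A+. The letters come from the text; the
# numeric values 0..12 are an assumption made for this sketch.
LETTERS = ["F", "D-", "D", "D+", "C-", "C", "C+",
           "B-", "B", "B+", "A-", "A", "A+"]

# Weights from the grading table; the short keys are hypothetical names.
WEIGHTS = {
    "reading": 0.40,
    "prefetching": 0.10,
    "homework": 0.10,
    "midterm": 0.20,
    "final": 0.20,
}


def letter_points(score, cutoffs):
    """Return the index (0..12) of the highest letter whose cutoff is met.

    cutoffs is a hypothetical curve: 13 minimum scores in ascending
    order, one per entry in LETTERS.
    """
    points = 0
    for i, cutoff in enumerate(cutoffs):
        if score >= cutoff:
            points = i
    return points


def weighted_grade(scores, curves):
    """Weighted average of the per-category letter-grade points."""
    return sum(
        WEIGHTS[name] * letter_points(scores[name], curves[name])
        for name in WEIGHTS
    )
```

To use it, pass one raw score and one curve per category; the result is a number on the 0 to 12 scale that can be compared against the `LETTERS` list.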
We do our best to record grades accurately, but you should double-check.
Errors in grading: If you feel there has been an error in how an assignment or test was graded, you have one week from when the assignment is returned to bring it to our attention. You must submit (via email to the instructor and the appropriate TAs) a written description of the problem.
For arithmetic errors (adding up points etc.) you do not need to submit anything in writing, but the one week limit still applies.
Final grades: If you have a problem with your
I will post the slides for most lectures. Since the slides contain material I am not allowed to distribute publicly, they are password protected. I have posted the username and password to the web board.
Readings should be done before class on the day they are listed. It is essential that you do the readings. I will not cover everything you are responsible for in class.
Date | Topic | Readings | Slides | Due | Notes |
---|---|---|---|---|---|
Monday, September 26 | Introduction and Administrivia | | 00_Intro.pdf | | |
Wednesday, September 28 | A Brief History of Architecture; CMOS/Technology scaling. | Optional (this is the original paper about Moore's Law): Cramming More Components Onto Integrated Circuits, Proceedings of the IEEE 86(1):82-85, Jan 1998 | 01_Technology-1.pdf | | |
Monday, October 3 | Topic: The Variability Expeditions: Exploring the Software Stack for Underdesigned Computing Machines (For Google form: Paper 1) | | | | |
Monday, October 3 | Performance measurement; Introduction to Caching | | 03_performance.pdf, 02_Technology-2.pdf | | |
Wednesday, October 5 | Student presentation: Advanced Caching (presented by Paul Wicks) | Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers, SIGARCH Comput. Archit. News 18(3a):364-373, 1990. (For Google form: Paper 2) Retrospective: improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers, ISCA '98: 25 years of the international symposia on Computer architecture (selected papers), New York, NY, USA, 1998, pages 71-73. Trace cache: a low latency approach to high bandwidth instruction fetching, MICRO 29: Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture, Washington, DC, USA, 1996, pages 24-35. (For Google form: Paper 3) | 04_Cache_intro.pdf, 04_Jouppi.pdf | | |
Monday, October 10 | Student presentation: Virtual memory and protection (presented by Utpal Kumar and Mohammad Moghimi) | Mondrian memory protection, ASPLOS-X: Proceedings of the 10th international conference on Architectural support for programming languages and operating systems, New York, NY, USA, 2002, pages 304-316. (For Google form: Paper 4) Architecture support for single address space operating systems, SIGPLAN Not. 27(9):175-186, 1992. (For Google form: Paper 5) Optional: Sharing and protection in a single-address-space operating system, ACM Trans. Comput. Syst. 12(4):271-307, 1994. | 05_VirtualMemory.pdf, 06_SAOS.pdf, 06_SAOS_Questions.pdf | | |
Wednesday, October 12 | Student presentation: ISA design (presented by Tarun Arora and Phi Hung Nguyen) | The case for the reduced instruction set computer, SIGARCH Comput. Archit. News 8(6):25-33, 1980. (For Google form: Paper 6) CryptoManiac: a fast flexible architecture for secure communication, ISCA '01: Proceedings of the 28th annual international symposium on computer architecture, Goteburg, Sweden, 2001, pages 110-119. (For Google form: Paper 7) Optional: Architectural support for fast symmetric-key cryptography, Proceedings of the ninth international conference on Architectural support for programming languages and operating systems, New York, NY, USA, 2000, pages 178-189. Optional: A VLSI RISC, Computer 15(9):8-21, Sep 1982. Optional: Very Long Instruction Word architectures and the ELI-512, ISCA '83: Proceedings of the 10th annual international symposium on Computer architecture, New York, NY, USA, 1983, pages 140-150. | | | |
Monday, October 17 | Slack | TBA | 07_MMP-Nooks.pdf, 07_ISAs.pdf, 07_MIPSExtensions.pdf | | |
Wednesday, October 19 | TBA | ||||
Monday, October 24 | Topic: Engineering Storage for the Data Age (For Google form: Paper 8) | | | | |
Monday, October 24 | Pipelining; Student presentation: Branch Prediction (presented by Eugene Kolinko and Joon Lee) | A study of branch prediction strategies, ISCA '81: Proceedings of the 8th annual symposium on Computer Architecture, Los Alamitos, CA, USA, 1981, pages 135-148. (For Google form: Paper 10) Retrospective: a study of branch prediction strategies, ISCA '98: 25 years of the international symposia on Computer architecture (selected papers), New York, NY, USA, 1998, pages 22-23. An analysis of correlation and predictability: what makes two-level branch predictors work, ISCA '98: Proceedings of the 25th annual international symposium on Computer architecture, Washington, DC, USA, 1998, pages 52-61. (For Google form: Paper 9) | 08_Pipelining.pdf, 08_Hazards.pdf, 08_branchprediction.pdf | | |
Wednesday, October 26 | Student presentation: OOO execution (presented by Manoj Mardithaya and Pooja Saraff) | An efficient algorithm for exploiting multiple arithmetic units, IBM J. Res. Dev. 11(1):25-33, 1967. (For Google form: Paper 11) HPSm, a high performance restricted data flow architecture having minimal functionality, ISCA '86: Proceedings of the 13th annual international symposium on Computer architecture, Los Alamitos, CA, USA, 1986, pages 297-306. (For Google form: Paper 12) Retrospective: HPSm, a high performance restricted data flow architecture having minimal functionality, ISCA '98: 25 years of the international symposia on Computer architecture (selected papers), New York, NY, USA, 1998, pages 43-44. Optional: HPS, a new microarchitecture: rationale and introduction, SIGMICRO Newsl. 16(4):103-108, 1985. Optional: A design space evaluation of grid processor architectures, MICRO 34: Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture, Washington, DC, USA, 2001, pages 40-51. Optional: Excerpts from Design of a Computer: the Control Data 6600. Optional: Parallel Operation in the Control Data 6600 | 09_PrefetcherContest.pdf, 09_SuperScalarSMT.pdf, 09_OutOfOrderExecution.pdf | Assignment 4 | |
Monday, October 31 | Student presentation: Dataflow (presented by Sreeparna Mukherjee and Margaret S. Urfer) | Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture, Proceedings of the 30th annual international symposium on Computer architecture, New York, NY, USA, 2003, pages 422-433. (For Google form: Paper 14) Optional: Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture, Micro, IEEE 23(6):46-51, Nov.-Dec. 2003. The wavescalar architecture, ACM Trans. Comput. Syst. 25(2):1-54, 2007. | 10_WaveScalar.pdf, 10_TRIPS.pdf, 10_MidtermReview.pdf | | |
Wednesday, November 2 | TBA | ||||
Monday, November 7 | TBA | ||||
Wednesday, November 9 | Student presentation: Multithreading (presented by James Lue) | Exploiting choice: instruction fetch and issue on an implementable simultaneous multithreading processor, ISCA '96: Proceedings of the 23rd annual international symposium on Computer architecture, New York, NY, USA, 1996, pages 191-202. (For Google form: Paper 16) Optional: Simultaneous multithreading: maximizing on-chip parallelism, ISCA '95: Proceedings of the 22nd annual international symposium on Computer architecture, New York, NY, USA, 1995, pages 392-403. Optional: Speculative Versioning Cache, Parallel and Distributed Systems, IEEE Transactions on, 12(12):1305-1317, Dec 2001 | 11_MultiScalar.pdf, 11_SMT.pdf | | |
Monday, November 14 | CMPs, coherence, and consistency | | 15_CoherenceAndConsistency.pdf, 2011-Fall-CSE-240A-Midterm-v2-key.pdf | | |
Wednesday, November 16 | Student presentation: CMPs (presented by German Alfaro and Linda Pescatore) | Niagara: A 32-Way Multithreaded Sparc Processor, IEEE Micro 25(2):21-29, 2005. (For Google form: Paper 18) Optional: Sun's slides about the UltraSPARC T2 (aka Niagara 2, aka Victoria Falls). Optional: Piranha: a scalable architecture based on single-chip multiprocessing, ISCA '00: Proceedings of the 27th annual international symposium on Computer architecture, New York, NY, USA, 2000, pages 282-293 | 16_CaseForCMP.pdf, 16_NIAGARA.pdf | | |
Monday, November 21 | Topic: GreenDroid: An Architecture for the Dark Silicon Era (For Google form: Paper 19) | | | | |
Monday, November 21 | Student presentation: Heterogeneity (presented by Erh-Li Shen and Ching-Yao Liu) | Conservation cores: reducing the energy of mature computations, Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems, New York, NY, USA, 2010, pages 205-218 (For Google form: Paper 21) | 17_CCores.pdf, 17_Heterogeneous.pdf | | |
Wednesday, November 23 | | | 18-FlashOverview.pdf | | |
Monday, November 28 | Student presentation: Storage (presented by Jyoti Wadhwani and Sudharsan Seshadri) | Transactional flash, Proceedings of the 8th USENIX conference on Operating systems design and implementation, Berkeley, CA, USA, 2008, pages 147-160 (For Google form: Paper 22) FlashStore: high throughput persistent key-value store, Proc. VLDB Endow. 3:1414-1425, September 2010 (For Google form: Paper 23) | 19_prefetcher_results.pdf, 19_FlashStore.pdf, 19_TXFlash.pdf | | |
Wednesday, November 30 | Final Review | TBA | cse240a_sample.pdf | | |
Thursday, December 8 | TBA | | fa11_cse240a_final.doc, fa11_cse240a_final.pdf | | |