Instructor: Steven Swanson
Office: EBU3B 3212. Office Hours: TBA. UCSD homepage
Teaching Assistant: Hung-Wei Tseng
Office: EBU3B B260A. Office Hours: Tuesday 11:00a-12:00p, Wednesday 4:00p-5:00p, or by appointment. UCSD homepage
Course discussion board: WebCT. Required reading. Get signed up.
This course will describe the basics of modern processor operation. Topics include computer system performance, instruction set architectures, pipelining, branch prediction, memory-hierarchy design, and a brief introduction to multiprocessor architecture issues.
3-4 homeworks, paper summaries | 20% | |
Prefetching contest/in class presentation | 15% | More details later. |
Two midterms | 35% | The midterms are on Jan. 28th and Feb. 18th. |
Final | 30% | The final will be cumulative. |
Additional notes about grades in this course:
Calculating grades: I compute grades using an Excel spreadsheet. In the interest of transparency, the current grade sheet (with identifying information removed) is available in either XLS or PDF format. The grade sheet contains all the information about curves and how the grades are computed. It is somewhat sophisticated, so if you find bugs, please bring them to my attention. Please note that some versions of OpenOffice do not perform the calculations properly and will give incorrect results.
The grading system is based on a 13-point (F through A+) scale. For each assignment, test, etc., the sheet computes the letter grade (rounding up, when needed) according to a curve for that assignment (specified at the bottom of each assignment's column). Your final grade is the weighted average of these grades.
We do our best to record grades accurately, but you should double-check.
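As a rough illustration of the weighted-average step, here is a minimal Python sketch. It is not the actual grade sheet. The intermediate letter steps in `LETTERS`, the category names in `WEIGHTS`, the treatment of the two midterms as a single category, and rounding the final average up are all assumptions made for the example. The per-assignment curves are not modeled at all.

```python
import math

# Assumed 13-step scale from F (0 points) to A+ (12 points). The syllabus
# gives only the endpoints, so the intermediate steps are a guess.
LETTERS = ["F", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]

# Category weights taken from the grading breakdown above; the key names
# are hypothetical, and the two midterms are lumped into one category.
WEIGHTS = {"homework": 0.20, "contest": 0.15, "midterms": 0.35, "final": 0.30}


def weighted_points(letters):
    """Weighted average on the 0-12 point scale, given one letter per category."""
    return sum(WEIGHTS[cat] * LETTERS.index(letter) for cat, letter in letters.items())


def final_letter(letters):
    """Map the weighted average back to a letter (rounding up is an assumption)."""
    points = weighted_points(letters)
    # Subtract a tiny epsilon so floating-point noise does not bump an exact grade.
    return LETTERS[min(len(LETTERS) - 1, math.ceil(points - 1e-9))]


example = {"homework": "A", "contest": "B+", "midterms": "B", "final": "A-"}
print(round(weighted_points(example), 2), final_letter(example))
```

The real sheet remains the authority on curves and rounding. The sketch only shows why a weak grade in a heavily weighted category moves the final grade more than one in a lightly weighted category.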
Errors in grading: If you feel there has been an error in how an assignment or test was graded, you have one week from when the assignment is returned to bring it to our attention. You must submit (via email to the instructor and the appropriate TAs) a written description of the problem.
For arithmetic errors (adding up points etc.) you do not need to submit anything in writing, but the one week limit still applies.
Final grades: If you have a problem with your
I will post the slides for most lectures. Since the slides contain material I am not allowed to distribute publicly, they are password protected. I have posted the username and password on the web board.
Readings should be done before class on the day they are listed. It is essential that you do the readings. I will not cover everything you are responsible for in class.
Date | Topic | Readings | Slides | Due | Notes |
---|---|---|---|---|---|
Tuesday, January 5 | Introduction and Administrivia | | 00_Intro.pdf | | |
Thursday, January 7 | A Brief History of Architecture; CMOS/Technology scaling. | Optional (this is the original paper about Moore's Law): Cramming More Components Onto Integrated Circuits, Proceedings of the IEEE 86(1):82-85, Jan 1998. | 01_technology.pdf | | |
Tuesday, January 12 | Performance measurement; Introduction to Caching | 5.1-5.3 | 02_Technology.pdf, 03_performance.pdf, 04_Cache_intro.pdf | | |
Thursday, January 14 | Student presentation: Advanced Caching | Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers, SIGARCH Comput. Archit. News 18(3a):364-373, 1990. Retrospective: improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers, Norman P. Jouppi, ISCA '98: 25 years of the international symposia on Computer architecture (selected papers), New York, NY, USA, 1998, pages 71-73. Trace cache: a low latency approach to high bandwidth instruction fetching, MICRO 29: Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture, Washington, DC, USA, 1996, pages 24-35. | 05_Jouppi.pdf, 07_TraceCache.pdf | | |
Tuesday, January 19 | Virtual memory; Memory hierarchies | | 06_VirtualMemory.pdf | Assignment 3-1 | |
Thursday, January 21 | Student presentation (by Paul Loriaux) -- Variations and VM. | Architecture support for single address space operating systems, SIGPLAN Not. 27(9):175-186, 1992. Optional: Sharing and protection in a single-address-space operating system, ACM Trans. Comput. Syst. 12(4):271-307, 1994. | loriaux_VM.pdf, Z1_Midterm1Preview.pdf | | |
Tuesday, January 26 | ISA Design | The case for the reduced instruction set computer, SIGARCH Comput. Archit. News 8(6):25-33, 1980. Optional: A VLSI RISC, Computer 15(9):8-21, Sep 1982. Optional: Very Long Instruction Word architectures and the ELI-512, ISCA '83: Proceedings of the 10th annual international symposium on Computer architecture, New York, NY, USA, 1983, pages 140-150. | 08_ISA.pdf, 09_CaseForRISC.pdf, AA_PrefetcherContest.pdf | Assignment 3-2 | |
Thursday, January 28 | Midterm 1 | | | | |
Tuesday, February 2 | Pipelining and Branch Prediction | A study of branch prediction strategies, ISCA '81: Proceedings of the 8th annual symposium on Computer Architecture, Los Alamitos, CA, USA, 1981, pages 135-148. Retrospective: a study of branch prediction strategies, ISCA '98: 25 years of the international symposia on Computer architecture (selected papers), New York, NY, USA, 1998, pages 22-23. | 10_BranchPrediction.pdf | | |
Thursday, February 4 | Student presentation (by Ilya Kolykhmatov and Ryan Gabrys) -- Advanced branch prediction algorithms | Assigning confidence to conditional branch predictions, MICRO 29: Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture, Washington, DC, USA, 1996, pages 142-152. Optional: Combining Branch Predictors, technical report WRL-TN-36, 1993. Optional: Low-power, high-performance analog neural branch prediction, MICRO '08: Proceedings of the 2008 41st IEEE/ACM International Symposium on Microarchitecture, Washington, DC, USA, 2008, pages 447-458. | 11_BranchPapers.pdf | | |
Tuesday, February 9 | Introduction to OOO execution | The Alpha 21264 microprocessor, Micro, IEEE 19(2):24-36, Mar/Apr 1999. | 15_OOO.pdf, 14_21264.pdf | | |
Thursday, February 11 | Student presentation (by Bryan Kim and Neha Chachra) -- Implementing OOO | Tomasulo Diagram. An efficient algorithm for exploiting multiple arithmetic units, IBM J. Res. Dev. 11(1):25-33, 1967. HPSm, a high performance restricted data flow architecture having minimal functionality, ISCA '86: Proceedings of the 13th annual international symposium on Computer architecture, Los Alamitos, CA, USA, 1986, pages 297-306. Retrospective: HPSm, a high performance restricted data flow architecture having minimal functionality, ISCA '98: 25 years of the international symposia on Computer architecture (selected papers), New York, NY, USA, 1998, pages 43-44. Optional: HPS, a new microarchitecture: rationale and introduction, SIGMICRO Newsl. 16(4):103-108, 1985. Optional: A design space evaluation of grid processor architectures, MICRO 34: Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture, Washington, DC, USA, 2001, pages 40-51. Optional: Excerpts from Design of a Computer: the Control Data 6600. Optional: Parallel Operation in the Control Data 6600. | 13_OOO.pdf | | |
Tuesday, February 16 | Other execution strategies | | 16_WaveScalar.ppt.pdf, Z2_Midterm2Preview.pdf | Assignment 4 | |
Thursday, February 18 | Midterm 2 | | | | |
Tuesday, February 23 | |||||
Thursday, February 25 | Student presentation (by S.N. Hemanth Meenakshisundaram and Bharathan Balaji) -- Simultaneous (and other) multithreading | Exploiting choice: instruction fetch and issue on an implementable simultaneous multithreading processor, ISCA '96: Proceedings of the 23rd annual international symposium on Computer architecture, New York, NY, USA, 1996, pages 191-202. Optional: Simultaneous multithreading: maximizing on-chip parallelism, ISCA '95: Proceedings of the 22nd annual international symposium on Computer architecture, New York, NY, USA, 1995, pages 392-403. Speculative Versioning Cache, Parallel and Distributed Systems, IEEE Transactions on 12(12):1305-1317, Dec 2001. | 17_MultiscalarAndSMT.pdf | | |
Tuesday, March 2 | Multithreading | | 18_Precompute.pdf, 19_DDT.pdf | | |
Thursday, March 4 | Student presentation (by Kaisen Lin) -- Chip multiprocessors | Niagara: A 32-Way Multithreaded Sparc Processor, IEEE Micro 25(2):21-29, 2005. Optional: Sun's slides about the UltraSPARC T2 (aka Niagara 2, aka Victoria Falls). Optional: Piranha: a scalable architecture based on single-chip multiprocessing, ISCA '00: Proceedings of the 27th annual international symposium on Computer architecture, New York, NY, USA, 2000, pages 282-293. | 20_CMPs.pdf | Project 1 | |
Tuesday, March 9 | Multiprocessors | | ProjectResults.pdf, 21_CMPs.pdf | | |
Thursday, March 11 | Student presentation -- Support for determinism/TBA | DMP: deterministic shared memory multiprocessing, ASPLOS '09: Proceeding of the 14th international conference on Architectural support for programming languages and operating systems, New York, NY, USA, 2009, pages 85-96. | 22_determinism.pdf | Assignment 5 | |
Thursday, March 18 | | | FinalPreview.pdf | | |