CSE.240b Advanced Graduate Computer Architecture - Spring 2006
Course Goals
This class is designed to enable students to follow the latest developments in computer architecture. Although this is clearly useful for those who wish to do research in computer architecture, it is also useful for those who work in related areas or who have general interests. The class strives for these goals through four aspects:
April 11, 2006 | The course forum (located here) is up! |
April 11, 2006 | Details on the midterm are available below. |
April 23, 2006 | Section on journal entries added to website. |
May 1, 2006 | Assignment 1 posted. |
May 11, 2006 | Final Project Details posted. |
May 25, 2006 | Final Project: Project 1, Part II updated. |
Class (and forum) Participation | 25 % |
Assignments: | 25 % |
Project: | 30 % |
Final Exam: | 10 % |
Midterm Exam: | 10 % |
H&P | Computer Architecture: A Quantitative Approach, 3rd Ed., Hennessy and Patterson. |
RiCA | Readings in Computer Architecture, eds. Hill, Jouppi, and Sohi. |
Due         | Item | |||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Tu 4-4 | First Class. | |||||||||||||||||||||||||
Th 4-6 | Multiprocessors I H&P 6.1 - 6.5 | |||||||||||||||||||||||||
Monday 4-10 | Tuesday's class is rescheduled to 10:50 AM - 12:10 PM, Monday, April 10, in EBU-3B 1202.
Steve Swanson, a faculty candidate, will be talking on Wavescalar. Attendance is required unless you have a significant conflict. Please get an extra seat from the side, so that the class does not take up all of the seating. Please read the following papers in preparation for the talk: | |||||||||||||||||||||||||
A Preliminary Architecture for a Basic Data-Flow Processor, Jack Dennis et al., Proceedings of the International Symposium on Computer Architecture (ISCA) 1975. (RiCA) | ||||||||||||||||||||||||||
Wavescalar, Steven Swanson et al., Proceedings of the International Symposium on Microarchitecture (MICRO) 2003. | ||||||||||||||||||||||||||
Please write a short (1-2 page) analysis of Swanson's talk and research. Please describe its relationship to Jack Dennis's paper. | ||||||||||||||||||||||||||
Tu 4-11 | Class moved to Monday, April 10 at 10:50 AM, EBU-3B 1202. | |||||||||||||||||||||||||
Th 4-13 | Multiprocessors II H&P 6.6-6.8, 6.11, 6.13 | |||||||||||||||||||||||||
Tu 4-18 | Tiled Microprocessors I: Raw (Sashi, Donghwan) Tiled Microprocessors are interesting because they blur the boundaries between multiprocessors and microprocessors. | |||||||||||||||||||||||||
The Raw Microprocessor: A Computational Fabric for Software Circuits and General Purpose Programs, by Michael Bedford Taylor, Jason Kim, Jason Miller, David Wentzlaff, Fae Ghodrat, Ben Greenwald, Henry Hoffmann, Jae-Wook Lee, Paul Johnson, Walter Lee, Albert Ma, Arvind Saraf, Mark Seneski, Nathan Shnidman, Volker Strumpen, Matt Frank, Saman Amarasinghe, and Anant Agarwal. IEEE Micro, March/April 2002. | ||||||||||||||||||||||||||
Evaluation of the Raw Microprocessor: An Exposed-Wire-Delay Architecture for ILP and Streams, by Michael Bedford Taylor, Walter Lee, Jason Miller, David Wentzlaff, Ian Bratt, Ben Greenwald, Henry Hoffmann, Paul Johnson, Jason Kim, James Psota, Arvind Saraf, Nathan Shnidman, Volker Strumpen, Matt Frank, Saman Amarasinghe, and Anant Agarwal. Proceedings of the International Symposium on Computer Architecture, June 2004. | ||||||||||||||||||||||||||
Space-Time Scheduling of Instruction-Level Parallelism on a Raw Machine, by Walter Lee, Rajeev Barua, Matthew Frank, Devabhaktuni Srikrishna, Jonathan Babb, Vivek Sarkar, and Saman Amarasinghe. Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VIII), San Jose, CA, October 4-7, 1998. | ||||||||||||||||||||||||||
Write a 1-3 page analysis of the key ideas of the three papers. (Class Experts' Presentation) | ||||||||||||||||||||||||||
Th 4-20 | Tiled Microprocessors II: GRID/TRIPS and Wavescalar (Willis, Adam) We continue with Grid/TRIPS and Wavescalar, which extend Raw's distributed execution model with features found in out-of-order superscalar and dataflow processors. K. Sankaralingam will be speaking at UCSD in early May. | |||||||||||||||||||||||||
A Design Space Evaluation of Grid Processor Architectures, R. Nagarajan, K. Sankaralingam, D. Burger, and S.W. Keckler. 34th Annual International Symposium on Microarchitecture (MICRO), pp. 40-51, December, 2001. | ||||||||||||||||||||||||||
Scalar Operand Networks, by Michael Bedford Taylor, Walter Lee, Saman Amarasinghe, and Anant Agarwal. IEEE Transactions on Parallel and Distributed Systems (Special Issue on On-chip Networks), February 2005. | ||||||||||||||||||||||||||
Wavescalar, Steven Swanson et al., Proceedings of the International Symposium on Microarchitecture (MICRO) 2003. (We've already read this, but we will discuss this in class this day.) | ||||||||||||||||||||||||||
Write a 1-3 page analysis of the key ideas of the three papers. (Class Experts' Presentation) | ||||||||||||||||||||||||||
Fri 4-21 11 am | Please attend Onur Mutlu's job talk (location: EBU-3B 1202). Attendance is required unless you have a significant conflict. | |||||||||||||||||||||||||
Tu 4-25 | Tiled Discussion | |||||||||||||||||||||||||
Read ISCA 06 Wavescalar Paper. This describes what they actually did. | ||||||||||||||||||||||||||
The ISCA 06 Wavescalar Implementation TR may also help. (Optional) | ||||||||||||||||||||||||||
From the above, see if you can figure out Wavescalar's 5-tuple. | ||||||||||||||||||||||||||
Think of and post a unique discussion question on Raw/Grid/Wavescalar/SONs, on the forum, in the conference entitled Lecture 5. | ||||||||||||||||||||||||||
(Also think of how you would answer your own question and others.) | ||||||||||||||||||||||||||
Th 4-27 | Power4 (Richa, Todd) We examine Power4, which is state-of-the-art in many ways: wide-issue out-of-order superscalar, super-pipelined, dual-core, multi-chip module, etc. This is also the architecture that we will be programming, so knowledge of this paper is essential for the programming assignments. Although it is called "Power", Power4 is essentially a member of the PowerPC family (see manuals directory). In the readings, the goal is to get a high-level idea of the architecture and microarchitecture, but a fairly in-depth understanding of the shared-memory, coherence, and consistency support in the architecture. | |||||||||||||||||||||||||
POWER4 system microarchitecture, by J. M. Tendler, J. S. Dodson, J. S. Fields, Jr., H. Le, and B. Sinharoy. IBM Journal of Research and Development, January 2002. | ||||||||||||||||||||||||||
As always, do a 1-3 page journal entry. (Class Experts' Presentation) | ||||||||||||||||||||||||||
Tu 5-2 | Power4 Manual (Vinoth, Arvindh) Read Book 2: 1.4, 1.7, skim 3.2.2, 3.3, skim 4, and Appendix B ("Programming Examples for Sharing Storage"). Read Book 3: 4.2.4. (You may have to read other sections to understand these sections.) If you are not familiar with PowerPC, you can refer to Book 1: User Instruction Set. Please read these sections carefully and make sure you understand the examples in Appendix B well. | |||||||||||||||||||||||||
As always, do a 1-3 page journal entry. (Class Experts' Presentation) (Synchronization and Consistency Presentation) | ||||||||||||||||||||||||||
Th 5-4 | Power4 continued (Todd, Arvindh) | |||||||||||||||||||||||||
Tu 5-9 | Shared Memory / Distributed Shared Memory (Anthony, Kwangyoon) | |||||||||||||||||||||||||
How to Make a Multiprocessor Computer that Correctly Executes Multiprocessor Programs, by L. Lamport. (RiCA) | ||||||||||||||||||||||||||
A New Solution to Coherence Problems in Multicache Systems, by Censier and Feautrier. (RiCA) | ||||||||||||||||||||||||||
The Stanford Dash Multiprocessor, by Lenoski et al. (RiCA) | ||||||||||||||||||||||||||
As always, do a 1-3 page journal entry. (Class Experts' Presentation) | ||||||||||||||||||||||||||
Th 5-11 | ILP (Garo / Saturnino) | |||||||||||||||||||||||||
The MIPS R10000 Superscalar Microprocessor, by Yeager. IEEE Micro 1996. (also in RiCA) | ||||||||||||||||||||||||||
The Alpha 21264 Microprocessor, by Kessler. IEEE Micro 1999. | ||||||||||||||||||||||||||
The Microarchitecture of the Pentium 4, by Hinton et al. Intel Technology Journal, Q1 2001. | ||||||||||||||||||||||||||
Please focus your 1-3 page journal entry on analyzing the differences and similarities in the microarchitecture of the three systems. (Class Experts' Presentation) | ||||||||||||||||||||||||||
Tu 5-16 | Technology Trends (Mohammad Al-Fares / Jason Thurkettle) | |||||||||||||||||||||||||
Impact of Technology on Architecture, John H. Edmondson. From Design of High Performance Microprocessor Circuits, eds. Anantha Chandrakasan et al. | ||||||||||||||||||||||||||
Clock Rate versus IPC: The End of the Road for Conventional Microarchitectures. ISCA 2000. Agarwal, Hrishikesh, Keckler and Burger. | ||||||||||||||||||||||||||
(As always, do a 1-3 page journal entry.) (Class Experts' Presentation) | ||||||||||||||||||||||||||
Th 5-18 | Errors (Amelang / John Fish) | |||||||||||||||||||||||||
IBM experiments in soft fails in computer electronics (1978-1994), by Ziegler et al. IBM Journal of Research and Development, January 1996. | ||||||||||||||||||||||||||
DIVA: A Reliable Substrate for Deep Submicron Microarchitecture Design, by Austin et al. Micro 1999. | ||||||||||||||||||||||||||
Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation, by Ernst et al. Micro 2003. | ||||||||||||||||||||||||||
(As always, do a 1-3 page journal entry.) (Presentation) | ||||||||||||||||||||||||||
Tu 5-23 | IBM/SONY Cell (Also: experience with JSSC paper) |
Overview of the Architecture, Circuit Design, and Physical Implementation of a First-Generation Cell Processor, by Pham et al. IEEE Journal of Solid-State Circuits, January 2006. |
The Microarchitecture of the Synergistic Processor for a Cell Processor, by Flachs et al. IEEE Journal of Solid-State Circuits, January 2006. |
(As always, do a 1-3 page journal entry.) (Presentation) |
Th 5-25 | Vectors (Jennifer / Jeffrey) |
Krste Asanović, John Hennessy, David A. Patterson, "Vector Processors", Appendix G in Computer Architecture: A Quantitative Approach, Third Edition, Morgan Kaufmann, ISBN 1-55860-596-7, May 2002. |
Ronny Krashinsky, Christopher Batten, Mark Hampton, Steven Gerding, Brian Pharris, Jared Casper, and Krste Asanović, "The Vector-Thread Architecture", 31st International Symposium on Computer Architecture (ISCA-31), Munich, Germany, June 2004. |
(As always, do a 1-3 page journal entry.) (Class Presentation 1 and 2) |
Tu 5-30 | Interconnection Networks (Jin Seok Lee / Cezario Tebcherani) |
A Survey of Wormhole Routing Techniques in Direct Networks, by Ni and McKinley. IEEE Computer, February 1993. (Ignore Figure 1, which is misleading.) |
A Necessary and Sufficient Condition for Deadlock-Free Adaptive Routing in Wormhole Networks, by Jose Duato. IEEE Transactions on Parallel and Distributed Systems, October 1995. |
(As always, do a 1-3 page journal entry.) |
Th 6-1 | Transactional Memory (Barath Raghavan) |
Transactional Memory: Architectural Support for Lock-Free Data Structures, by M.P. Herlihy and J.E.B. Moss. International Symposium on Computer Architecture, May 1993. |
Virtualizing Transactional Memory, by R. Rajwar, M.P. Herlihy, and K. Lai. International Symposium on Computer Architecture, June 2005. |
(As always, do a 1-3 page journal entry.) |
Tu 6-6 | Student Project Presentations (5 minutes per student) |
(A few brief words on the final, Professor ..) |
Th 6-8 | Student Project Presentations (5 minutes per student) |
Tu 6-13 | Final, 11:30-2:30. See the forum for more information on the final, including a list of some of the questions that will appear. |