cse240c: Advanced Microarchitecture

Warren Lecture Hall 2204
Lectures Tue. & Thu., 8:00-9:20
Spring, 2013

Instructor

Steven Swanson
Email: swanson @ cs.ucsd.edu
IM (not email): professorswanson@{AIM, Yahoo!, google talk, MS Messenger}
Office: EBU3B 3212
Office Hours: Monday 10-11; Thursday 11-12 (and by appointment)
UCSD homepage


Course Description

This course covers advanced topics in processor microarchitecture. We will cover both the "latest and greatest" and the "oldies but goodies" in commercial processors and architecture research. We will learn answers to questions like:

Please acquaint yourself with the auxiliary web page. It has links to the paper review site and the slide repository.


Textbooks

Required: Assigned readings throughout the quarter. See the schedule below.

Grading

Note that 66% of your grade is determined by preparing for and participating in class.

Paper summaries 33% You will summarize each paper we read in class. Summaries are due 20 minutes before class begins. No exceptions. This means there is no reason to be late for class to complete your summary.
Class participation 33% This class is discussion driven, so you must come prepared to discuss the material. This includes showing up. You should contribute to the discussion nearly every day.
In-class presentations 33% In lieu of exams, each of you will prepare and give two presentations on topics we will cover.

Schedule

Items in the schedule more than one week in the future are subject to change. Check back for updates to the assigned readings, etc. Deadlines for homeworks/projects that have already been assigned will not be moved earlier.

I will post the slides for the lectures once I receive them from the presenter.

Date Topic Readings Slides Due Notes
Tuesday, April 2 Administrivia. 00_Intorduction.pdf
Thursday, April 4 Historical perspectives Paper 1: Cramming More Components Onto Integrated Circuits, G.E. Moore, Proceedings of the IEEE 86(1):82-85, Jan 1998 link.

Paper 2: The history of the microcomputer-invention and evolution, S. Mazor, Proceedings of the IEEE 83(12):1601-1608, Dec 1995 link.

Additional readings if you are interested:
A History of Microprocessor Development at Intel, R.N. Noyce and M.E. Hoff, Micro, IEEE 1(1):8-21, Feb. 1981 link.

A 4096-bit dynamic MOS RAM, J. Karp, W. Regitz, and S. Chou, Solid-State Circuits Conference. Digest of Technical Papers. 1972 IEEE International XV: 10-11, Feb 1972 link.

A three transistor-cell, 1024-bit, 500 NS MOS RAM, W. Regitz and J. Karp, Solid-State Circuits Conference. Digest of Technical Papers. 1970 IEEE International XIII: 42-43, Feb 1970 link.

Design of ion-implanted MOSFET's with very small physical dimensions, R.H. Dennard, F.H. Gaensslen, V.L. Rideout, E. Bassous, and A.R. LeBlanc, Solid-State Circuits, IEEE Journal of 9(5): 256-268, Oct 1974 link.

The future of wires, R. Ho, K.W. Mai, and M.A. Horowitz, Proceedings of the IEEE 89(4):490-504, Apr 2001 link.

A whole issue of IEEE Solid-State Circuits Society News about Dennardian Scaling. link.
Tuesday, April 9 Historical perspectives Paper 3: Architecture of the IBM System/360, G. M. Amdahl, G. A. Blaauw, and F. P. Brooks, Jr., :17-31, 2000 link.

Paper 4: Parallel operation in the control data 6600, James E. Thornton, :5-12, 1995 link.

Additional readings if you are interested:
Design of a Computer -- The Control Data 6600, James E. Thornton, link.

Considerations in Computer Design - Leading up to the Control Data 6600, James E. Thornton, 1963 link.

IBM's 360 and early 370 systems, Emerson Pugh, Lyle R. Johnson, and John H. Palmer MIT Press, 1991.
Thursday, April 11 Vectors Paper 5: The CRAY-1 computer system, Richard M. Russell, Commun. ACM 21(1):63-72, 1978 link.

Paper 6: Tarantula: a vector extension to the alpha architecture, R. Espasa, F. Ardanaz, J. Emer, S. Felix, J. Gago, R. Gramunt, I. Hernandez, T. Juan, G. Lowney, M. Mattina, and A. Seznec, Computer Architecture, 2002. Proceedings. 29th Annual International Symposium on:281-292, 2002 link.

Additional readings if you are interested:
An analysis of the Cray-1 computer, Richard L. Sites, ISCA '78: Proceedings of the 5th annual symposium on Computer architecture, New York, NY, USA, 1978, pages 101-106 link.

Tuesday, April 16 Dark Silicon Computational sprinting on a hardware/software testbed, Arun Raghavan, Laurel Emurian, Lei Shao, Marios Papaefthymiou, Kevin P. Pipe, Thomas F. Wenisch, and Milo M.K. Martin, Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems, New York, NY, USA, 2013, pages 155-166 link.

Conservation cores: reducing the energy of mature computations, Ganesh Venkatesh, Jack Sampson, Nathan Goulding, Saturnino Garcia, Vladyslav Bryksin, Jose Lugo-Martinez, Steven Swanson, and Michael Bedford Taylor, Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems, New York, NY, USA, 2010, pages 205-218 link.
Thursday, April 18 Unconventional OOO execution Paper 7: The WaveScalar architecture, Steven Swanson, Andrew Schwerin, Martha Mercaldi, Andrew Petersen, Andrew Putnam, Ken Michelson, Mark Oskin, and Susan J. Eggers, ACM Trans. Comput. Syst. 25(2):4, 2007 link (Sections 1-4 only).


Paper 8: Distributed Microarchitectural Protocols in the TRIPS Prototype Processor, Karthikeyan Sankaralingam, Ramadass Nagarajan, Robert McDonald, Rajagopalan Desikan, Saurabh Drolia, M. S. Govindan, Paul Gratz, Divya Gulati, Heather Hanson, Changkyu Kim, Haiming Liu, Nitya Ranganathan, Simha Sethumadhavan, Sadia Sharif, Premkishore Shivakumar, Stephen W. Keckler, and Doug Burger, Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, Washington, DC, USA, 2006, pages 480-491 link.

Additional readings if you are interested:
Critical issues regarding HPS, a high performance microarchitecture, Y. N. Patt, S. W. Melvin, W. M. Hwu, and M. C. Shebanow, SIGMICRO Newsl. 16(4):109-116, 1985 link.

First version of a data flow procedure language, J. B. Dennis, Programming Symposium, Proceedings Colloque sur la Programmation, London, UK, 1974, pages 362-376.

Tuesday, April 23 Reliability Paper 9: Transient fault detection via simultaneous multithreading, Steven K. Reinhardt and Shubhendu S. Mukherjee, SIGARCH Comput. Archit. News 28(2):25-36, 2000 link.

Paper 10: DIVA: a reliable substrate for deep submicron microarchitecture design, Todd M. Austin, MICRO 32: Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture, Washington, DC, USA, 1999, pages 196-207 link.
Additional readings if you are interested:
A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor, Shubhendu S. Mukherjee, Christopher Weaver, Joel Emer, Steven K. Reinhardt, and Todd Austin, MICRO 36: Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture, Washington, DC, USA, 2003, page 29 link.

Thursday, April 25 Multiprocessors Paper 15: The DASH prototype: implementation and performance, Daniel Lenoski, James Laudon, Truman Joe, David Nakahira, Luis Stevens, Anoop Gupta, and John Hennessy, 25 years of the international symposia on Computer architecture (selected papers), New York, NY, USA, 1998, pages 418-429 link.

Retrospective: the DASH prototype: implementation and performance, Daniel E. Lenoski and James P. Laudon, 25 years of the international symposia on Computer architecture (selected papers), New York, NY, USA, 1998, pages 80-82 link.

Paper 16: The MIT Alewife machine: architecture and performance, Anant Agarwal, Ricardo Bianchini, David Chaiken, Kirk L. Johnson, David Kranz, John Kubiatowicz, Beng-Hong Lim, Kenneth Mackenzie, and Donald Yeung, Proceedings of the 22nd annual international symposium on Computer architecture, New York, NY, USA, 1995, pages 2-13 link.

Retrospective: the MIT Alewife machine: architecture and performance, Anant Agarwal, 25 years of the international symposia on Computer architecture (selected papers), New York, NY, USA, 1998, pages 103-110 link.

Tuesday, April 30 Circuit-level microarchitectural issues Paper 11: ReCycle: pipeline adaptation to tolerate process variation, Abhishek Tiwari, Smruti R. Sarangi, and Josep Torrellas, ISCA '07: Proceedings of the 34th annual international symposium on Computer architecture, New York, NY, USA, 2007, pages 323-334 link.

Paper 12: Razor: a low-power pipeline based on circuit-level timing speculation, D. Ernst, Nam Sung Kim, S. Das, S. Pant, R. Rao, Toan Pham, C. Ziesler, D. Blaauw, T. Austin, K. Flautner, and T. Mudge, Microarchitecture, 2003. MICRO-36. Proceedings. 36th Annual IEEE/ACM International Symposium on: 7-18, Dec. 2003 link.

Thursday, May 2 Interconnects Paper 13: Anatomy of a message in the Alewife multiprocessor, John Kubiatowicz and Anant Agarwal, Proceedings of the 7th international conference on Supercomputing, New York, NY, USA, 1993, pages 195-206 link.

Paper 14: Virtual-channel flow control, William J. Dally, Proceedings of the 17th annual international symposium on Computer Architecture, New York, NY, USA, 1990, pages 60-68 link.

Tuesday, May 7 Data Centers The Case for Energy-Proportional Computing, Luiz Andre Barroso and Urs Holzle, Computer 40(12):33-37, December 2007 link.

Web Search for a Planet: The Google Cluster Architecture, Luiz Andre Barroso, Jeffrey Dean, and Urs Holzle, IEEE Micro 23(2):22-28, March 2003 link.
Thursday, May 9 Specialized architectures Paper 17: Anton, a special-purpose machine for molecular dynamics simulation, David E. Shaw, Martin M. Deneroff, Ron O. Dror, Jeffrey S. Kuskin, Richard H. Larson, John K. Salmon, Cliff Young, Brannon Batson, Kevin J. Bowers, Jack C. Chao, Michael P. Eastwood, Joseph Gagliardo, J. P. Grossman, C. Richard Ho, Douglas J. Ierardi, Istvan Kolossvary, John L. Klepeis, Timothy Layman, Christine McLeavey, Mark A. Moraes, Rolf Mueller, Edward C. Priest, Yibing Shan, Jochen Spengler, Michael Theobald, Brian Towles, and Stanley C. Wang, SIGARCH Comput. Archit. News 35:1-12, June 2007 link.

Paper 18: CryptoManiac: a fast flexible architecture for secure communication, Lisa Wu, Chris Weaver, and Todd Austin, ISCA '01: Proceedings of the 28th annual international symposium on Computer architecture, New York, NY, USA, 2001, pages 110-119 link.

Additional readings if you are interested:
Evaluating the Imagine Stream Architecture, Jung Ho Ahn, William J. Dally, Brucek Khailany, Ujval J. Kapasi, and Abhishek Das, ISCA '04: Proceedings of the 31st annual international symposium on Computer architecture, Washington, DC, USA, 2004, page 14.

Imagine: media processing with streams, B. Khailany, W.J. Dally, U.J. Kapasi, P. Mattson, J. Namkoong, J.D. Owens, B. Towles, A. Chang, and S. Rixner, Micro, IEEE 21(2):35-46, Mar/Apr 2001 link.

Tuesday, May 14 Program analysis Paper 19: Automatically characterizing large scale program behavior, Timothy Sherwood, Erez Perelman, Greg Hamerly, and Brad Calder, ASPLOS-X: Proceedings of the 10th international conference on Architectural support for programming languages and operating systems, New York, NY, USA, 2002, pages 45-57 link.

Paper 20: Limits of control flow on parallelism, Monica S. Lam and Robert P. Wilson, SIGARCH Comput. Archit. News 20(2):46-57, 1992 link.

Additional readings if you are interested:
Phase tracking and prediction, Timothy Sherwood, Suleyman Sair, and Brad Calder, SIGARCH Comput. Archit. News 31(2):336-349, 2003 link.

The intrinsic bandwidth requirements of ordinary programs, Andrew S. Huang and John Paul Shen, ASPLOS-VII: Proceedings of the seventh international conference on Architectural support for programming languages and operating systems, New York, NY, USA, 1996, pages 105-114 link.

Limits on multiple instruction issue, M. D. Smith, M. Johnson, and M. A. Horowitz, SIGARCH Comput. Archit. News 17(2):290-302, 1989 link.

Limits of instruction-level parallelism, David W. Wall, ASPLOS-IV: Proceedings of the fourth international conference on Architectural support for programming languages and operating systems, New York, NY, USA, 1991, pages 176-188 link.
Thursday, May 16 No class TBA
Tuesday, May 21 Slippage TBA
Thursday, May 23 Power Paper 21: Energy Optimization of Subthreshold-Voltage Sensor Network Processors, Leyla Nazhandali, Bo Zhai, Javin Olson, Anna Reeves, Michael Minuth, Ryan Helfand, Sanjay Pant, Todd Austin, and David Blaauw, ISCA '05: Proceedings of the 32nd annual international symposium on Computer Architecture, Washington, DC, USA, 2005, pages 197-207 link.

Paper 22: Temperature-aware microarchitecture: Modeling and implementation, Kevin Skadron, Mircea R. Stan, Karthik Sankaranarayanan, Wei Huang, Sivakumar Velusamy, and David Tarjan, ACM Trans. Archit. Code Optim. 1(1):94-125, 2004 link.

Tuesday, May 28 New Technologies Paper 23: Providing safe, user space access to fast, solid state disks, Adrian M. Caulfield, Todor I. Mollov, Louis Alex Eisner, Arup De, Joel Coburn, and Steven Swanson, Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems, New York, NY, USA, 2012, pages 387-400 link.

Paper 24: Architecting phase change memory as a scalable dram alternative, Benjamin C. Lee, Engin Ipek, Onur Mutlu, and Doug Burger, SIGARCH Comput. Archit. News 37:2-13, June 2009 link.

Thursday, May 30 Storage Paper 25: A case for redundant arrays of inexpensive disks (RAID), David A. Patterson, Garth Gibson, and Randy H. Katz, Proceedings of the 1988 ACM SIGMOD international conference on Management of data, New York, NY, USA, 1988, pages 109-116 link.

Paper 26: The case for RAMClouds: scalable high-performance storage entirely in DRAM, John Ousterhout, Parag Agrawal, David Erickson, Christos Kozyrakis, Jacob Leverich, David Mazieres, Subhasish Mitra, Aravind Narayanan, Guru Parulkar, Mendel Rosenblum, Stephen M. Rumble, Eric Stratmann, and Ryan Stutsman, SIGOPS Oper. Syst. Rev. 43:92-105, January 2010 link.

Tuesday, June 4 Wimpy Nodes Paper 27: FAWN: a fast array of wimpy nodes, David G. Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, and Vijay Vasudevan, Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, New York, NY, USA, 2009, pages 1-14 link.

Paper 28: Gordon: using flash memory to build fast, power-efficient clusters for data-intensive applications, Adrian M. Caulfield, Laura M. Grupp, and Steven Swanson, Proceedings of the 14th international conference on Architectural support for programming languages and operating systems, New York, NY, USA, 2009, pages 217-228 link.
Thursday, June 6 Potpourri
Paper 29: Clearing the clouds: a study of emerging scale-out workloads on modern hardware, Michael Ferdman, Almutaz Adileh, Onur Kocberber, Stavros Volos, Mohammad Alisafaee, Djordje Jevdjic, Cansu Kaynak, Adrian Daniel Popescu, Anastasia Ailamaki, and Babak Falsafi, Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems, New York, NY, USA, 2012, pages 37-48 link.

Paper 30: Fundamental Latency Trade-off in Architecting DRAM Caches: Outperforming Impractical SRAM-Tags with a Simple and Practical Design, Moinuddin K. Qureshi and Gabe H. Loh, Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, Washington, DC, USA, 2012, pages 235-246 link.


Integrity Policy


Homework

Assignment 1: Administrivia
Assignment 2: Paper Reviews
Assignment 3: Class presentations