CSE 240B: Parallel Architecture



Tuesday/Thursday
12:30-1:50
Center Hall 207
Instructors: Steven Swanson & Dean Tullsen

Email:
swanson at cs dot ucsd dot edu
tullsen at cs dot ucsd dot edu
Offices:
Office hours:
Class mailing list: cse240b@cs.ucsd.edu (archive)

Syllabus

Text:
Related reading:
Assignments:
Projects:
Grading:
Other notes:

Schedule

Date
Topic
Readings
January 9
Administrivia; Overview of parallel architecture; Introduction to Coherence (Tullsen, Swanson) slides
n/a
January 11
Coherence (Tullsen) slides
H&P 3rd ed.: 6.1-6.6 (esp. 6.3 and 6.5)
H&P 4th ed.: 4.1-4.5
January 16
Coherence (Tullsen)
M. M. K. Martin, M. D. Hill, and D. A. Wood, ``Token coherence: decoupling performance and correctness,'' in ISCA '03: Proceedings of the 30th annual international symposium on Computer architecture, pp. 182-193, 2003 link.

D. Lenoski, J. Laudon, K. Gharachorloo, A. Gupta, and J. Hennessy, ``The directory-based cache coherence protocol for the DASH multiprocessor,'' in ISCA '90: Proceedings of the 17th annual international symposium on Computer Architecture, pp. 148-159, 1990 link.
OPTIONAL (DASH performance) D. Lenoski, J. Laudon, T. Joe, D. Nakahira, L. Stevens, A. Gupta, and J. Hennessy, ``The DASH prototype: implementation and performance,'' in ISCA '98: 25 years of the international symposia on Computer architecture (selected papers), pp. 418-429, 1998 link.
January 18
Consistency (Swanson) Slides
S. V. Adve and K. Gharachorloo, ``Shared Memory Consistency Models: A Tutorial,'' tech. rep., DEC WRL, 1995 link.
January 23
Consistency (Swanson) Slides
V. S. Pai, P. Ranganathan, S. V. Adve, and T. Harton, ``An evaluation of memory consistency models for shared-memory systems with ILP processors,'' in ASPLOS-VII: Proceedings of the seventh international conference on Architectural support for programming languages and operating systems, pp. 12-23, 1996 link.

C. Gniady, B. Falsafi, and T. N. Vijaykumar, ``Is SC + ILP = RC?,'' in ISCA '99: Proceedings of the 26th annual international symposium on Computer architecture, pp. 162-171, 1999 link.
OPTIONAL (but mind-bending) J. Manson, W. Pugh, and S. V. Adve, ``The Java memory model,'' in POPL '05: Proceedings of the 32nd ACM SIGPLAN-SIGACT Symposium on Principles of programming languages, pp. 378-391, 2005 link.
January 25
Synchronization
H&P 3rd ed.: 6.7
H&P 4th ed.: 4.5


M. Herlihy, ``A methodology for implementing highly concurrent data structures,'' in PPOPP '90: Proceedings of the second ACM SIGPLAN symposium on Principles & practice of parallel programming, pp. 197-206, 1990 link.
January 30
Transactions/Assignment 1 due
Slides
L. Hammond, V. Wong, M. Chen, B. D. Carlstrom, J. D. Davis, B. Hertzberg, M. K. Prabhu, H. Wijaya, C. Kozyrakis, and K. Olukotun, ``Transactional Memory Coherence and Consistency,'' in ISCA '04: Proceedings of the 31st annual international symposium on Computer architecture, p. 102, 2004 link.

M. Herlihy and J. E. B. Moss, ``Transactional memory: architectural support for lock-free data structures,'' in ISCA '93: Proceedings of the 20th annual international symposium on Computer architecture, pp. 289-300, 1993 link.
February 1
Transactions
Slides
R. Rajwar, M. Herlihy, and K. Lai, ``Virtualizing Transactional Memory,'' in ISCA '05: Proceedings of the 32nd Annual International Symposium on Computer Architecture, pp. 494-505, 2005 link.

B. Saha, A.-R. Adl-Tabatabai, and Q. Jacobson, ``Architectural Support for Software Transactional Memory,'' in MICRO '06: Proceedings of the 39th international symposium on Microarchitecture, 2006.
February 6
SMT/Threading
D.M. Tullsen, S.J. Eggers, J.M. Emer, H.M. Levy, J.L. Lo, and R.L. Stamm, ``Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor,'' in ISCA '96: Proceedings of the 23rd annual international symposium on Computer architecture, pp. 191-202, 1996 link.

A. Agarwal, R. Bianchini, D. Chaiken, K. L. Johnson, D. Kranz, J. Kubiatowicz, B.-H. Lim, K. Mackenzie, and D. Yeung, ``The MIT Alewife machine: architecture and performance,'' in ISCA '95: Proceedings of the 22nd annual international symposium on Computer architecture, pp. 2-13, 1995 link.
February 8
SMT/Threading
J. D. Collins, H. Wang, D. M. Tullsen, C. Hughes, Y.-F. Lee, D. Lavery, and J. P. Shen, ``Speculative precomputation: long-range prefetching of delinquent loads,'' in ISCA '01: Proceedings of the 28th annual international symposium on Computer architecture, pp. 14-25, 2001 link.

E. Tune, R. Kumar, D. M. Tullsen, and B. Calder, ``Balanced Multithreading: Increasing Throughput via a Low Cost Multithreading Hierarchy,'' in MICRO 37: Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture, pp. 183-194, 2004 link.
February 10
Mini-project due

February 13
Mid-term

February 15
CMPs Slides
K. Olukotun, B. A. Nayfeh, L. Hammond, K. Wilson, and K. Chang, ``The case for a single-chip multiprocessor,'' in ASPLOS-VII: Proceedings of the seventh international conference on Architectural support for programming languages and operating systems, pp. 2-11, 1996 link.

H. McGhan, ``Niagara 2 Opens the Floodgates,'' Microprocessor Report, November 2006 link.
February 20
CMPs Slides
R. Kumar, D. M. Tullsen, P. Ranganathan, N. P. Jouppi, and K. I. Farkas, ``Single-ISA Heterogeneous Multi-Core Architectures for Multithreaded Workload Performance,'' in ISCA '04: Proceedings of the 31st annual international symposium on Computer architecture, p. 64, 2004 link.

J. Huh, D. Burger, and S. W. Keckler, ``Exploring the Design Space of Future CMPs,'' in PACT '01: Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques, pp. 199-210, 2001 link.
February 22
Interconnects

H&P 3rd ed.: 8.1-8.5, 8.9
H&P 4th ed.: E.1-E.6, E.10


February 27
Interconnects
R. Kumar, V. Zyuban, and D. M. Tullsen, ``Interconnections in Multi-Core Architectures: Understanding Mechanisms, Overheads and Scaling,'' in ISCA '05: Proceedings of the 32nd Annual International Symposium on Computer Architecture, pp. 408-419, 2005 link.


March 1
Tiled Architectures
K. Sankaralingam, R. Nagarajan, H. Liu, C. Kim, J. Huh, D. Burger, S. W. Keckler, and C. R. Moore, ``Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture,'' SIGARCH Comput. Archit. News, vol. 31, no. 2, pp. 422-433, 2003 link.

S. Swanson, A. Schwerin, M. Mercaldi, A. Petersen, A. Putnam, K. Michelson, M. Oskin, and S. J. Eggers, ``The WaveScalar Architecture,'' to appear in ACM Transactions on Computer Systems link.
March 6
Tiled Architectures: RAW (Mike Taylor)
M. B. Taylor, J. Kim, J. Miller, D. Wentzlaff, F. Ghodrat, B. Greenwald, H. Hoffman, P. Johnson, J.-W. Lee, W. Lee, A. Ma, A. Saraf, M. Seneski, N. Shnidman, V. Strumpen, M. Frank, S. Amarasinghe, and A. Agarwal, ``The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs,'' IEEE Micro, vol. 22, no. 2, pp. 25-35, 2002 link.

M. B. Taylor, W. Lee, S. Amarasinghe, and A. Agarwal, ``Scalar Operand Networks: On-Chip Interconnect for ILP in Partitioned Architectures,'' in HPCA '03: Proceedings of the 9th International Symposium on High-Performance Computer Architecture, p. 341, 2003 link.
March 8
Non General-Purpose/streaming machines
M. Baron, ``The Cell, At One,'' Microprocessor Report, March 2006 link.

J. H. Ahn, W. J. Dally, B. Khailany, U. J. Kapasi, and A. Das, ``Evaluating the Imagine Stream Architecture,'' in ISCA '04: Proceedings of the 31st annual international symposium on Computer architecture, p. 14, 2004 link.
March 13
Project Presentations (Students)

March 15
Project Presentations (Students)

March 16
Project Report due (midnight)

March 20
Final


Assignment 1:  Mini-project warmup

Due January 30th.  No extensions.  Described here.

Assignment 2: Mini-project

Due February 9th.  No extensions.  Described here.

Assignment 3: Project

Presentations in week 10.  Write-up due 11:59 PM Friday, March 16.

Deliverables:
  1. In-class presentation.  ~16 minutes.  You should describe the questions you are trying to answer, your approach, the tools you used and/or the code you wrote, and your results.  You should also outline what you would do if you were to continue working on the project.
  2. Written report.  Roughly the same content as your presentation but in more depth.  The report should be roughly 5 to 10 pages.  Your audience is other students in this class.  Be sure to include appropriate citations (if you have questions about what this means, please ask).