Performance Programming

A one-day course developed by Bowen Alpern and Larry Carter

Course Outline

These links go to large PostScript (.ps) files of the slides used in class presentations. The slides were not designed for viewing in a PostScript previewer, so it may be necessary to reduce the viewer's magnification to see an entire slide.

1. Introduction

What is performance programming? When is it justified? The scientific method. Visualizing computations. Extended example: seismic migration.

2. Architecture for the Performance Programmer

Extended example: unblocked matrix multiplication. The Two Level model of computation. Locality: local, semilocal, and nonlocal data passes. The memory hierarchy and the parallelism hierarchy.
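For readers without the slides, the following is a minimal sketch of what "unblocked" matrix multiplication looks like next to a blocked (tiled) variant. It is not taken from the course material: the function names and the block size are assumptions chosen for illustration, and Python is used only for brevity. Any speedup from blocking would show up in a compiled language, where the cache behavior of the loops dominates the running time.

```python
def matmul_unblocked(a, b):
    """Textbook triple loop: c[i][j] is the sum over k of a[i][k] * b[k][j].

    The inner loop walks down a column of b, so for large matrices each
    step touches a different row of b and reuses little data.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0.0
            for k in range(m):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


def matmul_blocked(a, b, block=2):
    """Same product, computed one block-by-block tile at a time.

    `block` is a hypothetical tuning parameter; in practice it would be
    chosen so that the tiles in use fit in a fast level of the memory
    hierarchy.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for ii in range(0, n, block):
        for jj in range(0, p, block):
            for kk in range(0, m, block):
                # Multiply one tile of a by one tile of b, accumulating into c.
                for i in range(ii, min(ii + block, n)):
                    for j in range(jj, min(jj + block, p)):
                        total = c[i][j]
                        for k in range(kk, min(kk + block, m)):
                            total += a[i][k] * b[k][j]
                        c[i][j] = total
    return c


if __name__ == "__main__":
    a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    b = [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]]
    # Both versions perform the same arithmetic, only in a different order.
    print(matmul_unblocked(a, b) == matmul_blocked(a, b))
```

Both functions do the same number of multiplications and additions; the blocked version only changes the order in which the data is visited, which is the kind of locality change this section's title refers to.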

3. Localize, Parallelize, Pipeline

The three basic techniques of performance programming. Extended example: the NAS CG benchmark.

4. Miscellaneous Tips and Techniques

Timing and profiling. Reading assembly code. Integer arithmetic in floating point, message compression, and other tricks. Example: the NAS EP benchmark.
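One common form of "integer arithmetic in floating point" is computing an exact modular product with double-precision values only. The sketch below is an assumption about the kind of trick meant here, not a reproduction of the slides. It relies on the fact that an IEEE double represents every integer up to 2**53 exactly, and it splits each 46-bit operand into 23-bit halves so that no intermediate result exceeds that limit. The function name `mulmod46` and the constants in the usage example are illustrative.

```python
import math

T23 = 2.0 ** 23  # splitting point: operands are cut into 23-bit halves
T46 = 2.0 ** 46  # the modulus


def mulmod46(a, x):
    """Return (a * x) mod 2**46 using double-precision arithmetic.

    a and x must be integer-valued floats in the range [0, 2**46).
    Every intermediate value stays below 2**53, so each step is exact.
    """
    # Split a = a1 * 2**23 + a2 and x = x1 * 2**23 + x2.
    a1 = float(math.floor(a / T23))
    a2 = a - T23 * a1
    x1 = float(math.floor(x / T23))
    x2 = x - T23 * x1

    # Cross terms; the a1 * x1 term is a multiple of 2**46 and drops out.
    t1 = a1 * x2 + a2 * x1                        # below 2**47
    z = t1 - T23 * float(math.floor(t1 / T23))    # t1 mod 2**23

    t3 = T23 * z + a2 * x2                        # below 2**47
    return t3 - T46 * float(math.floor(t3 / T46))  # t3 mod 2**46


if __name__ == "__main__":
    # Illustrative constants for a multiplicative congruential generator.
    multiplier = 5 ** 13
    seed = 271828183
    result = mulmod46(float(multiplier), float(seed))
    # Cross-check against Python's exact integer arithmetic.
    print(result == float((multiplier * seed) % 2 ** 46))
```

The cross-check against Python's arbitrary-precision integers is only there to validate the sketch; the point of the trick is that the floating-point version needs no wide integer type at all.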

5. Portable High Performance

A few half-baked ideas on how the impossible might be achieved.

6. Review