CSE 260 - Parallel Computation
Latest Announcements
Suggested reading (but we won't discuss it):
The MicroGrid (from Andrew Chien's Concurrent Systems
Architecture Group).
Update on step 4 of project and last assignment
(due THURSDAY)!
I've just heard from the SDSC consultants that if you want to
compile an OpenMP C program, you can use the compiler
/usr/local/apps/KAP/guide39/bin/guidec
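As a quick sanity check of that compiler, a minimal OpenMP C program
like the one below should do. The compile line is my assumption (I
haven't verified the flags), so check with the consultants if guidec
complains.

/* hello_omp.c -- minimal OpenMP test program.
   Assumed compile line (flags unverified):
     /usr/local/apps/KAP/guide39/bin/guidec -o hello_omp hello_omp.c */
#include <stdio.h>
#include <omp.h>

int main(void)
{
#pragma omp parallel
    printf("hello from thread %d of %d\n",
           omp_get_thread_num(), omp_get_num_threads());
    return 0;
}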
The last assignment (in place of BOTH step 4 of the project and the
previous "last assignment") is to do one of the following by the last
class. I actually think the first would be the most educational,
especially if you've never written a Fortran program.
- OpenMP version of the project in Fortran, using a triangular
sheet of metal instead of the square one. You needn't bother
with optimizing the code, but you should try different
scheduling options (static and dynamic, with different chunk
sizes, so that you get the effect of both block and
block-cyclic distributions); a sketch of the scheduling
directives appears after this list.
Incidentally, I recommend using at least 50
timesteps, so that any startup costs (like allocating and
initializing the arrays) get amortized over many timesteps.
The goal is to compute the parallel efficiency for the various
strategies on small, medium and large problem sizes.
Here are some hints on using
OpenMP on the Sun Ultra.
- A second mini-project (1-2 page writeup; no class presentation).
The goal is to teach me something. (There's lots I don't know,
so it shouldn't be hard.)
- Pthreads version of project. You're on your own in learning
Pthreads.
- Explore results of project. Specifically, try making the constant
(currently .1) larger (e.g. .2, .5, 1.0, ...) and looking at the
output. To "look at" the output, I suggest you find some convenient
visualization package you can run on your workstation, and ship the
data to yourself so you can run it locally. The goal is to have
an animation of multiple timesteps. To keep the data size and
animation speed reasonable, you'll want to use a relatively
small problem size (perhaps 32x32). But to see anything
interesting, you may need to run many timesteps - perhaps 10,000.
There's no need to feed all the timesteps to the visualizer - you
can try outputting every 5th or 10th timestep or whatever; see the
output sketch after this list.
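Here is the promised scheduling sketch for the first option. It's
written in C rather than Fortran (the Fortran directives are
analogous, e.g. !$OMP PARALLEL DO ... SCHEDULE(STATIC, chunk)), and
the update is just a stencil with the project's usual .1 constant;
the array and variable names are placeholders for your own code.
Recall that parallel efficiency is E(p) = T(1) / (p * T(p)), which is
what you're measuring for each schedule and problem size.

#include <omp.h>

#define N 512              /* one of the small/medium/large sizes */

double uold[N][N], unew[N][N];

void step(int chunk)
{
    int i, j;
    /* schedule(static) gives a block distribution;
       schedule(static, chunk) gives a block-cyclic one;
       schedule(dynamic, chunk) hands out chunks on demand.
       Edit the pragma to try each variant. */
#pragma omp parallel for private(j) schedule(dynamic, chunk)
    for (i = 1; i < N - 1; i++)
        for (j = 1; j < N - 1; j++)
            unew[i][j] = uold[i][j]
                + 0.1 * (uold[i-1][j] + uold[i+1][j]
                       + uold[i][j-1] + uold[i][j+1]
                       - 4.0 * uold[i][j]);
}

int main(void)
{
    int t;
    for (t = 0; t < 50; t++)  /* at least 50 timesteps, as suggested */
        step(8);              /* hypothetical chunk size */
    /* (the uold/unew swap is omitted for brevity) */
    return 0;
}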
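And for the last option, a minimal sketch of thinning the output so
the visualizer gets a manageable number of frames. The file naming
and plain-text format are assumptions on my part; adapt them to
whatever your visualization package reads.

#include <stdio.h>

#define N 32               /* small grid, as suggested above */
#define SAVE_EVERY 10      /* write every 10th timestep */

void maybe_save(double u[N][N], int t)
{
    FILE *fp;
    char fname[64];
    int i, j;

    if (t % SAVE_EVERY != 0)
        return;
    sprintf(fname, "frame%05d.dat", t);  /* hypothetical naming scheme */
    if ((fp = fopen(fname, "w")) == NULL)
        return;
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++)
            fprintf(fp, "%g ", u[i][j]);
        fprintf(fp, "\n");
    }
    fclose(fp);
}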
Class Notes
Class 1 in PowerPoint
or pdf
format.
Class 2 in PowerPoint
or pdf
format.
Programming Parallel Computers class in
PowerPoint
or pdf
format.
PDE's for Dummies class in
PowerPoint
or pdf
format.
Parallel Performance class in
PowerPoint
or pdf
format.
Model of Parallel Computers classes in
PowerPoint
or pdf
format.
Performance Programming class in
PowerPoint
or pdf
format.
Benchmarks and Applications class in
PowerPoint
or pdf
format.
Quizzlet 3 answers.
Assignments
There will be a multi-part project involving writing and tuning
a relatively simple parallel program. We'll explore improving
single-node performance, and writing code using both shared-memory
and distributed address space paradigms.
A description of the project in
PowerPoint
or pdf
format. Part 3 is due November 15.
Information courtesy of Sunjeev Sikand: To get pthreads to assign
one thread per processor, you need to declare them as system threads.
Putting the following into your C program should work:
#include <pthread.h>
pthread_attr_t attr;
pthread_attr_init(&attr);
/* System (kernel-scheduled) scope lets Solaris run each thread
   on its own processor. */
pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
pthread_create(&threadid[i], &attr, start_routine, arg);
OpenMP reference manual for C.
And for Fortran.
Here is information (adapted from Kathy Yelick's class) on
MPI.
Also available in
.pdf
format.
Here is some information on
profiling
and timing programs. Also available in
.pdf
format.
Mini-projects
Each student should do two mini-projects during the term. Here
are some of the completed projects:
Course information
CSE260 is an overview of parallel hardware, algorithms, models
and software. Topics include parallel computer architectures, a
survey of commercially available multiprocessors,
parallel algorithm paradigms and complexity,
parallel programming languages, environments and tools, and
an introduction to scientific applications that are often
run on supercomputers.
Instructor:
Larry Carter.
Class times: Tuesdays and Thursdays, 9:35-10:55,
Room 2209 Warren Lecture Hall.
Office hours: Monday and Wednesdays, 10:00-11:00 or by
appointment (or drop by). My office is AP&M 4101.
Related material
The UltraSPARC User's Manual for the processor
in the Sun Enterprise 10000 used in our project.
Ian Foster's on-line textbook,
Designing and Building Parallel Programs.
A listing of some
supercomputers.
An overview of research into using
object-oriented
languages and tools for parallel computation, compiled by Dennis
Gannon. Note that this is from 1995, and so (for instance) doesn't
mention anything about High-Performance Java efforts (such as
Jalapeño and Titanium). See Angela Molnar's mini-project
on what has happened with some of these efforts.
Slides used in a tutorial on single-processor optimization in
PostScript or
PDF format.