CSE 221: System Measurement Project
Fall 2009
Deadlines
- Draft of Intro, Machine Description, and CPU Operations:
Tuesday, October 27 at 3pm (start of class)
- Draft of Memory Operations:
Thursday, November 12 at 3pm (start of class)
- Final report with all measurements plus code:
Wednesday, December 9 at noon
Overview
In building an operating system, it is important to be able to
determine the performance characteristics of underlying hardware
components (CPU, RAM, disk, network, etc.), and to understand how
their performance influences or constrains operating system services.
Likewise, in building an application, one should understand the
performance of the underlying hardware and operating system, and how
they relate to the user's subjective sense of that application's
"responsiveness". While some of the relevant quantities can be found
in specs and documentation, many must be determined experimentally.
While some values may be used to predict others, the relations between
lower-level and higher-level performance are often subtle and
non-obvious.
In this project, you will create, justify, and apply a set of
experiments to a system to characterize and understand its
performance. In addition, you may explore the relations between some
of these quantities. In doing so, you will study how to use
benchmarks to usefully characterize a complex system. You should also
gain an intuitive feel for the relative speeds of different basic
operations, which is invaluable in identifying performance bottlenecks.
You have complete choice over the operating system and hardware
platform for your measurements. You can use the laptop you are
comfortable with, an operating system running in a virtual machine
monitor, or even a supercomputer.
You may work either alone or in two-person or three-person groups.
Groups do the same project as individuals. All members receive the
same grade. Note that working in groups may or may not make the
project easier, depending on how the group interactions work out. If
collaboration issues arise, contact me as soon as possible:
flexibility in dealing with such issues decreases as the deadline
approaches.
This project has two parts. First, you will implement and perform
a series of experiments. Second, you will write a report documenting
the methodology and results of your experiments. When you finish, you
will submit your report as well as the code used to perform your
experiments.
Report
Your report will have a number of sections including an
introduction, a machine description, and descriptions and discussions
of your experiments.
1) Introduction
Describe the goals of the project and, if you are in a group, who
performed which experiments. State the language you used to implement
your measurements, and the compiler version and optimization settings
you used to compile your code. Estimate the amount of time you spent
on this project.
2) Machine Description
Your report should contain a reasonably detailed description of the
test machine(s). The relevant information should be available either
from the system (e.g., sysctl on BSD, /proc on Linux, System Profiler
on Mac OS X) or online. Gathering this information
should not require much work, but in explaining and analyzing your
results you will find these numbers useful. You should report at
least the following quantities:
- Processor: model, cycle time, cache sizes (L1, L2, instruction,
data, etc.)
- Memory bus
- I/O bus
- RAM size
- Disk: capacity, RPM, controller cache size
- Network card speed
- Operating system (including version/release)
3) Experiments
Perform your experiments by following these steps:
- Estimate the base hardware performance of the operation and cite
the source you used to determine this quantity (system info, a
particular document). For example, when measuring disk read
performance for a particular size, you can refer to the disk
specification (easily found online) to determine seek, rotation, and
transfer performance. Based on these values, you can estimate the
average time to read a given amount of data from the disk assuming no
software overheads.
- Make a guess as to how much overhead the OS will add to the base
hardware performance. For a disk read, this will include the system
call, arranging the read I/O operation, handling the completed read,
and copying the data read into the user buffer. We will not grade you
on your guess; it is for you to test your intuition. (Obviously you
can do this after performing the experiment to derive an accurate
"guess", but where's the fun in that?)
- Combine the base hardware performance and your estimate
of software overhead into an overall prediction of performance.
- Implement and perform the measurement. In all cases, you should
run your experiment multiple times, for long enough to obtain
repeatable measurements, and average the results. Also compute the
standard deviation across the measurements.
- Use a low-overhead mechanism for reading timestamps. All modern
processors have a cycle counter that applications can read using a
special instruction (e.g., RDTSC).
Searching for "RDTSC" in Google, for instance, will provide you with a
plethora of additional examples. A minimal timing-harness sketch
follows this list.
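To make the last two steps concrete, here is a minimal timing-harness
sketch, assuming C on an x86 machine compiled with gcc (the rdtsc()
helper, trial count, and output format are illustrative choices, not a
required API). It times back-to-back reads of the cycle counter and
reports the mean and standard deviation, which is itself the
measurement-overhead experiment listed under Operations below.

    #include <math.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative helper: read the x86 cycle counter. On other
       architectures, substitute the equivalent counter; note that
       out-of-order execution can blur very short intervals. */
    static inline uint64_t rdtsc(void)
    {
        uint32_t lo, hi;
        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
    }

    #define TRIALS 10000

    int main(void)
    {
        static double samples[TRIALS];
        double sum = 0.0, sumsq = 0.0;

        /* Time back-to-back counter reads: the difference is the
           overhead of reading the timer itself. */
        for (int i = 0; i < TRIALS; i++) {
            uint64_t start = rdtsc();
            uint64_t end = rdtsc();
            samples[i] = (double)(end - start);
        }

        for (int i = 0; i < TRIALS; i++) {
            sum += samples[i];
            sumsq += samples[i] * samples[i];
        }
        double mean = sum / TRIALS;
        double stddev = sqrt(sumsq / TRIALS - mean * mean);
        printf("timer overhead: mean %.1f cycles, stddev %.1f\n",
               mean, stddev);
        return 0;
    }

Compile with something like gcc -O2 harness.c -lm, and report the
compiler version and optimization level in your Introduction.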
In your report:
- Clearly explain the methodology of your experiment.
- Present your results:
- For measurements of single quantities (e.g., system call
overhead), use a table to summarize your results. In the table
report the base hardware performance, your estimate of software
overhead, your prediction of operation time, and your measured
operation time.
- For measurements of operations as a function of some other
quantity, report your results as a graph with operation time on the
y-axis and the varied quantity on the x-axis. Include your estimates
of base hardware performance and overall prediction of operation time
as curves on the graph as well.
- Discuss your results:
- Cite the source for the base hardware performance.
- Compare the measured performance with the predicted performance.
If they are wildly different, speculate on reasons why. What
may be contributing to the overhead?
- Evaluate the success of your methodology. How accurate
do you think your results are?
- For graphs, explain any interesting features of the curves.
- Answer any questions specifically mentioned with the operation.
- At the end of your report, summarize your results in a table for
a complete overview. The columns in your table should include
"Operation", "Base Hardware Performance", "Estimated Software
Overhead", "Predicted Time", and "Measured Time". (Not required for
the draft.)
- State the units of all reported values.
Do not underestimate the time it takes to describe your methodology
and results.
4) Operations
- CPU, Scheduling, and OS Services
- Measurement overhead:
Report the overhead of reading the time (cf. the timing
sketch above), and report the overhead of using a loop
to measure many iterations of an operation.
- Procedure call overhead:
Report as a function of the number of integer arguments, from 0 to 7.
What is the incremental overhead of an argument?
- System call overhead:
Report the cost of a minimal system call. How does it
compare to the cost of a procedure call?
- Task creation time:
Report the time to create and run both a process and
a kernel thread. How do they compare?
- Context switch time:
Report the time to context switch from one process to
another, and from one kernel thread to another. How
do they compare? (A pipe-based ping-pong sketch for the
process case appears after this list.)
- Memory
- RAM access time: Report latency for individual integer
accesses to main memory and the L1 and L2 caches. Present
results as a graph with the x-axis as the log of the size
of the memory region accessed, and the y-axis as the
average latency. (In terms of the lmbench paper, measure
the "back-to-back-load" latency and report your results in
a graph similar to Fig. 1 in the paper. A pointer-chase
sketch appears after this list.)
- RAM bandwidth:
Report bandwidth for both reading and writing.
- Page fault service time:
Report the time for faulting an entire page from disk.
Dividing by the size of a page, how does it compare to the
latency of accessing a byte from main memory?
- Network
- Round trip time. Compare with the time to perform
a ping (ICMP requests are handled at the kernel level).
- Peak bandwidth.
- Connection overhead: Report setup and tear-down times.
Evaluate for the TCP protocol. (A connection-timing sketch
appears after this list.) For each quantity, compare both
remote and loopback interfaces. Comparing the remote and
loopback results, what can you deduce about baseline network
performance and the overhead of OS software? For both round
trip time and bandwidth, how close to ideal hardware
performance do you achieve? In describing your methodology
for the remote case, either provide a machine description
for the second machine (as above), or use two identical
machines.
- File System
- Size of file cache: Note that this may be very sensitive
to other load on the machine. Report results as a graph
whose x-axis is the size of the file being accessed and
the y-axis is the average read I/O time. Do not use
a system call or utility program to determine this metric
except to sanity-check your result.
- File read time: Report for both sequential and random access
as a function of file size. Discuss the sense in which
your "sequential" access might not be sequential. Ensure
that you are not measuring cached data (e.g., use the
raw device interface). Report your results as a log/log
graph with the file size on the x-axis and the average
per-block time on the y-axis. (An O_DIRECT-based sketch
appears after this list.)
- Remote file read time: Repeat the previous experiment for
a remote file system. What is the "network penalty" of
accessing files over the network?
- Contention: Report the average time to read one file
system block of data as a function of the number of
processes simultaneously performing the same operation on
different files on the same disk (and not in the file
buffer cache).
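For the context switch measurement above, one common approach is to
bounce a byte between a parent and child process through a pair of
pipes; each round trip forces two context switches. A minimal sketch,
assuming a UNIX-like system (ROUNDS and the use of clock_gettime()
instead of the cycle counter are illustrative choices; on older Linux
systems link with -lrt, and on a multiprocessor pin both processes to
one CPU so that a real switch occurs). The result is an upper bound:
it still includes pipe read/write overhead, which you should measure
separately and subtract.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <time.h>
    #include <unistd.h>

    #define ROUNDS 100000

    int main(void)
    {
        int p2c[2], c2p[2];
        char byte = 'x';
        struct timespec t0, t1;

        if (pipe(p2c) < 0 || pipe(c2p) < 0) { perror("pipe"); exit(1); }

        pid_t pid = fork();
        if (pid < 0) { perror("fork"); exit(1); }

        if (pid == 0) {                     /* child: echo each byte back */
            for (int i = 0; i < ROUNDS; i++) {
                read(p2c[0], &byte, 1);
                write(c2p[1], &byte, 1);
            }
            _exit(0);
        }

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < ROUNDS; i++) {  /* parent: ping, wait for pong */
            write(p2c[1], &byte, 1);
            read(c2p[0], &byte, 1);
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        wait(NULL);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9
                  + (t1.tv_nsec - t0.tv_nsec);
        /* Two switches per round trip; pipe overhead still included. */
        printf("%.0f ns per switch (upper bound)\n", ns / (2.0 * ROUNDS));
        return 0;
    }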
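For the RAM access time measurement, the standard back-to-back-load
technique chases a chain of dependent pointers so that each load must
wait for the previous one. A minimal sketch, assuming 64-byte cache
lines (the buffer size, iteration count, and fixed stride are
illustrative; a random permutation of the chain defeats hardware
prefetching more thoroughly than a fixed stride):

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main(int argc, char **argv)
    {
        size_t size = (argc > 1) ? (size_t)atol(argv[1]) : (1 << 20);
        size_t n = size / sizeof(void *);
        long iters = 10000000L;
        void **buf = malloc(n * sizeof(void *));
        struct timespec t0, t1;

        if (!buf) { perror("malloc"); return 1; }

        /* Chain through one pointer per 64-byte cache line. */
        size_t stride = 64 / sizeof(void *);
        for (size_t i = 0; i < n; i++)
            buf[i] = &buf[(i + stride) % n];

        void **p = buf;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long i = 0; i < iters; i++)  /* each load depends on the last */
            p = (void **)*p;
        clock_gettime(CLOCK_MONOTONIC, &t1);

        volatile void *sink = p;          /* keep the chase alive under -O2 */
        (void)sink;

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9
                  + (t1.tv_nsec - t0.tv_nsec);
        printf("%zu bytes: %.2f ns per load\n", size, ns / iters);
        free(buf);
        return 0;
    }

Sweep the buffer size from well below L1 to several times L2 and plot
nanoseconds per load against the log of the size; the plateaus should
correspond to the cache levels and main memory.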
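For the network connection overhead measurement, here is a minimal
client-side sketch, assuming a test server of your own listening on
the given host and port (the port number is a placeholder; this loop
also lumps setup and tear-down together, so time connect() and
close() separately in your real experiment):

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <time.h>
    #include <unistd.h>

    #define ROUNDS 100

    int main(int argc, char **argv)
    {
        const char *host = (argc > 1) ? argv[1] : "127.0.0.1";
        struct sockaddr_in addr;
        struct timespec t0, t1;

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(12345);   /* placeholder: your server's port */
        inet_pton(AF_INET, host, &addr.sin_addr);

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < ROUNDS; i++) {
            int s = socket(AF_INET, SOCK_STREAM, 0);
            if (s < 0) { perror("socket"); return 1; }
            if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) < 0)
                perror("connect");      /* setup */
            close(s);                   /* tear-down */
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9
                  + (t1.tv_nsec - t0.tv_nsec);
        printf("%.1f us per connect+close to %s\n",
               ns / 1000.0 / ROUNDS, host);
        return 0;
    }

Run it once against 127.0.0.1 and once against the remote machine to
obtain the loopback/remote comparison asked for above.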
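For the file read experiments, one way to ensure you are not measuring
cached data, assuming Linux (O_DIRECT is Linux-specific; on other
systems use the raw device interface mentioned above), is to bypass
the buffer cache with O_DIRECT, which requires block-aligned buffers,
offsets, and lengths:

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define BLOCK 4096

    int main(int argc, char **argv)
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s file\n", argv[0]);
            return 1;
        }

        int fd = open(argv[1], O_RDONLY | O_DIRECT);
        if (fd < 0) { perror("open"); return 1; }

        void *buf;
        if (posix_memalign(&buf, BLOCK, BLOCK) != 0) {
            fprintf(stderr, "posix_memalign failed\n");
            return 1;
        }

        /* Sequential case: time each read with your harness. For
           random access, lseek() to a random block-aligned offset
           before each read. */
        ssize_t n;
        while ((n = read(fd, buf, BLOCK)) > 0)
            ;
        if (n < 0) perror("read");

        free(buf);
        close(fd);
        return 0;
    }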
References
During the quarter you will have read a number of papers
describing various system measurements, including V, Sprite,
microkernels, Scheduler Activations, LRPC, LFS, and IO-Lite. You may
find these papers useful as references.
In addition, other papers you may find useful for help with system
measurement are:
- John K. Ousterhout, Why Aren't Operating Systems Getting Faster
as Fast as Hardware?, Proc. of the USENIX Summer Conference,
pp. 247-256, June 1990.
- J. Bradley Chen, Yasuhiro Endo, Kee Chan, David Mazières,
Antonio Dias, Margo Seltzer, and Michael D. Smith, The Measured
Performance of Personal Computer Operating Systems, Proc. of ACM
SOSP, pp. 299-313, December 1995.
- Larry McVoy and Carl Staelin, lmbench: Portable Tools for
Performance Analysis, Proc. of the USENIX Annual Technical
Conference, January 1996.
- Aaron B. Brown and Margo I. Seltzer, Operating System Benchmarking
in the Wake of lmbench: A Case Study of the Performance of NetBSD
on the Intel x86 Architecture, Proc. of ACM SIGMETRICS,
pp. 214-224, June 1997.
You may read these papers, or other references, for strategies on
performing measurements, but you may not examine code to copy or
replicate the implementation of a measurement. For example, reading
the lmbench paper is fine, but downloading and looking at the
lmbench code violates the intent of the project.
Finally, it goes almost without saying that you must implement all
of your measurements. You may not download a tool to perform the
measurements for you.
Grading
We will grade your project on the relative accuracy of your
measurement results (disk reads performing faster than the buffer
cache are a bad sign) as well as the quality of your report in terms
of methodology description (can we understand what you did and why?),
discussion of results (answering specific questions, discussing
unexpected behavior), and the writing (lazy writing will hurt your grade).
A frequent issue we have seen with past project reports is that
they do not clearly explain the reasoning behind the estimates,
methodology, results, etc. As a result, we cannot fully understand
what you did and why you did it that way. Be sure to explain your
reasoning, not just what you did.
As a first stage of the project, we would like you to submit an
early draft of the first part of the project. What should you cover
in the draft? The first two sections of the report (Introduction and
Machine Description) and the first set of operations (CPU, Scheduling,
and OS Services). For this step, submit only a draft of the report,
not your code.
What percentage of the project grade does it form? It will only
be worth 5% of your grade. Why so little? The idea with the initial
draft is that it is primarily for your own benefit: it will get you
started on the project early, and it will give you a sense for how
long it will take you to complete the project by the end of the
quarter (in the past, students have reported that it has taken them
40-120 hours on the project). As a result, you should be able to
better budget your time as the end of the quarter arrives. How rough
can the draft be? Your call; again, this is primarily for your
benefit.
In the second stage of the project, extend your report draft
with results for the second set of operations (Memory).
For the drafts, bring one hardcopy per group with you to class on
the deadline.
For the final project report, submit it as a PDF to the TA
via email, along with your code.
Also for the final project, we would like you to fill out an XML
template with your results and submit it along with the rest of your
project. The XML template can be found
here. It should only
take about 15 minutes to complete. Basically, you will need to input
machine information and the results from your experiments (i.e., the
value and units tags in the appropriate section of the XML file). The
notes tag is optional, and any results you do not have can be left
blank. You can find an example of a completed XML template
here. Also, feel free
to use this PHP form, which collects the results you input and
outputs a correctly formatted XML file in the browser for you to
save locally.