CSE 221: System Measurement Project

Winter 2005

Due:
   1) Intro, Machine Description, and CPU (draft): Thursday, March 3, 2005 in class
   2) Final report with all measurements plus code: Wednesday, March 16, 2005 at noon

Overview

In building an operating system, it is important to be able to determine the performance characteristics of underlying hardware components (CPU, RAM, disk, network, etc.), and to understand how their performance influences or constrains operating system services. Likewise, in building an application, one should understand the performance of the underlying hardware and operating system, and how they relate to the user's subjective sense of that application's "responsiveness". While some of the relevant quantities can be found in specs and documentation, many must be determined experimentally. While some values may be used to predict others, the relations between lower- and higher-level performance are often subtle and non-obvious.

In this project, you will create, justify, and apply a set of experiments to a system to characterize and understand its performance. In addition, you may explore the relations between some of these quantities. In doing so, you will study how to use benchmarks to usefully characterize a complex system. You should also gain an intuitive feel for the relative speeds of different basic operations, which is invaluable in identifying performance bottlenecks.

You may work either alone or in two-person groups. In groups, both members receive the same grade. If collaboration issues arise, contact me as soon as possible: flexibility in dealing with such issues decreases as the deadline approaches.

This project has two parts. First, you will implement and perform a series of experiments. Second, you will write a report documenting the methodology and results of your experiments. When you finish, you will submit your report as well as the code used to perform your experiments.

Report

Your report will have a number of sections including an introduction, a machine description, and descriptions and discussions of your experiments.

1) Introduction

Describe the goals of the project and, if you are in a group, who performed which experiments. State the language you used to implement your measurements, and the compiler version and optimization settings you used to compile your code. Estimate the amount of time you spent on this project.

2) Machine Description

Your report should contain a reasonably detailed description of the test machine(s). The relevant information should be available either from the system (e.g. sysctl on BSD, /proc on Linux, System Profiler on Mac OS X), or online. You will not be graded on this part, and it should not require much work, but in explaining and analyzing your results you will find these numbers useful. You should report at least the following quantities:
  1. Processor: model, cycle time, cache sizes (L1, L2, instruction, data, etc.).
  2. Memory bus.
  3. I/O bus.
  4. RAM size.
  5. Disk: capacity, RPM, controller cache size.
  6. Network card speed.
  7. Operating system (including version/release).

3) Experiments

Perform your experiments by following these steps:
  1. Estimate the base hardware performance of the operation and cite the source you used to determine this quantity (system info, a particular document). For example, when measuring disk read performance for a particular size, you can refer to the disk specification (easily found online) to determine seek, rotation, and transfer performance. Based on these values, you can estimate the average time to read a given amount of data from the disk assuming no software overheads.
  2. Make a guess as to how much overhead the OS will add to the base hardware performance. For a disk read, this will include the system call, arranging the read I/O operation, handling the completed read, and copying the data read into the user buffer. We will not grade you on your guess; it is for you to test your intuition. (Obviously you can do this after performing the experiment to derive an accurate "guess", but where's the fun in that?)
  3. Combine the base hardware performance and your estimate of software overhead into an overall prediction of performance.
  4. Implement and perform the measurement. In all cases, you should run your experiment multiple times, for long enough to obtain repeatable measurements, and average the results.
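
As a worked illustration of steps 1 through 3 for the disk read example, the sketch below turns specification values into a base hardware estimate and an overall prediction. The function names and every number in it (8.5 ms seek, 7200 RPM, 50 MB/s transfer, 0.05 ms of software overhead) are hypothetical placeholders, not figures from any particular drive; substitute the values from your own disk's specification and your own guess.

```python
def base_disk_read_ms(avg_seek_ms, rpm, transfer_mb_per_s, nbytes):
    """Estimate hardware-only time (ms) for one random read of nbytes."""
    # On average the platter turns half a revolution before the data arrives.
    avg_rotation_ms = 0.5 * 60_000.0 / rpm
    # Transfer time, treating 1 MB as 10**6 bytes.
    transfer_ms = nbytes / (transfer_mb_per_s * 1e6) * 1e3
    return avg_seek_ms + avg_rotation_ms + transfer_ms


def predicted_read_ms(base_ms, software_overhead_ms):
    """Step 3: overall prediction = base hardware time + guessed OS overhead."""
    return base_ms + software_overhead_ms


if __name__ == "__main__":
    # Hypothetical drive parameters; replace with values from a real spec sheet.
    base = base_disk_read_ms(8.5, 7200, 50.0, 64 * 1024)
    print(f"base hardware estimate: {base:.2f} ms")
    print(f"overall prediction:     {predicted_read_ms(base, 0.05):.2f} ms")
```

The estimate assumes one average seek and half a rotation per read, which is reasonable for an isolated random read but pessimistic for sequential access.
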
In your report:
  1. Clearly explain the methodology of your experiment.
  2. Present your results:
    1. For measurements of single quantities (e.g., system call overhead), use a table to summarize your results. In the table report the base hardware performance, your estimate of software overhead, your prediction of operation time, and your measured operation time.
    2. For measurements of operations as a function of some other quantity, report your results as a graph with operation time on the y-axis and the varied quantity on the x-axis. Include your estimates of base hardware performance and overall prediction of operation time as curves on the graph as well.
  3. Discuss your results:
    1. Cite the source for the base hardware performance.
    2. Compare the measured performance with the predicted performance. If they are wildly different, speculate on reasons why. What may be contributing to the overhead?
    3. Evaluate the success of your methodology. How accurate do you think your results are?
    4. For graphs, explain any interesting features of the curves.
    5. Answer any questions specifically mentioned with the operation.
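
One possible way to collect repeated trials and summarize them for the results table is sketched below. The helper names (time_trials, summarize) and the default trial counts are assumptions made for illustration. Reporting the standard deviation alongside the mean gives you something concrete to point to when you discuss how accurate your results are.

```python
import statistics
import time


def time_trials(operation, trials=10, iterations=1000):
    """Return per-operation times in nanoseconds, one sample per trial."""
    samples = []
    for _ in range(trials):
        start = time.perf_counter_ns()
        for _ in range(iterations):
            operation()
        elapsed = time.perf_counter_ns() - start
        samples.append(elapsed / iterations)
    return samples


def summarize(samples, predicted_ns):
    """Mean, sample standard deviation, and ratio of measured to predicted."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return {
        "mean_ns": mean,
        "stdev_ns": stdev,
        "measured_over_predicted": mean / predicted_ns,
    }


if __name__ == "__main__":
    # Placeholder operation and prediction, only to show the call pattern.
    print(summarize(time_trials(lambda: None), predicted_ns=50.0))
```

Each sample includes the cost of the timing loop itself, so for very short operations you would also time an empty loop and subtract it.
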

Do not underestimate the time it takes to describe your methodology and results.

4) Operations

  1. CPU, Scheduling, and OS services
    1. Procedure call overhead: Report as a function of the number of integer arguments from 0 to 7. What is the incremental overhead of an argument?
    2. System call overhead: Report the cost of a minimal system call. How does it compare to the cost of a procedure call?
    3. Task creation time: Report the time to create and run both a process and a kernel thread. How do they compare?
    4. Context switch time: Report the time to context switch from one process to another, and from one kernel thread to another. How do they compare?
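
One approach sometimes used for process context switches, shown here only as a sketch and not as the required method, is to bounce a single byte between two processes over a pair of pipes. The function name and round-trip count are hypothetical. The sketch assumes a Unix-like system (it uses os.fork), and in Python the interpreter adds noticeable overhead to every iteration, so treat it as an outline of the structure rather than a source of trustworthy numbers.

```python
import os
import time


def pipe_ping_pong_us(round_trips=10_000):
    """Average microseconds per one-byte round trip between two processes."""
    to_child_r, to_child_w = os.pipe()
    to_parent_r, to_parent_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: echo each byte back, then exit without running parent cleanup.
        try:
            for _ in range(round_trips):
                os.write(to_parent_w, os.read(to_child_r, 1))
        finally:
            os._exit(0)
    start = time.perf_counter_ns()
    for _ in range(round_trips):
        os.write(to_child_w, b"x")
        os.read(to_parent_r, 1)  # blocks until the child has run and replied
    elapsed_ns = time.perf_counter_ns() - start
    os.waitpid(pid, 0)
    for fd in (to_child_r, to_child_w, to_parent_r, to_parent_w):
        os.close(fd)
    return elapsed_ns / round_trips / 1e3


if __name__ == "__main__":
    print(f"{pipe_ping_pong_us():.2f} us per round trip")
```

A round trip includes four pipe system calls as well as the switches themselves, so the pipe cost has to be measured separately and subtracted. On a multiprocessor the two processes may also run on different CPUs unless you pin them to the same one (on Linux, for example, with os.sched_setaffinity).
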

  2. Memory
    1. RAM access time: Report latency for integer accesses to main memory and the L1 and L2 caches.
    2. RAM bandwidth: Report bandwidth for both reading and writing.
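
For access latency, a common technique (an assumption here, not something this assignment prescribes) is to chase a chain of dependent loads through an array in random order, so that each access must complete before the next can begin and hardware prefetching is of little help. The sketch below shows only how such a chain might be built; the function names are hypothetical. The timed loop itself belongs in a compiled language, since in Python the interpreter would dominate the cost of each access.

```python
import random


def single_cycle_permutation(n, seed=0):
    """Return next_index such that following it from any slot visits all n slots.

    Uses Sattolo's algorithm, which produces a random permutation
    consisting of exactly one cycle.
    """
    rng = random.Random(seed)
    next_index = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, never i itself
        next_index[i], next_index[j] = next_index[j], next_index[i]
    return next_index


def chase(next_index, steps, start=0):
    """Follow the chain; each lookup depends on the result of the previous one."""
    position = start
    for _ in range(steps):
        position = next_index[position]
    return position


if __name__ == "__main__":
    chain = single_cycle_permutation(16)
    print(chain, chase(chain, 16))
```

Varying the array size so that it fits in the L1 cache, then the L2 cache, then neither, is what would separate the three latencies in a plot of time per access against array size.
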

  3. Network
    1. Round trip time.
    2. Peak bandwidth.
    3. Connection overhead: Report setup and tear-down.

    Evaluate for the TCP protocol. For each quantity, compare both remote and loopback interfaces. Comparing the remote and loopback results, what can you deduce about baseline network performance and the overhead of OS software? For both round trip time and bandwidth, how close to ideal hardware performance do you achieve? In describing your methodology for the remote case, either provide a machine description for the second machine (as above), or use two identical machines.
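
A minimal sketch of the loopback round trip measurement follows, assuming a one-byte message and an echo server running in a thread of the same process. The function names and the choice to disable Nagle's algorithm with TCP_NODELAY are illustrative assumptions. For the remote case the echo side would run on the second machine instead.

```python
import socket
import threading
import time


def _echo_once(listener):
    """Accept one connection and echo bytes until the peer closes."""
    conn, _ = listener.accept()
    with conn:
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        while True:
            data = conn.recv(1)
            if not data:
                break
            conn.sendall(data)


def loopback_rtt_us(round_trips=1000):
    """Average microseconds for a one-byte TCP round trip over 127.0.0.1."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=_echo_once, args=(listener,), daemon=True)
    server.start()

    with socket.create_connection(("127.0.0.1", port)) as client:
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter_ns()
        for _ in range(round_trips):
            client.sendall(b"x")
            client.recv(1)
        elapsed_ns = time.perf_counter_ns() - start
    server.join()
    listener.close()
    return elapsed_ns / round_trips / 1e3


if __name__ == "__main__":
    print(f"{loopback_rtt_us():.2f} us per round trip")
```

Connection setup and tear-down are deliberately kept outside the timed region here; timing the connect and close calls on their own would be the starting point for the connection overhead measurement.
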

  4. File System
    1. Size of file cache: Note that this may be very sensitive to other load on the machine.
    2. File read time: Report for both sequential and random access as a function of file size. Discuss the sense in which your "sequential" access might not be sequential. Ensure that you are not measuring cached data.
    3. Remote file read time: Repeat the previous experiment for a remote file system. What is the "network penalty" of accessing files over the network?
    4. Contention: Report the average time to read one file system block of data as a function of the number of processes simultaneously performing the same operation on different files on the same disk (and not in the file buffer cache).
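
The difference between sequential and random access can be sketched as below, reading the same set of blocks once in order and once in shuffled order. The 4096-byte block size, the function names, and the scratch-file approach are assumptions, and the code needs a Unix-like system for os.pread. Note carefully that a file that has just been written is almost certainly still in the file cache, so as written this measures cached reads; a real experiment must take steps to defeat the cache.

```python
import os
import random
import tempfile
import time

BLOCK_SIZE = 4096  # assumed block size; check your file system's actual value


def read_blocks_us(path, offsets):
    """Average microseconds per block read at the given byte offsets."""
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter_ns()
        for offset in offsets:
            os.pread(fd, BLOCK_SIZE, offset)
        elapsed_ns = time.perf_counter_ns() - start
    finally:
        os.close(fd)
    return elapsed_ns / len(offsets) / 1e3


def sequential_and_random_us(num_blocks=256, seed=0):
    """Create a scratch file, then time in-order and shuffled block reads."""
    with tempfile.NamedTemporaryFile(delete=False) as scratch:
        scratch.write(os.urandom(BLOCK_SIZE * num_blocks))
        path = scratch.name
    try:
        offsets = [i * BLOCK_SIZE for i in range(num_blocks)]
        sequential = read_blocks_us(path, offsets)
        random.Random(seed).shuffle(offsets)
        shuffled = read_blocks_us(path, offsets)
    finally:
        os.unlink(path)
    return sequential, shuffled


if __name__ == "__main__":
    print(sequential_and_random_us())
```

Logically consecutive offsets within a file need not map to physically consecutive blocks on disk, which is one sense in which the "sequential" case may not be sequential.
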

References

During the quarter you have read a number of papers describing various system measurements, including V, Sprite, microkernels, Scheduler Activations, LRPC, LFS, and IO-Lite. You may find these papers useful as references.

In addition, other papers you may find useful for help with system measurement are:

You may read these papers, or other references, for strategies on performing measurements, but you may not examine code to copy or replicate the implementation of a measurement. For example, reading the lmbench paper is fine, but downloading and looking at the lmbench code violates the intent of the project.

Finally, it goes almost without saying that you must implement all of your measurements. You may not download a tool to perform the measurements for you.

Grading

We will grade your project on the relative accuracy of your measurement results (disk reads performing faster than the buffer cache are a bad sign) as well as the quality of your report in terms of methodology description (can we understand what you did and why?), discussion of results (answering specific questions, discussing unexpected behavior), and the writing (lazy writing will hurt your grade).


voelker@cs.ucsd.edu