Final Review

March 14, 2007

Lecture Review

  1. Virtual Machine Monitors
  2. Internet Outbreaks

Project 3

Questions? There seems to be some confusion about how to support command line arguments with demand paging. We can go over that if you like.

RAID

RAID takes a bunch of disks and uses them to create one big virtual disk. Depending on the specific technique used, this can result in improved reliability, performance, or both.

The different kinds of RAID are named with numbered "levels". Very important: higher RAID levels are not necessarily better. The different RAID levels split data across multiple disks in different ways, and they each have advantages and disadvantages. This means that RAID 5 is not always better than RAID 0. Whether it is better or not depends entirely on how the disks will be used: how much do we care about performance? How much disk space are we willing to sacrifice for reliability?

That said, here are four popular RAID configurations, and what they do:

RAID 0
Stripe data across all the disks in the array. If we have two disks, for example, we would put all the even-numbered blocks on one disk, and all the odd-numbered blocks on the other disk.
RAID 1
Mirror data across all disks in the array. Every disk in the array has a copy of every data block.
RAID 4
Requires at least three disks. If we have three disks, we stripe data across the first two (like RAID 0), and each block on the third disk contains the bitwise XOR of the corresponding blocks on the first two disks. This third disk is called the parity disk.
RAID 5
RAID 5 is an extension of RAID 4. RAID 4 creates a bottleneck at the parity disk, because every write needs to update the parity disk, even if the writes are going to completely different data blocks. RAID 5 avoids this by not using a dedicated parity disk; instead, the parity blocks are spread across all disks.

For example, with three disks, we can put the first and second data blocks on the first two disks, and the parity for those two blocks on the third disk. The third and fourth data blocks can then be placed on the last two disks, with their parity on the first disk.
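To make the striping and parity placement concrete, here is a minimal Python sketch. The function names (raid0_location, xor_blocks, raid5_stripe) are made up for illustration, and the rotation rule in raid5_stripe is just one choice that happens to match the three-disk example above; real RAID 5 implementations use a variety of layouts.

```python
from functools import reduce


def raid0_location(block, num_disks):
    """Map a logical block number to (disk, stripe) under simple striping."""
    return block % num_disks, block // num_disks


def xor_blocks(*blocks):
    """Bitwise XOR of equal-length byte blocks (how a parity block is computed)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def raid5_stripe(stripe, num_disks):
    """Return (parity_disk, data_disks) for a stripe.

    Assumed rotation: parity starts on the last disk and moves one disk
    over (wrapping around) with each stripe.
    """
    parity_disk = (num_disks - 1 + stripe) % num_disks
    data_disks = [d for d in range(num_disks) if d != parity_disk]
    return parity_disk, data_disks


if __name__ == "__main__":
    # RAID 0 with two disks: even blocks on disk 0, odd blocks on disk 1.
    for block in range(4):
        print("block", block, "->", raid0_location(block, 2))

    # Parity for one stripe of two data blocks.
    d0 = b"\x0f\xf0"
    d1 = b"\x33\x55"
    parity = xor_blocks(d0, d1)

    # XORing the surviving blocks gives back the block that was left out.
    print(xor_blocks(d0, parity) == d1)

    # RAID 5 with three disks: where the parity lives for each stripe.
    for stripe in range(3):
        print("stripe", stripe, "->", raid5_stripe(stripe, 3))
```

Note that raid5_stripe never sends two consecutive stripes' parity to the same disk, which is the point of RAID 5: parity updates are spread over the whole array instead of landing on one dedicated disk.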

Homework 5 Pizza Monitor

I'll spend some time going over the pizza monitor.

Review

We can go over the sample final if you like.

Questions

  1. How can we dynamically change a virtual machine's memory allocation with balloon drivers?
  2. How do RAID 4 and 5 recover data using the parity blocks?
  3. How do read and write performance compare across different RAID levels?
  4. Why do we need to worry about metadata consistency?
  5. What are the differences between hard links and symbolic links?
  6. [from Michael] Many Unix systems allow multiple hard links to files, but do not allow hard links to be created to directories. Why do you think this is?