Disks, Filesystems

February 28, 2007

Leftovers

  1. What's a TLB? What are they good for?
  2. Suppose I have four physical pages of memory, and I am doing LRU page replacement. What sequence of virtual page accesses will cause a page fault on every access?
  3. Is there any point to putting 8GB RAM in a 32-bit machine?
  4. When the processor is running user programs, the processor generates virtual addresses that must be translated to physical addresses before going to memory. How can processor caches fit into the picture? What are the tradeoffs involved?

Lecture Review

  1. File Systems
  2. FFS, LFS, and RAID

Project 2

Questions?

Disks

Disks are pathetically slow compared to processors and memory. Remember that disks have moving parts. Processors and memory work by pushing electrons through wires, but disks work by physically moving around little pieces of metal.

When we read or write to a disk, we need to wait for three things:

Seek
We need to move the read/write head to the desired track.
Rotation
We need to wait for the desired part of the desired track to rotate under the read/write head.
Transfer
We need to wait for the read/write head to complete the read/write operation.

Seeks take a lot of time. Remember, moving parts. An average seek time is around 5-10 ms.

Rotation delays also take a lot of time, but they're generally not as bad as seek times. For example, suppose I have a hard drive that spins at 7200 RPM. What is the average rotation delay? Well, I can do some simple math to figure out how long a full rotation will take:

     1 min            60 sec     1000 ms        8.3 ms
------------------ * -------- * --------- = --------------
 7200 revolutions     1 min       1 sec      1 revolution

If I am randomly reading data from a track, I will have to wait half a rotation on average, which works out to ~4 ms. If I have a 15,000 RPM hard drive, I will spend about half as long waiting for rotation delays.
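The same arithmetic can be sketched in a few lines of Python. This is just a calculator for the formula above; the function name `avg_rotation_delay_ms` is made up for illustration.

```python
def avg_rotation_delay_ms(rpm: float) -> float:
    """Average rotational delay in ms: half of one full revolution."""
    ms_per_revolution = 60_000.0 / rpm  # 60 sec/min * 1000 ms/sec
    return ms_per_revolution / 2.0


for rpm in (7200, 15000):
    print(f"{rpm} RPM: {avg_rotation_delay_ms(rpm):.2f} ms average rotation delay")
```

For 7200 RPM this gives about 4.17 ms, and for 15,000 RPM it gives 2.00 ms.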

Transfer times are fast compared to seek and rotation delays. Transfer times are directly related to hard drive density (how many bits can you fit in a square inch?), and history has shown that it's easier to make denser hard drives than hard drives that spin faster or seek faster. An average hard drive can read at around 50 MB/s (once the seek and rotation delays are over). If a file system data block is 4 KB, how long will it take to read the block?

 4 KBytes       1 MByte         1 sec       1000 ms     0.08 ms
---------- * ------------- * ----------- * --------- = ---------
 1 block      1024 KBytes     50 MBytes      1 sec      1 block
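To see how the three delays stack up for one random block read, here is a rough sketch. The 8 ms seek time is an assumption picked from the 5-10 ms range above, and the function names are hypothetical.

```python
def transfer_time_ms(block_bytes: int, mb_per_sec: float) -> float:
    """Time to transfer one block, using 1 MB = 1024 * 1024 bytes."""
    bytes_per_sec = mb_per_sec * 1024 * 1024
    return block_bytes / bytes_per_sec * 1000.0


def random_read_ms(seek_ms: float, rpm: float,
                   block_bytes: int, mb_per_sec: float) -> float:
    """Seek + average rotation delay (half a revolution) + transfer."""
    rotation_ms = 60_000.0 / rpm / 2.0
    return seek_ms + rotation_ms + transfer_time_ms(block_bytes, mb_per_sec)


transfer = transfer_time_ms(4096, 50)
total = random_read_ms(seek_ms=8.0, rpm=7200, block_bytes=4096, mb_per_sec=50)
print(f"transfer: {transfer:.3f} ms, total: {total:.2f} ms")
print(f"transfer is {transfer / total:.1%} of the total")
```

Under these assumptions the transfer is well under 1% of the total time, which is why filesystems work so hard to avoid seeks.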

Filesystems

Disks are block devices. They only know how to deal with blocks of data. So to use a disk, you say things like "read block number 123", or "write this data to block number 700". This level of access is far too basic for most users, which is why we have filesystems. Filesystems build the abstractions of files and directories on top of the raw block storage provided by disks.

A typical UNIX filesystem consists of two key components:

Superblock
The superblock represents a filesystem. It contains metadata about the filesystem, and, most importantly, a pointer to the root directory. Once we know where the root directory is, we can find any file in the filesystem by traversing more directories. The superblock is like the head node in a linked list or tree.
inodes
inodes represent files. Each inode contains metadata about its file, such as file size, owner, permissions, change/modify/access times, and pointers to the file's data blocks. Since a file can contain a large number of data blocks, inodes use trees of pointers to point to data blocks in large files.

Specifically, the inode contains direct pointers to the file's first 12 data blocks. After that, there is a "single indirect" pointer that points to a second-level block, which is just a data block full of pointers to more of the file's data blocks.

When those data block pointers are exhausted, the inode has a "double indirect" pointer that has two levels of data block pointers. When those pointers are exhausted, the inode has a final "triple indirect" pointer that provides three levels of data block pointers.
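As a rough illustration of how these levels get used in order, the sketch below maps a file's block index (0-based) to the pointer level that covers it. The default of 1024 pointers per indirect block is an assumption (it corresponds to 4 KB blocks holding 4-byte block numbers), and `pointer_level` is a made-up name, not a real kernel function.

```python
NUM_DIRECT = 12  # direct pointers in the inode, as described above


def pointer_level(block_index: int, ptrs_per_block: int = 1024) -> str:
    """Return which inode pointer reaches the file's block_index-th block."""
    if block_index < 0:
        raise ValueError("block index must be non-negative")
    if block_index < NUM_DIRECT:
        return "direct"
    block_index -= NUM_DIRECT
    if block_index < ptrs_per_block:
        return "single indirect"
    block_index -= ptrs_per_block
    if block_index < ptrs_per_block ** 2:
        return "double indirect"
    block_index -= ptrs_per_block ** 2
    if block_index < ptrs_per_block ** 3:
        return "triple indirect"
    raise ValueError("block index is beyond the triple indirect range")


for index in (0, 11, 12, 1035, 1036):
    print(index, pointer_level(index))
```

Each extra level costs one more disk read to follow the pointer chain, so blocks near the start of a file are the cheapest to reach.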

Suppose my block size is 4K and data block numbers are 32 bits. If I only use an inode's direct pointers, what is the maximum file size? What if I only use direct and single-indirect?
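One way to check your answer is to count reachable data blocks. The sketch below assumes 12 direct pointers and that a 32-bit block number takes exactly 4 bytes, so a 4K block holds 1024 pointers; `max_file_bytes` is a hypothetical helper.

```python
BLOCK_SIZE = 4 * 1024                       # 4K blocks
POINTER_SIZE = 4                            # 32-bit block numbers
NUM_DIRECT = 12                             # direct pointers in the inode
PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE


def max_file_bytes(indirect_levels: int) -> int:
    """Max file size using the direct pointers plus indirect pointers
    up to the given level (0 = direct only, 1 = + single indirect, ...)."""
    blocks = NUM_DIRECT + sum(PTRS_PER_BLOCK ** level
                              for level in range(1, indirect_levels + 1))
    return blocks * BLOCK_SIZE


labels = ["direct only", "+ single indirect",
          "+ double indirect", "+ triple indirect"]
for levels, label in enumerate(labels):
    print(f"{label}: {max_file_bytes(levels) / 1024:.0f} KB")
```

With these numbers, direct pointers alone cover 48 KB, and adding the single indirect pointer brings that to 4144 KB (a little over 4 MB).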

Why do we bother with these direct, single-indirect, double-indirect, and triple-indirect pointers? Why not just have everything triple-indirect?

So we use a superblock to represent a filesystem, and we use inodes to represent files. How do we represent directories? Conceptually, a directory is just a data structure that maps filenames to inodes. For example:

Filename    inode
foo         775
bar         325
...         ...

It turns out a directory is just a file that has a special "this-is-a-directory" bit set in its inode. So what does a directory look like on the inside? Well, a directory is just a file, and files are just strings of bytes. So a directory looks like you'd imagine, with the following data laid out linearly:

inode number
length of filename
filename

inode number
length of filename
filename

...

So a directory is just this linear string of bytes that says how filenames map to inodes. To make a directory, we create a string of bytes that indicates how the files in the directory map to inodes, put the string of bytes into a file, and set the special "this-is-a-directory" bit on the inode.
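Here is a minimal sketch of that idea: pack (inode number, filename length, filename) records into a byte string, then scan it linearly to look up a name. The field widths (4-byte inode number, 2-byte length, little-endian) are assumptions for illustration, not the layout of any particular filesystem, and `pack_directory` and `lookup` are made-up names.

```python
import struct

# Assumed record header: 4-byte inode number, then 2-byte filename length.
HEADER = struct.Struct("<IH")


def pack_directory(entries):
    """Serialize (filename, inode number) pairs into one byte string."""
    out = bytearray()
    for name, inode in entries:
        raw = name.encode("utf-8")
        out += HEADER.pack(inode, len(raw)) + raw
    return bytes(out)


def lookup(directory: bytes, wanted: str):
    """Scan the directory bytes linearly; return the inode number or None."""
    target = wanted.encode("utf-8")
    offset = 0
    while offset < len(directory):
        inode, name_len = HEADER.unpack_from(directory, offset)
        offset += HEADER.size
        name = directory[offset:offset + name_len]
        offset += name_len
        if name == target:
            return inode
    return None


data = pack_directory([("foo", 775), ("bar", 325)])
print(lookup(data, "bar"))   # 325
print(lookup(data, "baz"))   # None
```

Notice that lookup is a linear scan, so finding a name in a directory with many entries means reading through many records.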

Questions

  1. Suppose I want to read /etc/motd. What does the filesystem need to do?
  2. Suppose I want to run ls on the current directory. What does the filesystem need to do? What about ls -l?
  3. On UNIX systems with thousands of users, home directories are often not of the form /home/jeremy, but instead something like /home/j/je/jeremy. Why is this done?
  4. What are the differences between hard links and symbolic links?
  5. [from Michael] Many Unix systems allow multiple hard links to files, but do not allow hard links to be created to directories. Why do you think this is?