Disks are pathetically slow compared to processors and memory. Remember that disks have moving parts. Processors and memory work by pushing electrons through wires, but disks work by physically moving around little pieces of metal.
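When we read from or write to a disk, we need to wait for three things: the seek (moving the disk head to the right track), the rotation delay (waiting for the right sector to spin under the head), and the transfer (actually reading or writing the data).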
Seeks take a lot of time. Remember, moving parts. An average seek time is around 5-10 ms.
Rotation delays also take a lot of time, but they're generally not as bad as seek times. For example, suppose I have a hard drive that spins at 7200 RPM. What is the average rotation delay? Well, I can do some simple math to figure out how long a full rotation will take:
\[
\frac{1\ \text{min}}{7200\ \text{revolutions}} \times \frac{60\ \text{sec}}{1\ \text{min}} \times \frac{1000\ \text{ms}}{1\ \text{sec}} \approx \frac{8.3\ \text{ms}}{1\ \text{revolution}}
\]
If I am randomly reading data from a track, I will have to wait half a rotation on average, which works out to ~4 ms. If I have a 15,000 RPM hard drive, I will spend about half as long waiting for rotation delays.
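The same arithmetic can be written as a small Python sketch. The function name is made up for illustration; the only inputs are the spindle speeds mentioned above.

```python
def average_rotation_delay_ms(rpm: float) -> float:
    """Average wait for the target sector: half of one full rotation."""
    full_rotation_ms = 60_000.0 / rpm  # milliseconds per minute / revolutions per minute
    return full_rotation_ms / 2.0


for rpm in (7200, 15000):
    delay = average_rotation_delay_ms(rpm)
    print(f"{rpm} RPM: {delay:.2f} ms average rotation delay")
```

At 7200 RPM this gives roughly 4.2 ms, and at 15,000 RPM it gives 2 ms, which matches the "about half as long" estimate.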
Transfer times are fast compared to seek and rotation delays. Transfer times are directly related to hard drive density (how many bits can you fit in a square inch?), and history has shown that it's easier to make denser hard drives than hard drives that spin or seek faster. An average hard drive can read at around 50 MB/s (once the seek and rotation delays are over). If a file system data block is 4 KB, how long will it take to read the block?
\[
\frac{4\ \text{KBytes}}{1\ \text{block}} \times \frac{1\ \text{MByte}}{1024\ \text{KBytes}} \times \frac{1\ \text{sec}}{50\ \text{MBytes}} \times \frac{1000\ \text{ms}}{1\ \text{sec}} \approx \frac{0.08\ \text{ms}}{1\ \text{block}}
\]
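To see how the three delays compare, here is a rough sketch that adds them up for one random 4 KB read. The 7.5 ms seek is an assumption (the midpoint of the 5-10 ms range quoted earlier), and the function names are invented for this example.

```python
def transfer_time_ms(block_kbytes: float, throughput_mb_per_s: float) -> float:
    """Time to move one block once the head is already in position."""
    return block_kbytes / 1024.0 / throughput_mb_per_s * 1000.0


def random_read_time_ms(seek_ms: float, rpm: float,
                        block_kbytes: float, throughput_mb_per_s: float) -> float:
    """Seek + average rotation delay (half a rotation) + transfer."""
    rotation_ms = 60_000.0 / rpm / 2.0
    return seek_ms + rotation_ms + transfer_time_ms(block_kbytes, throughput_mb_per_s)


transfer = transfer_time_ms(4, 50)
total = random_read_time_ms(seek_ms=7.5, rpm=7200, block_kbytes=4, throughput_mb_per_s=50)
print(f"transfer: {transfer:.3f} ms, total: {total:.2f} ms")
print(f"transfer is {transfer / total:.1%} of the total")
```

Under these assumptions the transfer is well under 1% of the total time; almost all of the wait is the seek and the rotation delay.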
Disks are block devices. They only know how to deal with blocks of data. So to use a disk, you say things like "read block number 123", or "write this data to block number 700". This level of access is far too basic for most users, which is why we have filesystems. Filesystems build the abstractions of files and directories on top of the raw block storage provided by disks.
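As a toy illustration of what "block device" means, here is a minimal in-memory sketch. The class name, the method names, and the 4096-byte block size are all assumptions for the example; a real disk driver looks nothing like this internally, but the interface has the same shape.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


class ToyBlockDevice:
    """An in-memory stand-in for a disk: whole blocks in, whole blocks out."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self._data = bytearray(num_blocks * BLOCK_SIZE)

    def _check(self, block_number: int) -> None:
        if not 0 <= block_number < self.num_blocks:
            raise IndexError(f"no such block: {block_number}")

    def read_block(self, block_number: int) -> bytes:
        self._check(block_number)
        start = block_number * BLOCK_SIZE
        return bytes(self._data[start:start + BLOCK_SIZE])

    def write_block(self, block_number: int, data: bytes) -> None:
        self._check(block_number)
        if len(data) != BLOCK_SIZE:
            raise ValueError("a block device only accepts whole blocks")
        start = block_number * BLOCK_SIZE
        self._data[start:start + BLOCK_SIZE] = data


disk = ToyBlockDevice(num_blocks=1024)
disk.write_block(700, b"x" * BLOCK_SIZE)   # "write this data to block number 700"
print(disk.read_block(123)[:4])            # "read block number 123" (still all zeros)
```

Notice there is no notion of a filename or a directory anywhere in this interface; that is the gap a filesystem fills.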
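A typical UNIX filesystem consists of two key components: the superblock, which describes the filesystem as a whole, and inodes, which describe individual files and point to their data blocks.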
Specifically, the inode contains direct pointers to the file's first 12 data blocks. After that, there is a "single indirect" pointer that points to a second-level data block that contains pointers to more of the file's data blocks. This second-level data block is just a data block full of pointers to more of the file's data blocks.
When those data block pointers are exhausted, the inode has a "double indirect" pointer that has two levels of data block pointers. When those pointers are exhausted, the inode has a final "triple indirect" pointer that provides three levels of data block pointers.
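Here is a sketch of how a logical block index within a file maps onto this pointer scheme. It assumes 1024 pointers fit in one block (4 KB blocks with 32-bit block numbers); the function name is hypothetical. The return value is the number of pointer blocks that must be read before reaching the data block.

```python
DIRECT_POINTERS = 12


def indirection_levels(block_index: int, pointers_per_block: int = 1024) -> int:
    """Return 0 for a direct pointer, 1/2/3 for single/double/triple indirect."""
    if block_index < 0:
        raise ValueError("block index must be non-negative")
    if block_index < DIRECT_POINTERS:
        return 0
    remaining = block_index - DIRECT_POINTERS
    for level in (1, 2, 3):
        capacity = pointers_per_block ** level  # data blocks reachable at this level
        if remaining < capacity:
            return level
        remaining -= capacity
    raise ValueError("block index is beyond what the inode can address")


for index in (0, 11, 12, 1035, 1036, 1049612):
    print(f"file block {index}: {indirection_levels(index)} extra pointer block(s) to read")
```

Small files stay entirely within the direct pointers, so reaching their data costs no extra block reads beyond the inode itself.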
Suppose my block size is 4K and data block numbers are 32 bits. If I only use an inode's direct pointers, what is the maximum file size? What if I only use direct and single-indirect? Why do we bother with these direct, single-indirect, double-indirect, and triple-indirect pointers? Why not just have everything triple-indirect?
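One way to check your answers to the size questions is a short calculation like the sketch below. It uses the 4K block size, 32-bit (4-byte) block numbers, and 12 direct pointers from the text; the function name is invented for the example.

```python
BLOCK_SIZE = 4 * 1024          # bytes per data block
POINTER_SIZE = 4               # a 32-bit block number
DIRECT_POINTERS = 12
POINTERS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE  # 1024 pointers fit in one block


def max_file_size(deepest_level: int) -> int:
    """Max file size in bytes using direct pointers plus indirect levels 1..deepest_level."""
    blocks = DIRECT_POINTERS
    for level in range(1, deepest_level + 1):
        blocks += POINTERS_PER_BLOCK ** level
    return blocks * BLOCK_SIZE


labels = ["direct only", "+ single indirect", "+ double indirect", "+ triple indirect"]
for level, label in enumerate(labels):
    print(f"{label:>18}: {max_file_size(level):,} bytes")
```

Direct pointers alone cover 48 KB, adding the single-indirect pointer brings that to about 4 MB, and each further level multiplies the reach by roughly 1024.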
So we use a superblock to represent a filesystem, and we use inodes to represent files. How do we represent directories? Conceptually, a directory is just a data structure that maps filenames to inodes. For example:
It turns out a directory is just a file that has a special "this-is-a-directory" bit set in its inode. So what does a directory look like on the inside? Well, a directory is just a file, and files are just strings of bytes. So a directory looks like you'd imagine, with the following data laid out linearly:
inode number | length of filename | filename
inode number | length of filename | filename
...
So a directory is just this linear string of bytes that says how filenames map to inodes. To make a directory, we create a string of bytes that indicates how the files in the directory map to inodes, put the string of bytes into a file, and set the special "this-is-a-directory" bit on the inode.
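A minimal sketch of that byte layout, using Python's struct module. The field widths are assumptions (a 4-byte little-endian inode number and a 2-byte filename length); real filesystems pick their own widths and usually add more fields. The function names and the sample inode numbers are made up for illustration.

```python
import struct
from typing import Dict

# Assumed entry header: 4-byte inode number, then 2-byte filename length.
ENTRY_HEADER = struct.Struct("<IH")


def pack_directory(entries: Dict[str, int]) -> bytes:
    """Lay out (inode number, filename length, filename) records back to back."""
    out = bytearray()
    for name, inode_number in entries.items():
        raw_name = name.encode("utf-8")
        out += ENTRY_HEADER.pack(inode_number, len(raw_name))
        out += raw_name
    return bytes(out)


def unpack_directory(data: bytes) -> Dict[str, int]:
    """Walk the byte string and rebuild the filename -> inode number mapping."""
    entries: Dict[str, int] = {}
    offset = 0
    while offset < len(data):
        inode_number, name_length = ENTRY_HEADER.unpack_from(data, offset)
        offset += ENTRY_HEADER.size
        name = data[offset:offset + name_length].decode("utf-8")
        offset += name_length
        entries[name] = inode_number
    return entries


directory_bytes = pack_directory({"notes.txt": 42, "src": 97})
print(unpack_directory(directory_bytes))
```

Looking up a filename in this layout means scanning the records from the start, which is why lookups in very large linear directories get slow.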