Due: Friday, June 6, at 11:59pm (no late submissions)
In project 2, each process had a page table that was initialized with physical pages and their contents when the process was created. In project 3, you will implement a more sophisticated memory management system where physical pages are allocated on demand and pages that cannot fit in physical memory will be stored on disk.
You will implement and debug virtual memory in two steps. First, you will implement demand paging using page faults to dynamically initialize process virtual pages on demand, rather than initializing page frames for each process in advance at exec time as you did in project 2. Next, you will implement page replacement, enabling your kernel to evict a virtual page from memory to free up a physical page frame to satisfy a page fault. Demand paging and page replacement together allow your kernel to "overbook" memory by executing more processes than would fit in machine memory at any one time, using page faults to multiplex the available physical page frames among the larger number of process virtual pages. When implemented correctly, virtual memory is undetectable to user programs unless they monitor their own performance.
You project will implement the following functionality:
The changes you make to Nachos will be in these two files in the vm directory:
These classes inherit from UserKernel and UserProcess. While the VM versions of these classes will be able to depend upon functionality in the base classes, the focus in this project will be on demand-paged virtual memory. As a result, you will be implementing new versions of key methods in VMProcess such as loadSections, readVirtualMemory, and writeVirtualMemory.
You will compile and run the project in the proj3 directory. Unlike the first two projects, you will not need to learn any new Nachos modules and will continue to use functionality that you became familiar with in project 2. Before starting your implementation, also see the Tips section below.
The following design aspects are central to this project:
Swap File. To manage swapped out pages on disk, use the StubFileSystem (via ThreadedKernel.fileSystem) as in project 2. There are many design choices, but we suggest using a single, global swap file across all processes. This file should last the lifetime of your kernel. Be sure to choose a reasonably unique file name that will not conflict with other files in the test directory.
When designing the swap file, keep in mind that the units of swap are pages. Thus you should be efficient with disk space using the same techniques applied in virtual memory: any gaps in your swap space due to processes terminating should be used by future processes. As with physical memory in project 2, a global free list works well. You can assume that the swap file can grow arbitrarily, and that there should not be any read/write errors. Assert if there are.
Most of the effort for this project is in tasks 1 and 2. Tasks 3 and 4 are very specific optimizations to improve the performance of paging.
1. (30%) Implement demand paging. In this first task, you will continue to preallocate a physical page frame for each virtual page of each newly created process at exec time, just as in project 2. And as before, for now continue to return an error from the exec system call if there are not enough free page frames to hold the process' new address space. You will not yet need to implement the swap file, page replacement, an inverted page table, etc. Instead, you just need to make the following changes:
Add a method to prepare the requested page on demand. Note that faults on different pages are handled in different ways. A fault on a page in the COFF should read the corresponding page from the COFF file, and a fault on a stack page or arguments page should zero-fill the page (set every byte on the page to 0).
For this step, for reference look at the COFF file loading code from UserProcess.loadSections from project 2. If the process faults on page 0, for example, then load the first page of code from the executable file into it. More generally, when you handle a page fault you will use the value of the faulting address to determine how to initialize that page: if the faulting address is in a segment of the COFF file, then load the appropriate page; if it is any other page, zero-fill it. It is fine to loop through the sections of the COFF file until you find the approprate section and page to use (assuming it is in the COFF file).
Once you have paged in the faulted page, mark the TranslationEntry as valid. Then let the machine restart execution of the user program at the faulting instruction: return from the exception, but do not increment the PC (as is done when handling a system call) so that the machine will re-execute the faulting instruction. If you set up the page (by initializing it) and page table (by setting the valid bit) correctly, then the instruction will execute correctly and the process will continue on its way, none the wiser.
At this point we recommend implementing task 2 and testing it with programs that do not use files or the console, such as matmult, swap4, and swap5 (see "where and how to focus time" in the general tips).
Finally, come back and implement new VMProcess.readVirtualMemory and VMProcess.writeVirtualMemory methods to handle invalid pages and page faults. Start with your implementations from project 2, or create new ones, by implementing the methods for VMProcess. Both methods directly access physical memory to read/write data between user-level virtual address spaces and the Nachos kernel. These methods will now need to check to see if the virtual page is valid. If it is valid, it can use the physical page as before. If the page is not valid, then it will need to use the page fault handler to bring the page in as with any other page fault.
Testing: As long as there is enough physical memory to fully load a program, then you should be able to use test programs from project 2 to test this task of project 3. See the tips in the Testing section below for how you can control (increase or decrease) the number of physical pages (e.g., write10 is going to need more than the default of 16 pages). If you give Nachos enough physical pages, you can even run the swap4 and swap5 tests (and these tests do not use any system calls other than exit).
2. (58%) Now implement demand paged virtual memory with page replacement. In this second task, not only do you delay initializing pages, but now you delay the allocation of physical page frames until a process actually references a virtual page that is not already loaded in memory.
You can get the above two changes working without having page replacement implemented for the case where you run a single program that does not consume all of physical memory. Before moving on, be sure that the two changes above work for a single program that fits into memory.
Now implement page replacement to free up a physical page frame to handle page faults:
For task 2 use a simple locking strategy to protect your data structures used for demand paging: acquire a lock when you start handling a page fault, and release it when you are done handling the page fault. (Task 4 will optimize the use of this lock further.)
After getting paging with multiple processes working with the memory tests, then move on to readVirtualMemory and writeVirtualMemory. At this point you should use the same lock to protect each accessed page (e.g., so that between the time readVirtualMemory brings a page into memory and you use arrayCopy on that page, the page is not chosen by another process for eviction).
As you implement the above operations, keep the following points in mind:
Testing: Start with the memory-focused tests such as swap4 and swap5, and vary the amount of physical memory in Nachos to control how much demand paging each test must do. See the Testing section below for how to control the number of pages in physical memory.
3. (5%) In task 2, when evicting pages we ignored the dirty bit, i.e., we ignored whether or not the process had modified the page once it was brought into memory. As a result, when evicting a page we always wrote it to the swap file. In this task you will optimize page replacement by using the dirty bit. Consider a page P that has been written to the swap file and then read back into physical memory on a page fault. If P is again chosen for eviction, your implementation will only write P back to the swap file if P has been modified (i.e., if the dirty bit is true in its TranslationEntry). If P is not dirty, then it does not need to be written to swap since the version of P in memory is the same as the one in the swap file. Note that (1) this also means that you should not free up a page in the swap file for a given virtual page until the process terminates, and (2) you will need to modify writeVirtualMemory to set the dirty bit on the page being written to (since these writes are not being done by the emulated CPU, the CPU does not set the dirty bit on these writes as it does on all other writes).
After implementing this optimization, you should only do as many page reads and writes to the swap file as necessary to execute the program, and as dictated by the page replacement algorithm.
Note that tasks 3 and 4 do not depend on each other. You can implement task 3 before task 4, and vice versa.
Testing: Use the same tests as with the previous tasks. With this optimization, you should notice that the number of writes to the swap file (pageouts) decreases (sometimes significantly). In particular, you should notice a substantial difference in the number of pageouts between swap4 and swap5 since swap5 modifies the data in its memory.
4. (7%) In task 2, when handling a page fault we recommended using a simple locking strategy that acquires and holds a lock for the duration of your page fault handler. While simpler to implement, this strategy limits performance: when one process is handling a page fault — which may take milliseconds to complete — no other process can use the virtual memory subsystem (create new processes, handle their own page faults, etc.). In this task, you will optimize page replacement by allowing multiple processes to use the virtual memory subsystem in a more fine-grained manner. Note that this optimization is only useful once you have paging support for multiple processes working.
There are two aspects to implementing this optimization. The first is that you will need to modify the locking strategy from task 2: the process must release the lock before performing an I/O operation to the COFF file or swap file, and acquire it again when the I/O operation completes. Put another way, the requirement is that while a process performs an I/O operation to COFF or swap, the process cannot hold any locks.
The second is that when your code is using a physical page for copying data with readVirtualMemory or writeVirtualMemory, or when it is performing an I/O operation to the COFF file or the swap file, it will need to "pin" the physical page while using it so that no other process can choose it for eviction. When the I/O completes, the process will unpin the page. Consider the following actions:
In this example, the page to which A is writing should be pinned in memory so that it is not chosen for page eviction. Otherwise, if process B evicted the page, then when process A was rescheduled it would write to the page B loaded. This same scenario also applies if process A were evicting a page to the swap file (page out), or reading a page in from the swap file (page in). You can use your inverted page table to keep track of which physical pages are pinned.
Finally, it is possible that a process needs a page, but not only are all pages in use (meaning an eviction must occur), but all pages are pinned (meaning an eviction must not occur now ). Handle this situation using synchronization. If process A needs to evict a page, but all pages are pinned, block A on a condition variable. When another process unpins a page, it can unblock A. In terms of prioritizing, implement this functionality at the very end.
Here are two example test programs you can use:
The only system call that these programs use is exit (none of the system calls that use readVirtualMemory/writeVirtualMemory need to work). You can also use the two programs in the test directory that just compute on data in memory: matmult (which exits with 7220) and sort (which exits with 0). If you comment out the printf in matmult.c, then these programs also only use the exit system call.
$ nachos -m 64 -x swap4.coff
In contrast, when you start implementing swapping in task 2 you will want to explicitly control and reduce the number of pages (and thus, indirectly, increase the need for swapping). For example:
$ nachos -m 8 -x swap4.coffWhen first testing your swapping implementation (task 2), start with the largest memory allocation that still triggers swapping (i.e., one fewer page than what is needed to fit everything into physical memory). While your implementation should not depend upon any particular number of physical pages, we will always test with at least four physical pages. When testing with multiple processes, we will test with at least twice the number of physical pages as processes (e.g., eight pages with four processes).
Implement a method to print the state of the inverted page table (which processes are using which physical pages) and process page tables (which pages are in physical memory, which pages are in swap), and use it liberally when debugging. Lib.debug() and Lib.assertTrue() will also continue to be helpful.
When Nachos terminates it prints statistics about various events that happened during execution. There are four methods in the Machine class that you can use to record events and have their totals printed when Nachos terminates: incrNumCOFFReads, incrNumSwapReads, incrNumSwapWrites, incrNumSwapSkips (for task 3). These stats are optional, they are not used when grading.
As with projects 1 and 2, during the project period, you can also use Gradescope to run a snapshot of your code on the sample tests that we have given.
The procedure for submission is similar to that of projects 1 and 2, but has a noteable difference, as described below.
As a final step, create a file named README in the proj3 directory. The README file should list the members of your group and provide a short description of what code you wrote, how well it worked, how you tested your code, and how each group member contributed to the project. The goal is to make it easier for us to understand what you did as we grade your project in case there is a problem with your code, not to burden you with a lot more work. Do not agonize over wording. It does not have to be poetic, but it should be informative.
For grading project 3, we will support grading two different versions of your implementation and using the version with the higher score. The motivation for supporting this grading feature is to remove the risk of working on tasks 3 and 4. If a group completes task 2 and then moves on to tasks 3 and 4, but there are bugs (say) in task 4, then it can make grading task 2 harder (grading will encounter the bugs in task 4, masking what you implemented correctly for task 2). If a group does not have task 4 working, one option is to comment out the task 4 code before the deadline (e.g., comment out the lock release/acquire around reading/writing from swap, etc.). And that would be fine and it will work.
Instead, we will provide an alternate way to grade task 2 while also allowing you to work on tasks 3 and 4. By default we will grade the version of your repo at the deadline. In addition, groups can tag a commit in their repository to flag a version of their code for grading task 2 (or whatever represents your most stable version). If you tag a commit, then your score for project 3 will the higher of the deadline commit or the tagged commit. The basic command and tag to use is:
$ git tag proj3_task2
If you are still debugging tasks 3 and 4, you can temporarily change your latest version to just support task 2 (no dirty bit, no pinning) and then apply the tag (you can also checkout a commit that represents your most stable version and apply the tag to that). If you do not tag a commit for task 2, that is also fine, we will grade the latest version like we have done with the other projects. Or if you stop at task 2 and do not work on tasks 3 and 4 then you do not need to bother with a tag.
Important: We will be using Gradescope as the grading platform. Before the deadline, you must submit your code to Gradescope at least once to initialize the grading system for your project.
If you encounter problems with your account (command not found, disk quota exceeded, class file has wrong version, etc.), see these troubleshooting tips.
You can discuss concepts with students in other groups, but do not cheat when implementing your project. Cheating includes copying code from someone else's implementation, copying code from an implementation found on the Internet, or using generative AI or LLMs. All of these are a violation of academic integrity. See the main project page for more information.
We will manually check and also run code plagiarism tools on submissions and multiple Internet distributions (if you can find it, so can we).