Project 3: Demand Paging

Fall 2024

Due: Friday, December 6 at 11:59pm (no extensions or late submissions)

The last project! In project 2, each process had a page table that was initialized with physical pages and their contents when the process was created. In project 3, you will implement a more sophisticated memory management system where physical pages are allocated on demand and pages that cannot fit in physical memory will be stored on disk.

Background

You will implement and debug virtual memory in two main steps. First, you will implement demand paging using page faults to dynamically initialize process virtual pages on demand, rather than initializing page frames for each process in advance at exec time as you did in project 2. Next, you will implement page replacement, enabling your kernel to evict a virtual page from memory to free up a physical page frame to satisfy a page fault. Demand paging and page replacement together allow your kernel to "overbook" memory by executing more processes than would fit in machine memory at any one time, using page faults to multiplex the available physical page frames among the larger number of process virtual pages. When implemented correctly, virtual memory is undetectable to user programs unless they monitor their own performance.

Your project will implement the following functionality:

  1. Demand Paging. Pages will be in physical memory only as needed. When no physical pages are free, it is necessary to free pages, possibly evicting pages to swap.
  2. Lazy Loading. To fulfill the spirit of demand paging, processes should load no pages when started, and depend on demand paging to provide even the first instruction they execute. When you are done, loadSections will not allocate even a single page.
  3. Page Pinning. At times it will be necessary to "pin" a page in memory, making it temporarily impossible to evict.

The changes you make to Nachos will be in two files in the vm directory: VMKernel.java and VMProcess.java.

These classes inherit from UserKernel and UserProcess. While the VM versions of these classes will be able to depend upon functionality in the base classes, the focus in this project will be on demand-paged virtual memory. As a result, you will be implementing new versions of key methods in VMProcess such as loadSections, readVirtualMemory, and writeVirtualMemory.

You will compile and run the project in the proj3 directory. Unlike the first two projects, you will not need to learn any new Nachos modules and will continue to use functionality that you became familiar with in project 2. Before starting your implementation, also see the Tips section below.

Design Aspects

Central to this project are the following design aspects:

  1. TranslationEntry bits. You will extend your kernel's handling of the page tables to use three special bits in each TranslationEntry (TE): the valid bit, the used bit, and the dirty bit.

  2. Swap File. To manage swapped out pages on disk, use the StubFileSystem (via ThreadedKernel.fileSystem) as in project 2. There are many design choices, but we suggest using a single, global swap file across all processes. This file should last the lifetime of your kernel. Be sure to choose a reasonably unique file name that will not conflict with other files in the test directory. When designing the swap file, keep in mind that the units of swap are pages. Thus you should be efficient with disk space using the same techniques applied in virtual memory: any gaps in your swap space due to processes terminating should be used by future processes. As with physical memory in project 2, a global free list works well. You can assume that the swap file can grow arbitrarily, and that there should not be any read/write errors. Assert if there are.

  3. Global Memory Accounting. In addition to tracking free pages (which may be managed as in project 2), there are now two additional pieces of memory information relevant to all processes: which processes own which pages (part 2), and which pages are pinned (part 4). The former is necessary to manage eviction of pages, and the latter is necessary when optimizing page fault handling to use the virtual memory subsystem in a more fine-grained manner. There are also multiple approaches to solving this problem, but we suggest using a global inverted page table (see the tips below).
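The free-list approach to swap slots suggested above can be sketched in isolation. The class and method names below are our own illustration, not part of Nachos; the real version would sit behind your swap read/write methods:

```java
import java.util.LinkedList;

// Hypothetical sketch of a global swap-slot free list. Slots freed by
// exiting processes are reused before the swap file is grown, mirroring
// the physical-page free list of project 2.
class SwapSlotAllocator {
    private final LinkedList<Integer> freeSlots = new LinkedList<>();
    private int nextNewSlot = 0;  // next never-used slot; the file grows past it

    // Return a slot index; reuse a freed slot (a gap) if one exists.
    public synchronized int allocate() {
        if (!freeSlots.isEmpty())
            return freeSlots.removeFirst();
        return nextNewSlot++;
    }

    // Called when a process terminates: its slots become gaps for reuse.
    public synchronized void free(int slot) {
        freeSlots.addLast(slot);
    }

    // Size the swap file has grown to, in pages.
    public synchronized int fileSizeInPages() {
        return nextNewSlot;
    }
}
```

With this design the swap file only grows when no gap is available, which keeps disk usage proportional to the peak number of swapped-out pages.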

Tasks

Most of the effort for this project is in parts 1 and 2. Parts 3 and 4 are very specific optimizations to improve the performance of paging.

1. (30%) Implement demand paging. In this first part, you will continue to preallocate a physical page frame for each virtual page of each newly created process at exec time, just as in project 2. And as before, for now continue to return an error from the exec system call if there are not enough free page frames to hold the process' new address space. The only special bit in TranslationEntries that you need to use for this part is the valid bit. You will not yet need to implement the swap file, page replacement, an inverted page table, etc. Instead, you just need to make the following changes:

  1. In VMProcess.loadSections, initialize all of the TranslationEntries as invalid. This will cause the machine to trigger a page fault exception when the process accesses a page. Also do not initialize the page by, e.g., loading from the COFF file. Instead, you will do this on demand when the process causes a page fault. As a result, loadSections will continue to allocate physical page frames in the page table for each virtual page, but delay loading the frames with content until they are actually referenced by the process. Note that, unlike a system call, handling a page fault does not produce a return value.

  2. Handle page fault exceptions via VMProcess.handleException. When the process references an invalid page, the machine will raise a page fault exception (if a page is marked valid, no fault is generated). Modify your exception handler to catch this exception and handle it by preparing the requested page on demand. The Processor class lists all of the exceptions that the MIPS CPU can generate and the registers associated with them, including the page fault exception and the register holding the faulting address.

  3. Add a method to prepare the requested page on demand. Note that faults on different pages are handled in different ways. A fault on a page in the COFF should read the corresponding code page from the COFF file, and a fault on a stack page or arguments page should zero-fill the page (set every byte on the page to 0). For this step, for reference look at the COFF file loading code from UserProcess.loadSections from project 2. If the process faults on page 0, for example, then load the first page of code from the executable file into it. More generally, when you handle a page fault you will use the value of the faulting address to determine how to initialize that page: if the faulting address is in a segment of the COFF file, then load the appropriate page; if it is any other page, zero-fill it. It is fine to loop through the sections of the COFF file until you find the appropriate section and page to use (assuming it is in the COFF file).

    Once you have paged in the faulted page, mark the TranslationEntry as valid. Then let the machine restart execution of the user program at the faulting instruction: return from the exception, but do not increment the PC (as is done when handling a system call) so that the machine will re-execute the faulting instruction. If you set up the page (by initializing it) and page table (by setting the valid bit) correctly, then the instruction will execute correctly and the process will continue on its way, none the wiser.

  4. At this point we recommend doing task 2 for test programs that do not use files or the console, such as matmult, swap4 and swap5 (see "where and how to focus time" in the general tips). Then come back and implement new VMProcess.readVirtualMemory and VMProcess.writeVirtualMemory methods to handle invalid pages and page faults. Start with your implementations from project 2, or create new ones, by implementing the methods for VMProcess. Both methods directly access physical memory to read/write data between user-level virtual address spaces and the Nachos kernel. These methods will now need to check whether the virtual page is valid. If it is valid, they can use the physical page as before. If the page is not valid, then they will need to use the page fault handler to bring the page in, as with any other page fault.
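The fault-time decision described in step 3 can be illustrated with a standalone sketch. The section layout and names here are invented for illustration; in Nachos you would iterate over the real CoffSection objects from your Coff instance:

```java
// Illustrative sketch (not Nachos code): map a faulting virtual page
// number either to the COFF section it belongs to, or to "zero-fill".
class FaultKind {
    // A minimal stand-in for CoffSection metadata: {firstVPN, numPages}.
    // Here: a 2-page text section at vpn 0, a 3-page data section at vpn 2.
    static final int[][] SECTIONS = { {0, 2}, {2, 3} };

    // Returns the index of the section to load from, or -1 for a page
    // (stack or arguments) that should be zero-filled instead.
    static int sectionFor(int vpn) {
        for (int s = 0; s < SECTIONS.length; s++) {
            int first = SECTIONS[s][0], n = SECTIONS[s][1];
            if (vpn >= first && vpn < first + n)
                return s;  // load page (vpn - first) of section s
        }
        return -1;  // not backed by the COFF file: zero-fill
    }
}
```

In the real handler, a hit would translate to loading page (vpn - firstVPN) of that section into the frame, and a miss to filling the frame with zeros.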

Testing: As long as there is enough physical memory to fully load a program, then you should be able to use test programs from project 2 to test this part of project 3. See the tips in the Testing section below for how you can control (increase or decrease) the number of physical pages (e.g., write10 is going to need more than the default of 16 pages). If you give Nachos enough physical pages, you can even run the swap4 and swap5 tests (and these tests do not use any system calls other than exit).

2. (58%) Now implement demand paged virtual memory with page replacement. In this second part, not only do you delay initializing pages, but now you delay the allocation of physical page frames until a process actually references a virtual page that is not already loaded in memory.

  1. In part one for VMProcess.loadSections, you allocated physical pages for each virtual page, but you marked them as invalid so that they would be initialized on a page fault. Now change VMProcess.loadSections so that it does not even allocate a physical page. Instead, merely mark all the TranslationEntries as invalid.
  2. Extend your page fault exception handler to allocate a page frame on-the-fly when a page fault occurs. In part one, you just initialized the contents of the virtual page when a page fault occurred. In this part, now allocate a physical page for the virtual page and use your code from part 1 above to initialize it, mark the TranslationEntry as valid, and return from the exception.

You can get the above two changes working without having page replacement implemented for the case where you run a single program that does not consume all of physical memory. Before moving on, be sure that the two changes above work for a single program that fits into memory.

Now implement page replacement to free up a physical page frame to handle page faults:

  1. Extend your page fault exception handler to evict pages once physical memory becomes full. First, you will need to select a victim page to evict from memory. Your page eviction strategy should be the clock algorithm as described in lecture. For this part, use the used bit in TranslationEntries to track when pages are used by a process.

  2. Evict the victim page to the swap file and mark the TranslationEntry for that page as invalid.

  3. Read in the contents of the faulted page (more below).

  4. Implement the swap file for storing pages evicted from physical memory. You will want to implement methods to create a swap file, write pages from memory to swap (for page out), read from swap to memory (for page in), etc.

  5. Implement an inverted page table (see the tips below) to keep track of which virtual pages are using which physical pages.
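The clock algorithm used for victim selection above can be sketched independently of Nachos. The array of used bits is our stand-in for the per-frame state you would keep in the inverted page table (a real implementation would also skip pinned frames):

```java
// Standalone sketch of the clock algorithm over an array of "used" bits.
class Clock {
    private final boolean[] used;
    private int hand = 0;  // the clock hand sweeps circularly over frames

    Clock(boolean[] used) { this.used = used; }

    // Pages with used == true get a second chance (their bit is cleared
    // and the hand advances); the first frame found with used == false
    // is chosen as the victim.
    int selectVictim() {
        while (true) {
            if (!used[hand]) {
                int victim = hand;
                hand = (hand + 1) % used.length;
                return victim;
            }
            used[hand] = false;            // clear reference bit
            hand = (hand + 1) % used.length;
        }
    }
}
```

Because every referenced frame loses its bit as the hand passes, the sweep is guaranteed to terminate within two full revolutions.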

First implement this paging functionality for a single process. Once that works, extend your implementation to support multiple processes:

  1. For part 2 use a simple locking strategy to protect your data structures used for demand paging: acquire a lock when you start handling a page fault, and release it when you are done handling the page fault. (Part 4 will optimize the use of this lock further.)

    After getting paging with multiple processes working with the memory tests, when you then move on to readVirtualMemory and writeVirtualMemory, you should use the same lock to protect each page accessed in their while loops (e.g., so that between the time readVirtualMemory brings a page into memory and you use System.arraycopy on that page, the page is not chosen by another process for eviction).

  2. Modify your implementation for cleaning up a process when it terminates by (1) only freeing physical pages that have been allocated to it and (2) freeing all swap pages allocated by the process.

  3. Optional: We will not be testing the arguments to exec or any of the join functionality. However, if you have that functionality from project 2 and you would like to have it work for project 3, note that your implementation will need to check the validity of the arguments page (for exec) and the validity of the page with the status argument to join.

As you implement the above operations, keep the following points in mind:

Testing: Start with the memory focused tests such as swap4 and swap5, and vary the amount of physical memory in Nachos to control how much demand paging each test must do. See the Testing section below for how to control the number of pages in physical memory.

3. (5%) In part 2, when evicting pages we ignored the dirty bit, i.e., we ignored whether or not the process had modified the page once it was brought into memory. As a result, when evicting a page we always wrote it to the swap file. In this part you will optimize page replacement by using the dirty bit. Consider a page P that has been written to the swap file and then read back into physical memory on a page fault. If P is again chosen for eviction, your implementation will only write P back to the swap file if P has been modified (i.e., if the dirty bit is true in its TranslationEntry). If P is not dirty, then it does not need to be written to swap since the version of P in memory is the same as the one in the swap file. Note that (1) this also means that you should not free up a page in the swap file for a given virtual page until the process terminates, and (2) you will need to modify writeVirtualMemory to set the dirty bit on the page being written to (since these writes are not being done by the emulated CPU, the CPU does not set the dirty bit on these writes as it does on all other writes).

After implementing this optimization, you should only do as many page reads and writes to the swap file as necessary to execute the program, and as dictated by the page replacement algorithm.
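The eviction decision in this part reduces to a small predicate. This is a hedged sketch with invented names; it deliberately ignores the further possibility of re-reading clean, read-only code pages from the COFF file instead of swap:

```java
// Part-3 eviction rule as a predicate: a victim page must be written to
// the swap file only if it is dirty, or if it has no copy in swap yet.
class EvictionPolicy {
    // Convention (ours): swapSlot < 0 means the page has never been
    // written to the swap file.
    static boolean mustWriteToSwap(boolean dirty, int swapSlot) {
        return dirty || swapSlot < 0;
    }
}
```

A clean page with a valid swap copy can simply be dropped, which is exactly why swap slots must stay reserved for a virtual page until the process terminates.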

Note that parts 3 and 4 do not depend on each other. You can implement part 3 before part 4, and vice versa.

Testing: Use the same tests as with the previous parts. With this optimization, you should notice that the number of writes to the swap file (pageouts) decreases (sometimes significantly). In particular, you should notice a substantial difference in the number of pageouts between swap4 and swap5 since swap5 modifies the data in its memory.

4. (7%) In part 2, when handling a page fault we recommended using a simple locking strategy that acquires and holds a lock for the duration of your page fault handler. While simpler to implement, this strategy limits performance: when one process is handling a page fault — which may take milliseconds to complete — no other process can use the virtual memory subsystem (create new processes, handle their own page faults, etc.). In this part, you will optimize page replacement by allowing multiple processes to use the virtual memory subsystem in a more fine-grained manner. Note that this optimization is only useful once you have paging support for multiple processes working.

There are two aspects to implementing this optimization. The first is that you will need to modify the locking strategy from part 2: the process must release the lock before performing an I/O operation to the COFF file or swap file, and acquire it again when the I/O operation completes. Put another way, the requirement is that while a process performs an I/O operation to COFF or swap, the process cannot hold any locks.

The second is that when your code is using a physical page for copying data with readVirtualMemory or writeVirtualMemory, or when it is performing an I/O operation to the COFF file or the swap file, it will need to "pin" the physical page while using it so that no other process can choose it for eviction. When the I/O completes, the process will unpin the page. Consider the following actions:

  1. Process A is executing the program at user-level and invokes the read system call.
  2. Process A enters the kernel, and is part way through writing to user memory using writeVirtualMemory.
  3. A timer interrupt triggers a context switch, entering process B.
  4. Process B immediately generates numerous page faults, which in turn cause pages to be evicted from other processes, including some used by process A.
  5. Process B loads its contents into the page that process A was using in writeVirtualMemory.
  6. Eventually, process A is scheduled to run again, and continues handling the read syscall as before.

In this example, the page to which A is writing should be pinned in memory so that it is not chosen for page eviction. Otherwise, if process B evicted the page, then when process A was rescheduled it would write to the page B loaded. This same scenario also applies if process A were evicting a page to the swap file (page out), or reading a page in from the swap file (page in). You can use your inverted page table to keep track of which physical pages are pinned.

Finally, it is possible that a process needs a page, but not only are all pages in use (meaning an eviction must occur), but all pages are pinned (meaning an eviction must not occur now). Handle this situation using synchronization. If process A needs to evict a page, but all pages are pinned, block A on a condition variable. When another process unpins a page, it can unblock A. In terms of prioritization, implement this functionality last.
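The pin bookkeeping and the all-pages-pinned case can be sketched with Java monitors standing in for Nachos' Lock and Condition classes. All names here are illustrative; in Nachos you would keep this state in your inverted page table and use Lock/Condition instead of wait/notifyAll:

```java
// Sketch of per-frame pin tracking plus blocking when every frame is pinned.
class PinTable {
    private final boolean[] pinned;
    private int pinnedCount = 0;

    PinTable(int numFrames) { pinned = new boolean[numFrames]; }

    // Pin a frame so the clock algorithm must skip it.
    synchronized void pin(int frame) {
        if (!pinned[frame]) { pinned[frame] = true; pinnedCount++; }
    }

    // Unpin a frame and wake any evictor blocked in waitUntilEvictable.
    synchronized void unpin(int frame) {
        if (pinned[frame]) {
            pinned[frame] = false; pinnedCount--;
            notifyAll();
        }
    }

    // An evictor calls this before scanning for a victim: if every frame
    // is pinned, block until some process unpins one.
    synchronized void waitUntilEvictable() throws InterruptedException {
        while (pinnedCount == pinned.length)
            wait();
    }

    synchronized boolean isPinned(int frame) { return pinned[frame]; }
}
```

Note the loop around wait(): the evictor must re-check the condition after waking, since another faulting process may have pinned a frame again in the meantime.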

Note that parts 3 and 4 do not depend on each other. You can implement part 4 before part 3, and vice versa.

Tips

Testing

Important: As with the previous projects, before the deadline you must submit your code to Gradescope at least once to initialize the grading system for your project.

Code Submission

As a final step, create a file named README in the proj3 directory. The README file should list the members of your group and provide a short description of what code you wrote, how well it worked, how you tested your code, and how each group member contributed to the project. The goal is to make it easier for us to understand what you did as we grade your project in case there is a problem with your code, not to burden you with a lot more work. Do not agonize over wording. It does not have to be poetic, but it should be informative.

For grading project 3, we will support grading two different versions of your implementation and using the version with the higher score. The motivation for supporting this grading feature is to remove the risk of working on tasks 3 and 4. If a group completes task 2 and then moves on to tasks 3 and 4, but there are bugs (say) in task 4, then it can make grading task 2 harder (grading will encounter the bugs in task 4, masking what you implemented correctly for task 2). If a group does not have task 4 working, one option is to comment out the task 4 code before the deadline (e.g., comment out the lock release/acquire around reading/writing from swap, etc.). That is fine and will work.

Instead, we will provide an alternate way to grade task 2 while also allowing you to work on tasks 3 and 4. By default we will grade the version of your repo at the deadline. In addition, groups can tag a commit in their repository to flag a version of their code for grading task 2 (or whatever represents your most stable version). If you tag a commit, then your score for project 3 will be the higher of the deadline commit or the tagged commit. The basic command and tag to use is:

$ git tag proj3_task2

If you are still debugging parts 3 and 4, you can temporarily change your latest version to just support task 2 (no dirty bit, no pinning) and then apply the tag (you can also check out a commit that represents your most stable version and apply the tag to that). If you do not tag a commit for task 2, that is also fine; we will grade the latest version as we have done with the other projects. And if you stop at task 2 and do not work on tasks 3 and 4, then you do not need to bother with a tag.


Troubleshooting Account Issues

If you encounter problems with your account (command not found, disk quota exceeded, class file has wrong version, etc.), see these troubleshooting tips.

Cheating

You can discuss concepts with students in other groups, but do not cheat when implementing your project. Cheating includes copying code from someone else's implementation, copying code from an implementation found on the Internet, or using generative AI or LLMs. See the main project page for more information.

We will manually check submissions and will also run code plagiarism tools against them and against multiple Internet distributions.