# Paging

February 21 2007

## Project 2

I hope you've started. :) Any questions?

## Paging

The idea behind paging is simple: we define a constant called the "page size", and we partition our address space (both virtual and physical) into equal-sized pages, where each page holds an amount of data equal to the page size. So if my page size is 1K, every page holds 1K of data.

We create pages starting from address 0, up to the end of the address space. Page 0 always starts at address 0. Suppose my page size is 1K. Where does page 0 end? Where does page 1 start? Where does page N start and end? What page number does address 100 lie on? What about address 5000? Address X?

Pages are nice because it's really easy to determine which page an address lies on. Why are page sizes usually a power of 2?
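
Here's a minimal sketch of that arithmetic in C, assuming the 1K page size from the example above. The function names are made up for illustration. The last two functions hint at the power-of-2 question: when the page size is a power of 2, the divide and modulo turn into a shift and a mask.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: 1K pages, as in the example above. */
#define PAGE_SIZE  1024u
#define PAGE_SHIFT 10u   /* log2(PAGE_SIZE) */

/* Which page does this address lie on? */
static uint32_t page_number(uint32_t addr) { return addr / PAGE_SIZE; }

/* How far into that page is the address? */
static uint32_t page_offset(uint32_t addr) { return addr % PAGE_SIZE; }

/* First and last address of page n. */
static uint32_t page_start(uint32_t n) { return n * PAGE_SIZE; }
static uint32_t page_end(uint32_t n)   { return (n + 1u) * PAGE_SIZE - 1u; }

/* Same answers as page_number/page_offset, using only a shift and a mask.
 * This only works because PAGE_SIZE is a power of 2. */
static uint32_t page_number_fast(uint32_t addr)
{
    assert((PAGE_SIZE & (PAGE_SIZE - 1u)) == 0); /* power-of-2 check */
    return addr >> PAGE_SHIFT;
}

static uint32_t page_offset_fast(uint32_t addr)
{
    return addr & (PAGE_SIZE - 1u);
}
```

With these definitions, page_number(5000) is 4 and page_offset(5000) is 904, since 5000 = 4 * 1024 + 904.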

## Page Tables

Abstractly, page tables just map virtual page numbers to physical page numbers. The different types of page tables may seem intimidating, but remember that they're still just translating virtual page numbers to physical page numbers. The additional complexity comes from trying to make the process more efficient.

Let's take a look at some of the optimizations, starting with the basics. Since a page table just maps from virtual page numbers (VPNs) to physical page numbers (PPNs), the most straightforward page table looks like this:

| VPN | PPN |
| --- | --- |
| 0   | 1   |
| 1   | 3   |
| 2   | 4   |
| 3   | 7   |
| ... | ... |

The first idea is that we don't actually need to store the virtual page numbers in the table. Instead, we can use an array, and use the virtual page numbers as indices into the array. So we could represent the page table as the following:

```c
int pageTable[] = {1, 3, 4, 7, ...};
```

This new definition of pageTable provides exactly the same information as the earlier definition: pageTable[0] == 1, pageTable[1] == 3, pageTable[2] == 4, and so on. So we can use this pageTable to translate VPNs to PPNs, just like before, but now we're not explicitly storing the VPNs. As an added bonus, we can very quickly determine the PPN for any VPN. Given a VPN, we can get its PPN in constant time, without a linear scan through the page table.
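
To see how the array gets used for a full address (not just a page number), here's a rough sketch of a translation. It assumes 1K pages and only the four mappings listed above; the translate function is a made-up name, not something from Nachos.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 1024u  /* assumed for illustration: 1K pages */
#define NUM_PAGES 4u     /* only the four mappings shown above */

static const int pageTable[NUM_PAGES] = {1, 3, 4, 7};

/* Translate a virtual address to a physical address. */
static uint32_t translate(uint32_t vaddr)
{
    uint32_t vpn    = vaddr / PAGE_SIZE;  /* index into the page table */
    uint32_t offset = vaddr % PAGE_SIZE;  /* unchanged by translation */

    assert(vpn < NUM_PAGES);              /* no mapping beyond the table */

    uint32_t ppn = (uint32_t)pageTable[vpn];  /* constant-time lookup */
    return ppn * PAGE_SIZE + offset;
}
```

For example, virtual address 2148 is offset 100 on VPN 2, so translate(2148) gives offset 100 on PPN 4, which is physical address 4196.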

### Two-Level Page Tables

Unfortunately, our page tables are still too big. How big are they? If we have a 32-bit address space and our pages are 4 KB, we need 2^32 / 2^12 = 2^20 mappings, which is 1M mappings. Each mapping in our page table takes 4 bytes (sizeof(int)), so each page table takes 4 MB.
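
If you want to play with the numbers, here's a small sketch of that calculation. The function names and parameters are hypothetical; plug in other address widths or page sizes to see how the table size changes.

```c
#include <assert.h>
#include <stdint.h>

/* Number of VPN-to-PPN mappings in a single-level page table. */
static uint64_t num_mappings(unsigned addr_bits, uint64_t page_size)
{
    assert(addr_bits < 64 && page_size > 0);
    return (UINT64_C(1) << addr_bits) / page_size;
}

/* Total bytes needed for the table, given the size of one entry. */
static uint64_t page_table_bytes(unsigned addr_bits, uint64_t page_size,
                                 uint64_t entry_size)
{
    return num_mappings(addr_bits, page_size) * entry_size;
}
```

With a 32-bit address space, 4096-byte pages, and 4-byte entries, num_mappings(32, 4096) is 1048576 and page_table_bytes(32, 4096, 4) is 4194304, matching the numbers above.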

The solution is to add another level of indirection (this is a very common trick in computer science). Instead of having one big table that maps from VPN to PPN, we will have two levels of tables. The first level table will point us to a second level table, and the second level table will tell us the PPN.

For a two-level page table, we split the VPN into two pieces: a primary VPN and a secondary VPN. We feed the primary VPN into the first level table, which points us to a second level table, then we feed the secondary VPN into the second level table, which gets us a PPN.

Note that the second level tables work a lot like our simple page tables (input: secondary VPN, output: PPN), but the first level page table is different (input: primary VPN, output: pointer to a second level table).

So what does a two-level page table look like? Suppose I have a 1-bit secondary VPN:

```c
int pageTable[][2] = {{1, 3}, {4, 7}, ...};
```

Things aren't as simple as before, but all the data is still here, and I can still do all the translations I could do before.

For example, suppose I want to look up VPN 2. I split the VPN into a primary VPN and a secondary VPN. Since my secondary VPN is 1 bit, the secondary VPN is just the low bit of the VPN (which is 0), and the primary VPN is all the remaining bits of the VPN (which is 1). I feed the primary VPN into the first level table, and the secondary VPN into the second level table: pageTable[1][0]. The page table translates VPN 2 to PPN 4, as desired (pageTable[1][0] == 4).
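
Here's one way the lookup could be sketched in C. To make the "pointer to second level table" part explicit, this version stores the first level as an array of pointers rather than a 2-D array; that layout and all the names (firstLevel, second0, second1, lookup) are assumptions for illustration. Only the four mappings from the example are included.

```c
#include <assert.h>

#define SECONDARY_BITS 1                      /* 1-bit secondary VPN */
#define SECONDARY_SIZE (1 << SECONDARY_BITS)  /* entries per second level table */
#define PRIMARY_SIZE   2                      /* first level entries in this example */

/* Second level tables: secondary VPN in, PPN out. */
static int second0[SECONDARY_SIZE] = {1, 3};
static int second1[SECONDARY_SIZE] = {4, 7};

/* First level table: primary VPN in, pointer to a second level table out. */
static int *firstLevel[PRIMARY_SIZE] = {second0, second1};

static int lookup(unsigned vpn)
{
    unsigned primary   = vpn >> SECONDARY_BITS;        /* high bits */
    unsigned secondary = vpn & (SECONDARY_SIZE - 1);   /* low bit(s) */

    assert(primary < PRIMARY_SIZE);

    int *secondLevel = firstLevel[primary];  /* first level lookup */
    return secondLevel[secondary];           /* second level lookup */
}
```

Here lookup(2) splits VPN 2 into primary 1 and secondary 0 and returns 4, the same answer as pageTable[1][0] above.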

How do I look up VPN 1 in this two level page table? And how are two-level page tables more efficient than single-level page tables? :)

### Extra Bits

The primary job of page tables is to convert VPNs into PPNs, but page tables also maintain several extra bits about each page, such as:

- Valid: Do we actually have a PPN for this VPN?
- Modified/Dirty: Has the page been modified? (written)
- Reference: Has the page been accessed? (read or written)
- Protection: What kinds of access (read, write, execute) are allowed on the page?

Why do we bother tracking all these extra bits? When is each one useful?

Because the page table maintains these extra bits, most page tables do not actually map directly from VPNs to PPNs. Instead, they map from VPNs to "page table entries" (PTEs), where a PTE contains the physical page number and lots of extra bits. So a PTE looks something like this:

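```c
#include <stdbool.h>

struct pte {
    int ppn;         /* physical page number */
    bool valid;      /* do we have a PPN for this VPN? */
    bool dirty;      /* has the page been written? */
    bool reference;  /* has the page been read or written? */
};
```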

The corresponding structure in Nachos is TranslationEntry (see machine/translate.h).

With PTEs, we can still do translations just like before (since each PTE contains a physical page number), but now we have extra bits too.
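
As a rough sketch of what translation with PTEs might look like: the code below checks the valid bit before using the PPN, and updates the reference and dirty bits on each access. The page size, the table contents, and the translate function are all made up for illustration, and updating the bits in software like this is a simplification. This is not the Nachos code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumed for illustration */
#define NUM_PAGES 4u

struct pte {
    int ppn;
    bool valid;
    bool dirty;
    bool reference;
};

/* Hypothetical page table: VPN 2 has no physical page. */
static struct pte pageTable[NUM_PAGES] = {
    {1, true,  false, false},
    {3, true,  false, false},
    {0, false, false, false},
    {7, true,  false, false},
};

/* Returns true and stores the physical address in *paddr on success.
 * Returns false if there is no valid mapping for the address. */
static bool translate(uint32_t vaddr, bool is_write, uint32_t *paddr)
{
    assert(paddr != NULL);

    uint32_t vpn    = vaddr / PAGE_SIZE;
    uint32_t offset = vaddr % PAGE_SIZE;

    if (vpn >= NUM_PAGES || !pageTable[vpn].valid)
        return false;                    /* no PPN for this VPN */

    pageTable[vpn].reference = true;     /* page was accessed */
    if (is_write)
        pageTable[vpn].dirty = true;     /* page was modified */

    *paddr = (uint32_t)pageTable[vpn].ppn * PAGE_SIZE + offset;
    return true;
}
```

A write to virtual address 4104 (offset 8 on VPN 1) succeeds, yields physical address 3 * 4096 + 8, and leaves pageTable[1].dirty set. Any access to VPN 2 fails because its PTE is not valid.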

## Demand Paging

With demand paging, we don't have to have all of a program's pages in memory at the same time. This is good because the amount of code and data a program uses at any time is typically a small fraction of the total code and data available.

Demand paging depends on the valid bit in PTEs. The valid bit indicates whether the virtual page has a corresponding physical page. If a program generates a virtual address that maps to an invalid PTE, a page fault is triggered, and we do the following:

1. Find an empty physical page
2. Read the requested page into the empty physical page
3. Write the physical page number into the PTE
4. Mark the PTE as valid
5. Tell the program to retry its load or store
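
The steps above can be sketched roughly as follows. Everything here is a toy stand-in: physical memory and the backing store are plain arrays, the sizes are tiny, and the names (physMem, frameUsed, backingStore, find_free_frame, handle_page_fault) are made up for illustration. In particular, this sketch just gives up when no empty physical page is available, and it assumes the page always comes from a single backing store.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_SIZE  256
#define NUM_VPAGES 4
#define NUM_FRAMES 2

struct pte {
    int ppn;
    bool valid;
    bool dirty;
    bool reference;
};

static struct pte pageTable[NUM_VPAGES];          /* zeroed: all invalid */
static char physMem[NUM_FRAMES][PAGE_SIZE];       /* toy physical memory */
static bool frameUsed[NUM_FRAMES];                /* which frames are taken */
static char backingStore[NUM_VPAGES][PAGE_SIZE];  /* toy stand-in for disk */

/* Step 1: find an empty physical page, or return -1 if there is none. */
static int find_free_frame(void)
{
    for (int f = 0; f < NUM_FRAMES; f++)
        if (!frameUsed[f])
            return f;
    return -1;
}

/* Returns true if the faulting load or store can now be retried. */
static bool handle_page_fault(int vpn)
{
    assert(vpn >= 0 && vpn < NUM_VPAGES);

    int frame = find_free_frame();                         /* step 1 */
    if (frame < 0)
        return false;                 /* no empty page; not handled here */
    frameUsed[frame] = true;

    memcpy(physMem[frame], backingStore[vpn], PAGE_SIZE);  /* step 2 */
    pageTable[vpn].ppn = frame;                            /* step 3 */
    pageTable[vpn].valid = true;                           /* step 4 */
    return true;                          /* step 5: caller retries */
}
```

With two physical pages, the first two faults succeed and fill frames 0 and 1; a third fault returns false because this sketch has no way to free up a page.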

Page faults can occur in any segment (code, data, stack, etc). This means that the page requested by the user program could be in a number of different places. Where might we read the requested page from?

## Questions

1. What's a TLB? What is it good for?
2. Suppose I have four physical pages of memory, and I am doing LRU page replacement. What sequence of virtual page accesses will cause a page fault on every access?
3. Is there any point to putting more than 4 GB of RAM in a 32-bit machine?
4. How many 32-bit address spaces can fit in a 64-bit address space? What do you think the page tables for a 64-bit address space look like?
5. When the processor is running user programs, the processor generates virtual addresses that must be translated to physical addresses before going to memory. How can processor caches fit into the picture? What are the tradeoffs involved?