Nachos Userspace Project, Memory Management

February 14, 2007

Lecture Review

  1. Memory Management

Project 2

The version of Nachos you've been given runs a single user program, then shuts down. Your primary task in project 2 is to extend Nachos to run many user programs simultaneously.

This task is split into two parts:

  1. Memory Management: Nachos only knows how to load a single user program into fixed physical memory locations. You'll need to add the ability to load multiple user programs into different physical memory locations (depending on which pages are free).
  2. System Calls: Nachos currently does not allow user programs to start more user programs. You'll need to implement the Exec system call (which is like a combined fork and exec in UNIX) so the first user program that Nachos starts for you can start more processes.
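
To make the "depending on which pages are free" part concrete, here is a minimal sketch of tracking free physical pages and handing them out to a new program. FrameAllocator and BuildPageMap are made-up names for illustration, not Nachos code; in Nachos itself the BitMap class in userprog could play the role of the std::vector<bool> used here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical free-frame tracker (a stand-in for a bitmap of physical pages).
class FrameAllocator {
public:
    explicit FrameAllocator(int numFrames)
        : used_(numFrames, false), free_(numFrames) {}

    // Returns the lowest-numbered free frame and marks it used,
    // or -1 when physical memory is exhausted.
    int Allocate() {
        for (int i = 0; i < static_cast<int>(used_.size()); i++) {
            if (!used_[i]) {
                used_[i] = true;
                free_--;
                return i;
            }
        }
        return -1;
    }

    // Returns a frame to the pool, e.g. when a process exits.
    void Release(int frame) {
        assert(frame >= 0 && frame < static_cast<int>(used_.size()));
        assert(used_[frame]);
        used_[frame] = false;
        free_++;
    }

    int NumFree() const { return free_; }

private:
    std::vector<bool> used_;
    int free_;
};

// Fills pageMap so that pageMap[vpn] is the physical frame backing virtual
// page vpn. Checks the free count up front, so it never has to roll back.
bool BuildPageMap(FrameAllocator& frames, int numPages, std::vector<int>& pageMap) {
    if (numPages > frames.NumFree()) return false;
    pageMap.clear();
    for (int vpn = 0; vpn < numPages; vpn++) {
        pageMap.push_back(frames.Allocate());
    }
    return true;
}
```

With one allocator shared by all address spaces, two programs loaded back to back end up in disjoint frames, and frames released by an exiting program can be reused by the next one.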

These two parts depend heavily on each other, so testing each one in isolation is not easy. You will be tempted to code up both parts, glue them together, fire up Nachos, and watch the fireworks. Resist that temptation: test each part on its own to the best of your ability. You'll have to get creative, but it will be worth it... otherwise you'll be trying to debug two big pieces of untested code simultaneously.

You'll also need to add support for command line arguments, implement the Read and Write syscalls, and handle user program exceptions.

Nachos Userspace

Userspace programs in Nachos are MIPS binaries that run in a MIPS simulator. The Nachos kernel runs x86 instructions and uses memory provided by the underlying OS (Linux). Nachos userspace programs run simulated MIPS instructions and use memory provided by Nachos. The memory that Nachos provides to userspace programs is just an array of bytes (machine->mainMemory) in kernel space.

This stuff can get really confusing, so make sure you wrap your brain around it before you get started. In particular, make sure you understand the transformations that need to occur when a user program invokes Exec. Exec passes the kernel a char* that names the program to run. That pointer is a virtual address in the calling process's address space, so the kernel cannot dereference it directly. The kernel must translate the virtual address to a physical address, and figure out what data is at that physical address. Note that data that is contiguous in a user process's virtual address space may not be contiguous in the physical address space.
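
As a rough illustration of that translation, the sketch below copies a string out of simulated user memory one byte at a time, translating every address so a string that crosses a page boundary still comes out intact. Everything here is a simplified stand-in: PageEntry, Translate, CopyStringFromUser, and the 128-byte page size are assumptions of the sketch, not the actual Nachos definitions.

```cpp
#include <cassert>
#include <string>
#include <vector>

const int kPageSize = 128;  // assumed page size for this sketch

// Simplified page table entry: just a frame number and a valid bit.
struct PageEntry {
    int physicalPage;
    bool valid;
};

// Translates a virtual address to a physical one; returns -1 on a bad address.
int Translate(const std::vector<PageEntry>& pageTable, int vaddr) {
    if (vaddr < 0) return -1;
    int vpn = vaddr / kPageSize;
    int offset = vaddr % kPageSize;
    if (vpn >= static_cast<int>(pageTable.size()) || !pageTable[vpn].valid) {
        return -1;
    }
    return pageTable[vpn].physicalPage * kPageSize + offset;
}

// Copies a NUL-terminated string starting at virtual address vaddr into out.
// Returns false if any address is invalid or no terminator shows up within
// maxLen bytes (so a buggy user program cannot make the kernel loop forever).
bool CopyStringFromUser(const std::vector<unsigned char>& mainMemory,
                        const std::vector<PageEntry>& pageTable,
                        int vaddr, int maxLen, std::string& out) {
    out.clear();
    for (int i = 0; i < maxLen; i++) {
        int paddr = Translate(pageTable, vaddr + i);
        if (paddr < 0 || paddr >= static_cast<int>(mainMemory.size())) {
            return false;
        }
        char c = static_cast<char>(mainMemory[paddr]);
        if (c == '\0') return true;
        out.push_back(c);
    }
    return false;  // too long or unterminated
}
```

Translating per byte is slow but hard to get wrong; translating once per page and copying runs of bytes is a reasonable optimization once the simple version works.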

You should also spend some time thinking about how data needs to move, and how addresses need to be translated to handle command line arguments. Recall that one user process can't read memory from another user process (unless you implement shared memory :). This means that command line arguments will be generated by one user process, and the kernel needs to copy them to the new user process.
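
One possible layout, sketched under several assumptions: the kernel has already copied the argument strings out of the parent, and it now places them at the top of the child's address space with an array of pointers to them just below. The flat byte vector stands in for the child's virtual address space (a real kernel would translate each address before touching physical memory), the little-endian byte order is assumed, and ArgLayout, StoreWord, LoadWord, and PushArgs are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// What the kernel would hand to the new program's main.
struct ArgLayout {
    int argc;
    uint32_t argvAddr;  // virtual address of the argv pointer array
};

// Stores a 32-bit value in little-endian order (an assumption of this sketch).
void StoreWord(std::vector<unsigned char>& space, uint32_t addr, uint32_t value) {
    for (int i = 0; i < 4; i++) {
        space[addr + i] = static_cast<unsigned char>((value >> (8 * i)) & 0xff);
    }
}

uint32_t LoadWord(const std::vector<unsigned char>& space, uint32_t addr) {
    uint32_t v = 0;
    for (int i = 0; i < 4; i++) {
        v |= static_cast<uint32_t>(space[addr + i]) << (8 * i);
    }
    return v;
}

// Copies each string (with its NUL) to the top of the space, then writes a
// word-aligned array of pointers to them, ending with a null pointer.
ArgLayout PushArgs(std::vector<unsigned char>& space,
                   const std::vector<std::string>& args) {
    uint32_t sp = static_cast<uint32_t>(space.size());
    std::vector<uint32_t> addrs;
    for (const std::string& a : args) {
        uint32_t len = static_cast<uint32_t>(a.size()) + 1;
        assert(sp >= len);
        sp -= len;
        std::memcpy(&space[sp], a.c_str(), len);
        addrs.push_back(sp);
    }
    sp &= ~3u;  // align the pointer array to a 4-byte boundary
    uint32_t n = static_cast<uint32_t>(addrs.size());
    assert(sp >= 4 * (n + 1));
    sp -= 4 * (n + 1);
    for (uint32_t i = 0; i < n; i++) {
        StoreWord(space, sp + 4 * i, addrs[i]);
    }
    StoreWord(space, sp + 4 * n, 0);
    return ArgLayout{static_cast<int>(n), sp};
}
```

The pointers stored in the array are addresses in the child's address space, not the parent's, which is the translation step that is easy to miss. The child's stack pointer would then start somewhere below argvAddr.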

Programming in userspace

You'll need to write Nachos userspace programs to verify that Nachos is working correctly. As mentioned before, Nachos userspace programs are MIPS binaries, so you'll be compiling your userspace programs with a special version of gcc that produces MIPS binaries. Userspace programs live in the test directory. The Makefile in that directory takes care of all the messy details of compiling a Nachos userspace program.

Nachos userspace is very, very primitive. It may take you a while to get accustomed to programming in Nachos userspace. The only provided syscall is Halt, so userspace will be severely limited in functionality until you implement more system calls.

You'll be writing all your Nachos userspace programs in C (not C++). Furthermore, the C standard library will not be available (malloc, printf, scanf, strlen, strcmp, etc). Why? The C standard library is built on UNIX system calls. The UNIX write syscall is very different from the Nachos Write syscall, for example. You could port the C standard library to run on Nachos, but it'd be a whole lot of work. You're better off just re-implementing the parts you need.
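
For a sense of how little code the replacements take, here is a sketch of three helpers. The names my_strlen, my_strcmp, and my_itoa are made up for this example, and the code sticks to the common subset of C and C++ with no headers, so it should carry over to a userspace .c file unchanged.

```cpp
/* Length of a NUL-terminated string, not counting the terminator. */
int my_strlen(const char *s) {
    int n = 0;
    while (s[n] != '\0') n++;
    return n;
}

/* Returns 0 if the strings are equal, a negative value if a sorts
   before b, and a positive value otherwise. */
int my_strcmp(const char *a, const char *b) {
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}

/* Writes the decimal form of value into buf, which needs room for at
   least 12 bytes with a 32-bit int. Returns the number of characters
   written, not counting the terminator. */
int my_itoa(int value, char *buf) {
    char tmp[12];
    int n = 0;
    int len = 0;
    /* Work on the magnitude as unsigned so the most negative int is safe. */
    unsigned int mag = value < 0 ? 0u - (unsigned int)value : (unsigned int)value;
    do {
        tmp[n++] = (char)('0' + mag % 10);
        mag /= 10;
    } while (mag != 0);
    if (value < 0) buf[len++] = '-';
    while (n > 0) buf[len++] = tmp[--n];
    buf[len] = '\0';
    return len;
}
```

Once Write works, my_itoa plus Write gets you a poor man's printf for integers, which goes a long way when debugging test programs.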

Also, Nachos userspace programs don't have a heap (unless you build one :), so you'll need to use the stack for all dynamic allocations in userspace programs.

Starting Points

Project 2 builds on the code you wrote in project 1. You will be using the threads and synchronization primitives you built in project 1.

You'll find the following files in the userprog directory:

addrspace.h, addrspace.cc
Study everything here. You'll be making changes here to support multiprogramming. addrspace.cc contains many calls to code in the machine directory. You'll want to take a look at some of the files in the machine directory (I'll provide some pointers below), but you shouldn't change anything in machine unless specifically told to.
bitmap.h, bitmap.cc
You will probably find this useful (like the List class from project 1). Study the interface, but don't worry about the details.
exception.cc
The Nachos exception handler. This is the entry point for all system calls. You'll need to extend this to add support for more system calls.
progtest.cc
This file contains StartProcess, which starts a process. Figuring out how it works will help you implement Exec.
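
Since exception.cc is where every system call enters the kernel, it helps to picture the handler as a dispatch on the syscall number. The sketch below runs against a fake register file rather than the real Machine class. It assumes the convention described in the comments of the stock exception.cc (syscall code in r2, first argument in r4, result written back to r2, program counter advanced before returning). FakeMachine, the register constants, and the syscall codes are placeholders, so check machine.h and syscall.h for the real values.

```cpp
#include <cassert>

// Placeholder register indices and syscall codes for this sketch only.
enum { kResultReg = 2, kArg1Reg = 4, kPCReg = 34, kNextPCReg = 35, kPrevPCReg = 36 };
enum { kHalt = 0, kExec = 2 };

// Minimal stand-in for the simulated machine's register file.
struct FakeMachine {
    int regs[40] = {0};
    int ReadRegister(int r) const { return regs[r]; }
    void WriteRegister(int r, int v) { regs[r] = v; }
};

// Moves the program counter past the syscall instruction. Without this,
// the user program would re-execute the same syscall forever.
void AdvancePC(FakeMachine& m) {
    int next = m.ReadRegister(kNextPCReg);
    m.WriteRegister(kPrevPCReg, m.ReadRegister(kPCReg));
    m.WriteRegister(kPCReg, next);
    m.WriteRegister(kNextPCReg, next + 4);
}

// Handles one syscall trap. Returns false if the machine should halt.
bool HandleSyscall(FakeMachine& m) {
    int code = m.ReadRegister(kResultReg);
    bool keepRunning = true;
    switch (code) {
    case kHalt:
        keepRunning = false;
        break;
    case kExec: {
        int nameVaddr = m.ReadRegister(kArg1Reg);
        // A real handler would copy the program name out of user memory
        // here and start the new process; this stub only checks for null.
        m.WriteRegister(kResultReg, nameVaddr != 0 ? 1 : -1);
        break;
    }
    default:
        m.WriteRegister(kResultReg, -1);  // unknown syscall
        break;
    }
    AdvancePC(m);
    return keepRunning;
}
```

Keeping each case short and pushing the real work into helper functions makes it easier to test the helpers without going through a user program at all.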

You'll find the following files in the machine directory (you shouldn't be changing anything in this directory unless specifically told to):

translate.h, translate.cc
You'll want to understand how most of this code works. In particular, take a look at Machine::Translate, Machine::ReadMem, and Machine::WriteMem.
machine.h, machine.cc
Study the interfaces and check out the public variables in the Machine class (machine->mainMemory, machine->pageTable).
console.h, console.cc
Study the interface. You'll need to talk to the console to implement the Read and Write syscalls.

Questions

  1. Is malloc privileged?
  2. If I have a machine with 32-bit virtual addresses, and the page size is 4 KB, how many bits are allocated for virtual page numbers? What determines the number of entries in a page table? How is a page table used to convert virtual addresses to physical addresses?
  3. When the processor is running user programs, the processor will generate virtual addresses that must be translated to physical addresses before going to memory. How can processor caches fit into the picture? What are the tradeoffs involved?
  4. Draw a diagram that shows how internal fragmentation can be a problem when processes are assigned to fixed-size memory partitions.
  5. Draw a diagram that shows how external fragmentation can be a problem when processes are assigned to variable-size memory partitions.