 |
|  |
 |
 |  |  |
 |
Nachos Project 2: Multiprogramming
Due: Thursday, November 10th, at Midnight.
The second phase of Nachos is to support multiprogramming. As in
the first assignment, we give you some of the code you need and your
job is to complete the system and enhance it.
Up to now, all the code you have written for Nachos has been part
of the operating system kernel. In a real operating system the kernel
not only uses its procedures internally, but also allows user-level
programs to access some of its routines via system calls. An
executing user program is a process. In this project you will modify
Nachos to support multiple processes, using system calls to have
processes request services from the kernel.
Since your kernel does not trust user programs to execute safely,
the kernel and the (simulated) hardware will work together to protect
the system from damage by malicious or buggy user programs. To this
end, you will implement simple versions of key mechanisms found in
real operating system kernels: virtual addressing, protected system
calls and kernel exception handling, and preemptive
timeslicing. Virtual addressing prevents user processes from accessing
kernel data structures or the memory of other programs; your kernel
will use process page tables to safely allow multiple processes to
reside in memory at the same time. With protected system calls and
exceptions, all entries into the kernel funnel through a single kernel
routine, "ExceptionHandler"; you will ''bullet-proof'' this routine so
that buggy or malicious user programs cannot cause the kernel to crash
or behave inappropriately. Finally, your kernel will use preemptive
scheduling to share the simulated CPU fairly among the active user
processes, so that no process can take over the system. All of these
protection mechanisms require cooperation between the hardware and the
operating system kernel software. Your implementation will be based on
"hardware" support in the Nachos MIPS simulator, which resembles a
real MIPS processor.
If all processes are created by other processes, then who creates
the first user process? The operating system kernel creates this
process itself as part of its initialization sequence. This is
bootstrapping. You can "boot" the Nachos kernel by running nachos
with the -x option (x for "execute"), giving the name of an initial
program to run as the initial process. The Nachos release implements
the -x option by calling StartProcess in progtest.c to handcraft the
initial process and execute the initial program within it. The initial
process may then create other processes, which may create other
processes...and so on.
The first step is to read and understand the part of the system we
have written for you. The code that you will have to look at and
modify is spread over several directories. There are some kernel files
in "userprog", a few additional machine simulation files in "machine",
and a stub file system in "filesys". The user programs are in "test",
and utilities to generate a Nachos loadable executable are in
"bin". Since Nachos executes MIPS instructions (and there aren't very
many MIPS machines left!), we also provide you with a cross-compiler. The
cross-compiler runs on Linux and Solaris and compiles user programs
into MIPS format.
The code we provide can run only a single user-level "C" program at
a time. As a test case, we've provided you with a trivial user
program, "halt". All "halt" does is to turn around and ask the
operating system to shut the machine down. To run the "halt" program,
make and then run Nachos in the "userprog" directory:
% cd userprog
% ./nachos -x ../test/halt
Trace what happens as the user program gets loaded, runs, and
invokes a system call.
As with the previous assignment, you may find Narten's Road Map to
Nachos and Nachos
System Call Interface helpful. Also, you can change the
constants in "machine.h" if it helps you design better test
cases. For example, you can change the amount of "physical memory"
allocated by Nachos. Note that the code for this part of the project
is enabled by the USER_PROGRAM compiler macro.
The files for this assignment include:
- progtest.cc
- test routines for running user programs.
- addrspace.h,
addrspace.cc create an address space in which to run a user program,
and load the program from disk.
- syscall.h
the system call interface: kernel procedures that user programs can
invoke.
- exception.cc
the handler for system calls and other user-level exceptions, such as
page faults. In the code we supply, only the halt system call is
supported.
- bitmap.h,
bitmap.cc -- routines for manipulating bitmaps (this might be useful for
keeping track of physical page frames)
- filesys.h,
openfile.h (found in the filesys directory) -- a stub defining the
Nachos file system routines. For this project, we have implemented the
Nachos file system by directly making the corresponding calls to the UNIX
file system; this is so that you need to debug only one thing at a time.
- translate.h,
translate.cc -- translation table routines. In the code we supply,
we assume that every virtual address is the same as its physical address --
this restricts us to running one user program at a time. You will
generalize this to allow multiple user programs to be run concurrently.
We will not ask you to implement virtual memory support until project
3; for now, every page must be in physical memory.
- machine.h,
machine.cc -- emulates the part of the machine that executes user
programs: main memory, processor registers, etc.
- mipssim.cc-- emulates the integer instruction set of a MIPS R2/3000 processor.
- console.h,
console.cc -- emulates a terminal device using UNIX files. A terminal is
(i) byte oriented, (ii) incoming bytes can be read and written at the same
time, and (iii) bytes arrive asynchronously (as a result of user
keystrokes), without being explicitly requested.
In this assignment we are giving you a simulated CPU that models a
real CPU. In fact, the simulated CPU is the same as the real CPU (a
MIPS chip), but we cannot just run user programs as regular UNIX
processes because we want complete control over how many instructions
are executed at a time, how the address spaces work, and how
interrupts and exceptions (including system calls) are handled.
Our simulator can run normal programs compiled from C -- see the
Makefile in the 'test' subdirectory for an example. The compiled
programs must be linked with some special flags, and then converted
into Nachos format using the program ''coff2noff'' (which we
supply). The only caveat is that floating-point operations are not
supported.
- [25 pts] Implement system call handling and multiprogramming. To
support multiprogramming, you must support some of the system calls
defined in syscall.h: Exec, and Exit. You can also
implement thread fork and yield for extra credit (see below). We have
provided you an assembly-language routine, ''syscall'', to provide a
way of invoking a system call from a C routine (UNIX has something
similar -- try 'man syscall'). Note that the routine 'StartProcess' in
progtest.cc may be of use to you in implementing the `exec' system
call.
First, you will need basic facilities to load processes into the
memory of the simulated machine. Spend a few minutes studying the
AddrSpace class, and look at how the StartProcess procedure uses the
AddrSpace class methods to create a new process, initialize its memory
from an executable file, and start the calling thread running user
code in the new process context. The current code for AddrSpace and
StartProcess works OK, but it assumes that there is only one
program/process running at a time (started with StartProcess from main
via the nachos -x option), and that all of the machine's memory is
allocated to that process. Your first job is to generalize this code
to implement the Exec system call for the general case in which
multiple processes are active simultaneously.
Implement a memory manager module to allow your kernel to allocate
page frames of the simulated machine's memory for specific processes,
and to keep track of which frames are free and which are in use
(cf. bitmap.h). You might want to implement a
thread-safe Table class that will help you with your
implementation of AddrSpace. The Table class will store a collection
of untyped object pointers indexed by integers in the range
[0...size-1], and support the following methods:
/* Create a table to hold at most "size" entries. */
Table(int size)
/* Allocate a table slot for "object", returning the
"index" of the allocated entry; otherwise, return -1
if no free slots are available. */
int Alloc(void *object)
/* Retrieve the object from table slot at "index", or NULL
if that slot has not been allocated. */
void *Get(int index)
/* Free the table slot at index. */
void Release(int index)
Modify AddrSpace to allow multiple processes to be resident in the
machine memory at the same time. The default AddrSpace constructor
code assumes that all of the machine memory is free, and it loads the
new process contiguously starting at page frame 0. You must modify
this scheme to use your memory manager to allocate page frames for the
new process, and load the process code and data into those allocated
page frames (which are not typically contiguous). For now it is
acceptable to fail and return an error (0) from Exec if there is not
enough free total machine memory to load the executable file.
Note: to cleanly handle these failures, you will need to move the
AddrSpace loading code out of the constructor and into a new AddrSpace
method that can return an error. It is poor programming practice to
put code that can fail into a class constructor, as the Nachos
designers have done in this release. If necessary, update
StartProcess to use your new AddrSpace interface so that you do not
break the Nachos -x option. Modify AddrSpace to call the memory
manager to release the pages allocated to a process when the process
is destroyed. Make sure your AddrSpace code also releases any frames
allocated to the process in the case where it discovers that it does
not have enough memory to load the entire process.
[Geoff Voelker's Tips on getting started]
- array.c -- sample program testing loading code and data (note the use of Exit)
- [25 pts] Next, use these new facilities to implement the Exec and Exit
system calls as defined in the Nachos
System Call Interface and in userprog/syscall.h. If an executing
user processes requests a system call (by executing a trap
instruction) the machine will transfer control to your kernel by
calling ExceptionHandler in exception.cc. Your kernel code must
extract the system call identifier and the arguments from the machine
registers, decode them, and call internal procedures that implement
the system call. You may impose a reasonable limit on the maximum
size of a file name. Also, use of Machine:ReadMem and Machine:WriteMem
is not forbidden as the comment in machine.h implies.
When an Exec call returns, your kernel should have created a new
process and started a new thread executing within it to run the
specified program. At this point, do not concern yourself with setting
up OpenFileIds. For now, you will be able to run user programs, but
they will not be able to read any input or write any output. For
Exec, you must copy the filename argument from user memory into kernel
memory safely, so that a malicious or buggy user process cannot crash
your kernel or violate security. The filename string address (char*)
passed into the kernel as an argument is a process virtual address; in
order for the kernel to access the filename it must locate the
characters in the kernel address space (i.e., in the machine's
physical "main memory" array) by examining the page table for the
process. In particular, your kernel must handle the case where the
filename string crosses user page boundaries and resides in
noncontiguous physical memory. You must also detect an illegal string
address or a string that runs off the end of the user's address space
without a terminating null character. Your kernel should handle these
cases by returning an error (SpaceId 0) from the Exec system call.
Exec must return a unique process identifier (SpaceId), which can
be used as an argument to Join. Your kernel will need to keep a table
of active processes, and you may find the Table class described above
useful here, too. You may find it convenient to implement process
exit as an internal kernel procedure called by the Exit system call
handler, rather than calling the lower-level procedures directly from
ExceptionHandler. This will make it easier to kill a process from
inside the kernel (e.g., if the process has some kind of fatal error),
by calling the internal exit primitive from another kernel procedure
(e.g., ExceptionHandler) in the target process context. In general,
this kind of careful internal decomposition will save you from
reinventing and redebugging wheels, and it is always good practice.
In your Exit system call hander, print out the status value
passed as the parameter. This will aid in debugging and testing.
As with the Memory Manager, you can declare a global pointer to
the process table and initialize it in StartProcess (since
StartProcess only runs once, and runs before any processes start).
[10 pts] Write test programs that test multiprogramming: Exec,
Exit, and your AddrSpace implementation. You should make sure that
Exec cleanly handles problem cases where the program string argument
is invalid for some reason -- bad string address, does not end in a
null character, and specifies a filename that does not exist. You
should also test the boundaries of AddrSpace -- trying load a program
that doesn't fit into physical memory, being able to load many
programs one after the other (i.e., AddrSpace releases memory when a
process goes away so it can be reused again).
[More of Voelker's tips on getting started]
- [10 pts] The current version of the "exec" system call does not
provide any way for the user program to pass parameters or arguments
to the newly created address space. UNIX does allow this, for
instance, to pass in command line arguments to the new address
space. Implement this feature.
There are two steps to get this done. First, you need a new
signature for Exec so that you can give it the arguments to be passed
to the new process. Second, you need to figure out how to pass those
arguments to that process.
1. Change the signature of Exec to be the following:
SpaceID Exec(char *name, int argc, char **argv, int willJoin);
"name" is the filename as before, "argc" is the number of
arguments, "argv" is the array of argument strings, and "willJoin"
determines whether this process will be joined by another. The
semantics of argc and argv are exactly like standard C. The semantics
of the "willJoin" argument are the same as for Nachos threads --
non-zero means that a process will eventually Join on the one being
Exec'ed. If you do not implement the Join system call, you should
pass in "0" for this argument.
You will need to change the declaration of Exec in syscall.h, and
you will need to change your test programs to use this new version of
Exec. If a program does not pass in arguments to the new process,
just pass in zero for argc, argv, and willJoin.
2. To pass in the arguments to the created process, you need to do
two things. First, you need to copy the arguments into the process'
address space. Second, you need to setup the machine so that, when
the process starts executing, the argc and argv arguments are passed
into the main() routine of the program.
NOTE: When you construct the arguments to the child process,
include as the first argument (argv[0]) the program name (the "name"
argument passed as the first argument to Exec). This follows C
conventions.
Before you can copy in the arguments, you need to create some
space for them. To do this, grow the process' virtual address space
by some amount (e.g., just like the address space is grown to
accomodate the user stack). You decide how much (and document this in
your writeup), and make sure that you check that the arguments do not
overflow this region. Don't make it too large, though, or you will
consume too much physical memory. Then copy the arguments into that
region of the virtual address space (this is the opposite of reading
system call arguments out of the virtual address space).
To setup the machine so that the process is started with the
arguments, you can assume that the arguments to main use the same set
of registers as the arguments passed in from system calls (i.e., same
calling convention). The comments at the top of ExceptionHandler
describe which registers to use.
NOTE: To test this, you will need to create a test program that
has arguments in its main routine:
int
main(int argc, char *argv[])
{
// e.g., check to make sure that the arguments are valid...
}
You do not need to support command line arguments to the first
process that is started when nachos starts, just processes that are
Exec()ed.
- [20 pts] Implement the "read" and "write" system
calls for console reads and writes (not for file reads and writes).
To support the system calls that access the console device, you will
probably find it helpful to implement a ''SynchConsole'' class that
provides the abstraction of synchronous access to the console (i.e., a
read on the SynchConsole blocks the thread until a character is
available). ''progtest.cc'' has the beginnings of a SynchConsole
implementation. You might find it useful to peek into the
filesys/synchdisk.{cc,h} files to use the SynchDisk class as a further
example.
An example program that uses Read and Write that is slightly less
complicated than shell is echo.
- [10 pts] Bullet-proof your kernel. Implement the Nachos kernel
code to handle user program exceptions that are not system calls. The
machine (or simulator) raises an exception whenever it is unable to
execute the next user instruction, e.g., because of an attempt to
reference an illegal address, a privileged or illegal instruction or
operand, or an arithmetic underflow or overflow condition. The
kernel's role is to handle these exceptions in a reasonable way, i.e.,
by printing an error message and killing the process rather than
crashing the whole system. Note: an ASSERT that crashes Nachos is a
reasonable response to a bug within your Nachos kernel, but it not an
acceptable response to a user program exception.
Create user-level test programs to verify that your exception
handling code works properly, both for a single process and for
Exec'ed processes. You do not have to create programs to test every
exception -- AddressErrorException and IllegalInstrException are
sufficient.
- (Extra Credit) [4 pts] Implement the "join" system call. Join
takes one argument, the SpaceId of the process that the caller wants
to wait for. The process to be Joined must have been Exec'ed with its
willJoin argument set to non-zero. Join returns the process status
(Exit) code of the process that was Joined. Return -65535 from Join
to signify an error (e.g., the SpaceId passed to the Join system call
is invalid). Demonstrate that your implementation of Join works
correctly with test programs.
- (Extra Credit) [4 pts] Implement interprocess communication using
pipes (requires support for "read" and "write" system calls). This
will redirect the output of one process to the input of another.
Since you are not required to implement file reads and writes, this
will be the only way to pass information to a process, aside from
command line arguments. An example of a unix pipe would be "program1
| program2". In this example, program1 would normally output text to
the standard output, but the pipe will redirect this output to the
input of program2. program2 would normally take its input from a
file. This problem will require some imagination since files do not
exist, but should not be too difficult to implement.
When implementing pipes, implement them according to the
specification in the Duke Nachos pages: Pipes.
The only twist is that we are going to have to overload the "willJoin"
parameter to Exec to pass in values for "pipectrl". Change your Exec
implementation to interpret willJoin as follows:
willJoin & 0x1: child will be Joined
willJoin & 0x2: bind stdout to a pipe
willJoin & 0x4: bind stdin to a pipe
In other words, for a child that will not be Joined willJoin=0x2
corresponds to pipectrl=1, willJoin=0x6 corresponds to pipectrl=2, and
willJoin=0x4 corresponds to pipectrl=3.
Note: You do not need to support pipes for the first process that
starts up with Nachos.
- (Extra Credit) [4 pts] Implement multithreaded user
programs. Implement the thread fork() and yield() system calls to
allow a user program to fork a thread to call a routine in the same
address space, and then ping-pong between threads. (Hint: you
will need to change the kernel to allocate memory in the user address
space for each thread stack.)
Submitting The Project
As with project 1, we would like a short paper write-up describing
what changes you made, how well they worked, how you tested your
changes, and how each group member contributed to the project. Please
include this writeup in your userprog directory. As with
project 1, use CVS to turn in your project:
% cvs rtag -F project2 nachos-3.4
 |  |
 | back to top ^
 |
 |
|  |