CSE 221: Homework 2

Winter 2017

Hardcopy due Thursday, February 23 at the start of class

Answer the following questions. For questions asking for short answers, there may not necessarily be a "right" answer, although some answers may be more compelling and/or much easier to justify. But I am interested in your explanation (the "why") as much as the answer itself. Also, do not use shorthand: write your answers using complete sentences.

When grading homeworks, we will grade one question in detail and assign full credit for technical answers to the others.

  1. Exokernel and L4 represent contemporary approaches for providing protection and extensibility. Xen represents a contemporary approach for providing virtualization and isolation (or, alternately, is an extreme version of extensibility since it goes even beyond Exokernel in exposing the hardware interface to unprivileged code). Consider a Web server as a motivating application-level service running on each of these three system structures, each hosting the OS described in its respective paper.

    For each of the three systems, consider the path a network packet containing an HTTP request takes as it travels from the network interface card to a Web server process running at user level:

    1. Identify the various protection domains in the system for this scenario. Which domains are privileged, and which are unprivileged? (Feel free to draw "boxes-and-kernel-boundary" diagrams if you find them helpful.)

      For example, if the system were standard monolithic Linux, the protection domains would be the kernel and the Web server process with its address space. The kernel is privileged, and the server process unprivileged.

    2. Describe the journey of the packet as a sequence of steps through the protection domains identified above. For each protection domain crossing, state the communication mechanism used for that packet to cross protection domains.
    3. Argue which of these systems will likely provide the highest performance Web service without violating protection (e.g., not simply moving the Web server code into the kernel and running it in privileged mode). Justify your argument and be sure to state any assumptions you make.
    4. Further consider the Web server process triggering a page fault on a page in its address space. As with the network packet, trace the propagation of the page fault through protection domains. Which domain handles the page fault? Whose pool of physical memory is used to satisfy the page fault?

      For example, if the system were standard monolithic Linux, the CPU would raise a page-fault exception, suspending the Web server process, and vector to the Linux kernel's page-fault handler. The handler would allocate a physical page from Linux's free physical page list and update the page table entry with the valid mapping. The Linux kernel would then return from the exception, and the faulting instruction would be retried.
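      The Linux example above can be sketched as a toy simulation. (The `free_list`, `page_table` array, and `handle_page_fault` names here are invented stand-ins for illustration, not Linux's actual data structures or handler.)

      ```c
      #include <stdbool.h>
      #include <stdio.h>

      #define NPAGES  8   /* virtual pages in the toy address space */
      #define NFRAMES 4   /* physical frames in the kernel's free pool */

      /* A page-table entry: a valid bit plus the physical frame number. */
      typedef struct { bool valid; int frame; } pte_t;

      static pte_t page_table[NPAGES];           /* all invalid initially */
      static int   free_list[NFRAMES] = {0, 1, 2, 3};
      static int   free_top = NFRAMES;           /* frames still free */

      /* Invoked when the hardware vectors to the fault handler: take a
       * frame from the kernel's free pool, install a valid mapping, and
       * "return from the exception" so the faulting access is retried. */
      static bool handle_page_fault(int vpage) {
          if (free_top == 0)
              return false;                      /* pool empty: would evict */
          page_table[vpage].frame = free_list[--free_top];
          page_table[vpage].valid = true;
          return true;
      }

      int main(void) {
          handle_page_fault(5);                  /* process touches page 5 */
          printf("page 5 -> frame %d, valid=%d\n",
                 page_table[5].frame, page_table[5].valid);
          return 0;
      }
      ```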

  2. The Levy and Lipman paper on VAX/VMS virtual memory management states that the stack used by the operating system for servicing user process system calls (running in kernel mode) resides in the user-level address space (the typical practice today is to allocate such a stack in the OS address space):
    "The P1 region [user address space] also contains fixed-sized stacks for use by executive code that executes on behalf of the process." (p. 37)

    This arrangement means that the user-level process has access to the memory region storing stack frames used by the kernel, including local variables with pointers to kernel data structures on the stack as well as return addresses that control where the kernel will execute when returning from a procedure call. Assume such stacks are mapped with read/write access in the user-level address space.

    1. Why do you think they allocated kernel stacks in the user-level portion of the address space?
    2. Why is this arrangement safe (does not violate user/kernel protection) given the process model described in the VAX/VMS paper?
    3. Modern operating systems like Linux allocate kernel stacks in the address space of the OS. Why is it necessary to do so to maintain safety? (Hint: Consider how the process model has changed from VAX/VMS to modern Linux.)

  3. In contrast to blocking locks as provided by semaphores and monitors, a different form of synchronization is known as "non-blocking" or "wait-free" synchronization. With non-blocking synchronization, threads accessing a shared data structure do not block each other. (Linux uses a related kind of synchronization known as read-copy update (RCU), which can have very efficient implementations.)

    The general idea is that a shared data structure is accessed via a single shared pointer. Threads just reading the data structure do not acquire a lock and are not blocked by other readers or writers. They happily use the value of that pointer to read data from the data structure. Threads that write (update) the data structure follow these steps:

    1. Allocate a separate, private instance of the data structure.
    2. Copy the old, shared version of the data structure into the new, private instance (copying just requires reads from the shared version).
    3. Update (write to) the new, private instance as needed.
    4. Atomically change the pointer referencing the data structure to point to the new instance instead of the old. Any previous reader threads referencing the old data structure continue to read from the old version unharmed. Any new reader threads will use the new instance of the data structure and see the updates from the writer.

      This handles the case of a single writer. If there can be multiple writers, then the last step needs to be modified: When the writing thread goes to update the shared pointer, it checks to see if the data structure has been modified since it made a copy (e.g., because a second writer changed the value of the shared pointer before this one was able to); if it has, abort and go back to step 2. (In practice, the check and the pointer update are performed together as a single atomic compare-and-swap on the shared pointer.)
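    The steps above, including the multi-writer retry, can be sketched in C11 atomics. (The `table_t` structure and function names are invented for illustration; a real implementation must also defer freeing the old instance until no reader can still hold a pointer to it, which this sketch simply leaks.)

    ```c
    #include <stdatomic.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical shared data structure: a small fixed-size table. */
    typedef struct {
        int counts[4];
    } table_t;

    /* The single shared pointer through which all threads access it. */
    static _Atomic(table_t *) shared = NULL;

    /* Reader: just dereference the shared pointer; no lock is taken. */
    static int read_count(int i) {
        table_t *t = atomic_load(&shared);
        return t->counts[i];
    }

    /* Writer: allocate (step 1), copy (step 2), update (step 3), then
     * atomically swing the pointer (step 4). The compare-and-swap fails
     * if another writer published first; on failure we abort and retry
     * from the copy step. */
    static void increment(int i) {
        table_t *nu = malloc(sizeof *nu);      /* step 1: private copy */
        table_t *old;
        do {
            old = atomic_load(&shared);        /* snapshot the pointer */
            memcpy(nu, old, sizeof *nu);       /* step 2: copy old state */
            nu->counts[i] += 1;                /* step 3: update copy */
        } while (!atomic_compare_exchange_weak(&shared, &old, nu)); /* 4 */
        /* NOTE: `old` is leaked here; real RCU defers reclamation until
         * all readers of the old version have finished. */
    }

    int main(void) {
        table_t *init = calloc(1, sizeof *init);
        atomic_store(&shared, init);
        increment(2);
        increment(2);
        printf("%d\n", read_count(2));
        return 0;
    }
    ```

    The `_weak` variant of compare-and-swap may fail spuriously on some architectures, which is harmless here because it already sits in a retry loop.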

    Assuming these semantics, answer the following questions with a brief explanation.

    1. What kinds of thread workloads (mix of reader and writer threads) work particularly well with this kind of synchronization?
    2. Does this kind of synchronization work equally well for all kinds of data structures, or does it work better for some than others?
    3. Can readers starve other readers?
    4. Can writers starve readers?
    5. Can readers starve writers?
    6. Can writers starve writers?
    7. Can deadlock occur?