CSE 120: Homework #2 Solutions

Fall 2020

  1. The Intel x86 instruction set architecture provides an atomic instruction called XCHG for implementing synchronization primitives. Semantically, XCHG works as follows (although keep in mind it is executed atomically by the CPU):
    void XCHG (bool *X, bool *Y) {
      bool tmp = *X;
      *X = *Y;
      *Y = tmp;
    }
    

    Show how XCHG can be used instead of test-and-set to implement the acquire() and release() functions of the spinlock data structure described in the "Synchronization" lecture.

    struct lock {
        bool isAvailable = TRUE;   /* TRUE when no thread holds the lock */
    };
    
    void acquire (struct lock* p_lock) {
        bool isAvailable = FALSE;
        /* Atomically swap the local FALSE into the lock. If the lock was
           available, the old TRUE lands in isAvailable and we exit holding
           the lock; otherwise we swapped FALSE for FALSE and spin. */
        while (!isAvailable) {
            XCHG (&(p_lock->isAvailable), &isAvailable);
        }
    }
    
    void release (struct lock* p_lock) {
        p_lock->isAvailable = TRUE;
    }
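
    For reference, the same swap-based loop is easy to run in Java: AtomicBoolean.getAndSet is an atomic exchange between a value and memory, so it plays the role of XCHG here. A minimal sketch (the SpinLock class name is just for illustration):

    import java.util.concurrent.atomic.AtomicBoolean;

    // Illustrative spinlock; "true" means the lock is available.
    class SpinLock {
        private final AtomicBoolean isAvailable = new AtomicBoolean(true);

        void acquire () {
            // getAndSet(false) atomically stores false and returns the old
            // value, exactly like XCHG with a local FALSE.
            while (!isAvailable.getAndSet(false)) {
                // spin: the lock was already held, keep trying
            }
        }

        void release () {
            isAvailable.set(true);
        }
    }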
    

  2. One of the goals of this question is to give you practice with context switching and thread queue manipulation in Nachos. Consider the following test program for an implementation of KThread.join in Nachos. It begins when the main Nachos thread calls KThread.selfTest. You do not need to know the details of how join is implemented. All you need to know is that when a parent thread calls join on a child thread, the parent does one of two things: (1) if the child is still running, the parent blocks until the child finishes (at which point the parent is placed on the ready queue); (2) if the child has finished, the parent continues to execute without blocking. Assume join uses a wait queue of some kind in its implementation.
    private static class A implements Runnable {
        A () {}
        public void run () {
            KThread t2 = new KThread (new B()).setName ("B");
            System.out.println ("foo");
            t2.fork ();
            System.out.println ("far");
            t2.join ();
            System.out.println ("fum");
        }
    }
        
    private static class B implements Runnable {
        B () {}
        public void run () {
            System.out.println ("fie");
        }
    }
    
    public static void selfTest() {
        KThread t1 = new KThread (new A()).setName ("A");
        System.out.println ("fee");
        t1.fork ();
        System.out.println ("foe");
        t1.join ();
        System.out.println ("fun");
    }
    

    Assume that the scheduler runs threads in FIFO order with non-preemptive scheduling (no preemptive time-slicing), and threads are placed on wait queues in FIFO order. Trace the execution of this program until it returns from selfTest and (a) write the sequence of context switches that occurred up to this point, (b) write the output of the program, and (c) list the queues that the threads are on, and their relative order if more than one thread is on a queue.

        Trace: main prints "fee", forks A, prints "foe", and then blocks in
        t1.join() because A has not finished. A prints "foo", forks B, prints
        "far", and then blocks in t2.join() because B has not finished. B
        prints "fie" and finishes, moving A back to the ready queue. A prints
        "fum" and finishes, moving main back to the ready queue. Finally,
        main prints "fun" and returns from selfTest.

        a. Context switches: main → A → B → A → main
        b. Output: fee foe foo far fie fum fun
        c. Thread queues when selfTest returns:
              currentThread: main
              readyQueue: (nothing, A and B have finished)
              join wait queue: (nothing, A and B have finished)

  3. A common pattern in parallel scientific programs is to have a set of threads do a computation in a sequence of phases. In each phase i, all threads must finish phase i before any thread starts computing phase i+1. One way to accomplish this is with barrier synchronization. At the end of each phase, each thread executes Barrier::Done(n), where n is the number of threads in the computation. A call to Barrier::Done blocks until all of the n threads have called Barrier::Done. Then, all threads proceed. You may assume that the process allocates a new Barrier for each iteration, and that all threads of the program will call Done with the same value.

    a. Write a monitor that implements Barrier using Mesa semantics.

    monitor Barrier {
      int called = 0;          // threads that have arrived at the barrier
      Condition barrier;

      void Done (int needed) {
        called++;
        if (called == needed) {
          // Last arrival releases everyone. Resetting called is safe
          // because a new Barrier is allocated for each iteration.
          called = 0;
          barrier.wakeAll();
        } else {
          // Mesa semantics: a woken thread simply resumes here and returns.
          barrier.sleep();
        }
      }
    }
    

    b. Implement Barrier using an explicit lock and condition variable. The lock and condition variable have the semantics described at the end of the "Semaphore and Monitor" lecture in the ping_pong example, and as implemented by you in Project 1.

    class Barrier {
      int called = 0;
      Lock lock;
      Condition barrier;
    
      Barrier () {
        lock = new Lock();
        barrier = new Condition(lock);
      }
    
      void Done (int needed) {
        lock.acquire();
    
        called++;
        if (called == needed) {
          called = 0;
          barrier.wakeAll();
        } else {
          // sleep atomically releases the lock while blocked and
          // reacquires it before returning, as in Project 1.
          barrier.sleep();
        }
    
        lock.release();
      }
    }
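
    The same structure translates almost directly to standard Java using ReentrantLock and Condition from java.util.concurrent.locks. A sketch under the problem's assumption of a fresh Barrier per phase; the while loop around await() guards against the spurious wakeups the Java API permits, and the released flag (an addition, not in the pseudocode above) replaces re-checking called:

    import java.util.concurrent.locks.Condition;
    import java.util.concurrent.locks.ReentrantLock;

    class JavaBarrier {
        private final ReentrantLock lock = new ReentrantLock();
        private final Condition barrier = lock.newCondition();
        private int called = 0;
        private boolean released = false;   // set once all threads arrive

        // Each of the n worker threads calls done(n) at the end of a phase.
        void done (int needed) throws InterruptedException {
            lock.lock();
            try {
                called++;
                if (called == needed) {
                    released = true;
                    barrier.signalAll();    // wake every waiting thread
                } else {
                    while (!released) {     // re-check after any wakeup
                        barrier.await();
                    }
                }
            } finally {
                lock.unlock();
            }
        }
    }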
    

  4. Microsoft .NET provides a synchronization primitive called a CountdownEvent. Programs use CountdownEvent to synchronize on the completion of many threads (similar to CountDownLatch in Java). A CountdownEvent is initialized with a count, and a CountdownEvent can be in two states, nonsignalled and signalled. Threads use a CountdownEvent in the nonsignalled state to Wait (block) until the internal count reaches zero. When the internal count of a CountdownEvent reaches zero, the CountdownEvent transitions to the signalled state and wakes up (unblocks) all waiting threads. Once a CountdownEvent has transitioned from nonsignalled to signalled, the CountdownEvent remains in the signalled state. In the nonsignalled state, at any time a thread may call the Decrement operation to decrease the count and Increment to increase the count. In the signalled state, Wait, Decrement, and Increment have no effect and return immediately.

    Use pseudo-code to implement a thread-safe CountdownEvent using locks and condition variables by implementing the following methods:

    class CountdownEvent {
      ...private variables...
      CountdownEvent (int count) { ... }
      void Increment () { ... }
      void Decrement () { ... }
      void Wait () { ... }
    }
    

    class CountdownEvent {
      int counter;
      bool signalled;
      Lock lock;
      Condition cv;
    
      CountdownEvent (int count) {
        counter = count;
        if (counter > 0) {
          signalled = false;
        } else {
          signalled = true;
        }
        lock = new Lock();
        cv = new Condition(lock);
      }
    
      void Increment () {
        lock.acquire();
        if (signalled == false) {
          counter++;
        } // otherwise do nothing if already signalled
        lock.release();
      }
    
      void Decrement () {
        lock.acquire();
        if (signalled == false) {
          counter--;
          if (counter == 0) {
            signalled = true;
            cv.wakeAll();
          }
        } // otherwise do nothing if already signalled
        lock.release();
      }
    
      void Wait () {
        lock.acquire();
        // A while loop re-checks the condition after waking; signalled
        // never goes back to false, but this is robust even if the
        // condition variable allows spurious wakeups.
        while (signalled == false) {
          cv.sleep();
        }
        lock.release();
      }
    }
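
    For comparison, Java's CountDownLatch (mentioned in the problem) implements the Wait/Decrement half of this interface: it has no Increment, and like a CountdownEvent it stays signalled once the count reaches zero. A short usage sketch:

    import java.util.concurrent.CountDownLatch;

    public class LatchDemo {
        public static void main (String[] args) throws InterruptedException {
            int workers = 3;
            CountDownLatch latch = new CountDownLatch(workers);  // like CountdownEvent(3)

            for (int i = 0; i < workers; i++) {
                new Thread(() -> {
                    // ... do some work ...
                    latch.countDown();          // Decrement
                }).start();
            }

            latch.await();                      // Wait: blocks until count is 0
            System.out.println("all workers finished");
        }
    }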
    
    

  5. [Silberschatz]   Consider a system running ten I/O-bound tasks and one CPU-bound task. Assume that the I/O-bound tasks issue an I/O operation once for every millisecond of CPU computing and that each I/O operation takes 10 milliseconds to complete. Also assume that the context-switching overhead is 0.1 millisecond and that all processes are long-running tasks. What is the CPU utilization for a round-robin scheduler when:

    1. The time quantum is 1 millisecond

      In one iteration of the steady-state behavior, a round-robin scheduler runs the CPU-bound task once for 1 ms and each of the ten I/O-bound tasks once for 1 ms each. There are 11 context switches with 0.1 ms overhead each, so one iteration takes 1 + (10 * 1) + (11 * 0.1) = 12.1 ms. The CPU does useful work for 11 ms of that, so the CPU utilization is 11 / 12.1 ≈ 91%.

    2. The time quantum is 10 milliseconds

      In one iteration of the steady-state behavior, a round-robin scheduler runs the CPU-bound task once for the full 10 ms quantum and each of the ten I/O-bound tasks once for 1 ms each (they block for I/O before exhausting their quantum). There are 11 context switches with 0.1 ms overhead each, so one iteration takes 10 + (10 * 1) + (11 * 0.1) = 21.1 ms. The CPU does useful work for 20 ms of that, so the CPU utilization is 20 / 21.1 ≈ 95%.
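
    Both parts use the same formula: utilization = useful CPU time / (useful CPU time + context-switch overhead) per steady-state iteration. A quick check of the arithmetic in Java:

    public class Utilization {
        // One steady-state round: the CPU-bound task runs for cpuBurst ms,
        // each of the 10 I/O-bound tasks runs for 1 ms, and each of the
        // 11 dispatches costs 0.1 ms of context-switch overhead.
        static double utilization (double cpuBurst) {
            double useful = cpuBurst + 10 * 1.0;
            return useful / (useful + 11 * 0.1);
        }

        public static void main (String[] args) {
            System.out.printf("quantum 1 ms:  %.0f%%%n", 100 * utilization(1));   // 91%
            System.out.printf("quantum 10 ms: %.0f%%%n", 100 * utilization(10));  // 95%
        }
    }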

  6. Annabelle, Bertrand, Chloe and Dag are working on their term papers in CSE 120, which is a 10,000 word essay on My All-Time Favorite Race Conditions. To help them work on their papers, they have one dictionary, two copies of Roget's Thesaurus, and two coffee cups.

    Consider the following state:

    Is the system deadlocked in this state? Explain using a resource allocation graph.

    The resource allocation graph for this state can be fully reduced. Bertrand is not blocked, so his edges can be erased first; erasing them unblocks Chloe and Dag, and erasing their edges in turn unblocks Annabelle. Since the graph reduces completely, the system is not deadlocked.
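
    The reduction itself is mechanical: repeatedly find a process whose outstanding requests can all be granted from free units, erase its edges (it finishes and releases what it holds), and repeat; the state is deadlock-free iff every process can be erased. A generic sketch in Java (the maps are placeholders to be filled in from the problem's state, which is not reproduced here; it also simplifies to one unit per request edge):

    import java.util.*;

    public class GraphReduction {
        // holds: process -> resources it currently holds
        // wants: process -> resources it is blocked waiting for
        // units: resource -> total number of units (e.g., 2 thesauruses)
        static boolean isDeadlocked (Map<String, Set<String>> holds,
                                     Map<String, Set<String>> wants,
                                     Map<String, Integer> units) {
            Map<String, Integer> free = new HashMap<>(units);
            holds.values().forEach(rs ->
                rs.forEach(r -> free.merge(r, -1, Integer::sum)));

            Set<String> remaining = new HashSet<>(holds.keySet());
            boolean progress = true;
            while (progress) {
                progress = false;
                for (String p : new ArrayList<>(remaining)) {
                    // p is erasable if every resource it wants has a free unit
                    boolean erasable = wants.getOrDefault(p, Set.of()).stream()
                            .allMatch(r -> free.getOrDefault(r, 0) > 0);
                    if (erasable) {
                        // erase p: it finishes and releases what it holds
                        holds.getOrDefault(p, Set.of())
                             .forEach(r -> free.merge(r, 1, Integer::sum));
                        remaining.remove(p);
                        progress = true;
                    }
                }
            }
            return !remaining.isEmpty();   // anyone left can never run
        }
    }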