Evaluations: 4/20

Bianca Zadrozny (zadrozny@cs.ucsd.edu)
Thu, 20 Apr 2000 02:36:23 -0700

StarOS, a Multiprocessor Operating System for the Support of Task Forces

This paper describes StarOS, an object-oriented operating system that exploits the potential of a multiprocessor architecture, the Cm* multiprocessor computer. The notion of a task force is introduced: a large set of small processes that cooperate to execute a task, maximizing the use of available parallelism. The main form of interprocess communication is message passing, which gives processes the freedom to decide when to wait for a message, thereby enhancing the opportunities for concurrency.
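The "decide when to wait" pattern can be illustrated with a minimal sketch. This is not the StarOS interface; the Mailbox class and its method names are hypothetical, standing in for a mailbox a process polls while it still has useful work, falling back to a blocking receive only when it truly needs the message.

```python
import queue

# Hypothetical mailbox sketch; names are illustrative, not StarOS's API.
class Mailbox:
    def __init__(self):
        self._q = queue.Queue()

    def send(self, msg):
        self._q.put(msg)

    def poll(self):
        """Non-blocking receive: return a message or None, so the
        caller can keep computing instead of waiting."""
        try:
            return self._q.get_nowait()
        except queue.Empty:
            return None

    def receive(self):
        """Blocking receive: wait until a message arrives."""
        return self._q.get()

box = Mailbox()
assert box.poll() is None   # nothing yet; the process could do other work
box.send("result")
assert box.receive() == "result"
```

The point of the split interface is exactly the concurrency benefit the review notes: a process overlaps its own computation with the sender's, checking in only when convenient.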

The simpler functions of StarOS are implemented in a nucleus that is duplicated in every computer module. Each of these functions is synchronous, i.e., the caller is blocked waiting for it to finish. The more sophisticated components, such as the file system, are implemented as task forces and provide asynchronous functions. Although it would be possible to create a task force that automatically grows (or shrinks) its number of processes according to the workload, the current implementation of the OS does not structure its components that way.

In StarOS, all access to information goes through objects. For example, memory is accessed by mapping a "basic" object (which provides read and write functions) onto it. Capabilities are used to protect access to objects. One interesting characteristic of StarOS is that even the nucleus cannot access an object for which it does not hold a capability. This is a strict enforcement of the "need-to-know" principle.
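The capability model can be sketched as follows. This is a simplified illustration, not the StarOS object system: the Capability and BasicObject names and the rights strings are assumptions, but the rule is the one the review describes: every operation must present a capability naming the object and granting the needed right, with no privileged bypass.

```python
# Illustrative capability check; names and structure are hypothetical.
class Capability:
    def __init__(self, obj, rights):
        self.obj = obj
        self.rights = frozenset(rights)

class BasicObject:
    """A memory-like object whose read/write are guarded by capabilities."""
    def __init__(self, size):
        self._data = bytearray(size)

    def read(self, cap, offset):
        if cap.obj is not self or "read" not in cap.rights:
            raise PermissionError("no read capability for this object")
        return self._data[offset]

    def write(self, cap, offset, value):
        if cap.obj is not self or "write" not in cap.rights:
            raise PermissionError("no write capability for this object")
        self._data[offset] = value

mem = BasicObject(16)
rw = Capability(mem, {"read", "write"})
ro = Capability(mem, {"read"})
mem.write(rw, 0, 42)
assert mem.read(ro, 0) == 42
try:
    mem.write(ro, 0, 7)        # read-only capability: rejected
except PermissionError:
    pass
```

Because even the "kernel" path in this sketch goes through the same check, a caller without a capability simply cannot name the object, which is the need-to-know property.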

In my view, there are two problems with the design proposed in the paper. First, the Kmap has "first refusal" of each request to execute a StarOS instruction. This means that every request to execute a StarOS instruction must first be sent to the cluster's Kmap, even if the local computer module has the code to execute it. The Kmap is therefore a point of contention in the system. Second, the system does not require each process to execute only local code. Since code is the most frequently read kind of information in memory, and accessing another processor's memory is inefficient, this has the potential to make the system very slow.


Medusa: An Experiment in Distributed Operating System Structure

This paper describes the design of Medusa, a distributed operating system for the Cm* multiprocessor that tries to achieve modularity, performance and robustness. Modularity is achieved by structuring the system components as task forces, which are composed of a large number of small activities that may execute concurrently.

Since the Cm* multiprocessor is composed of a large number of small processors, a large number of small concurrent activities can be executed very efficiently. Moreover, Medusa's scheduling algorithms promote the coscheduling of activities from the same task force, improving system performance. Also, processors are restricted to executing only local code, which significantly reduces the number of slow non-local memory accesses. Other design decisions also attempt to improve performance, but in some cases they reduce the system's flexibility or elegance.
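The coscheduling idea can be sketched as a gang scheduler: activities of the same task force are placed into the same time slice, so peers that communicate are running simultaneously rather than waiting on a descheduled partner. This greedy packing is my own illustration, not Medusa's actual algorithm, and the task-force names are invented.

```python
# Hypothetical gang-scheduling sketch: pack whole task forces into
# time slices so each task force's activities run together.
def coschedule(task_forces, num_processors):
    """task_forces: list of (name, num_activities).
    Returns a list of time slices, each a list of task-force names
    whose activities run concurrently in that slice."""
    slices = []
    current, used = [], 0
    # Place larger task forces first to reduce fragmentation.
    for name, size in sorted(task_forces, key=lambda tf: -tf[1]):
        if size > num_processors:
            raise ValueError(f"{name} does not fit on the machine")
        if used + size > num_processors:   # start a new time slice
            slices.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        slices.append(current)
    return slices

# Five processors: "mail" (4 activities) gets a slice to itself;
# "fs" (3) and "editor" (2) share the next one.
print(coschedule([("fs", 3), ("editor", 2), ("mail", 4)], 5))
```

The key invariant is that a task force is never split across slices, which is what lets its activities exchange messages without stalling.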

Robustness is achieved by replicating activities within the system utilities. Each utility dynamically grows (or shrinks) its number of activities according to the workload, but at least two replicated activities are guaranteed to exist for each utility. Furthermore, deadlocks are avoided by dividing the utilities into service classes and disallowing circular dependencies between service classes. Each service class is given its own set of resources.
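The no-circularity rule amounts to requiring that the service-class dependency graph be acyclic, which can be checked with standard cycle detection. The sketch below is my own illustration; the example graphs and class names are hypothetical, not Medusa's actual utility structure.

```python
# Sketch of the deadlock-avoidance rule: service classes form a
# dependency graph that must contain no cycles.
def has_cycle(deps):
    """deps maps a service class to the classes it may call into."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in deps}

    def visit(node):
        color[node] = GRAY
        for nxt in deps.get(node, ()):
            if color.get(nxt, WHITE) == GRAY:
                return True                  # back edge: circular dependency
            if color.get(nxt, WHITE) == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in deps)

layered = {"file_system": ["memory_manager"], "memory_manager": []}
circular = {"a": ["b"], "b": ["a"]}
assert not has_cycle(layered)      # legal layering
assert has_cycle(circular)         # would be rejected
```

With each class also holding its own resource pool, a class can never block waiting, directly or transitively, on a resource held by something that is waiting on it, so the circular-wait condition for deadlock cannot arise.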

In my opinion, this paper is very well written. The set of goals for the system is presented in the introduction, and each design decision is explained in terms of those goals. What I liked most about the paper was the idea of making the system adapt itself to its workload.