V & Sprite Evaluations.

Thu, 4 May 2000 04:06:02 EDT

The Distributed V Kernel and its Performance for Diskless Workstations

The V kernel is a distributed operating system developed to support uniform local and remote interprocess communication. Each workstation is diskless, with all necessary storage provided by backend file servers connected by high-speed local network links. In general, diskless workstations are viewed as incurring overhead, because of the extra time an application needs to transfer the data it requires. However, diskless workstations have the advantage of imposing little overhead on the workstation itself for file-system and disk handling. Furthermore, the general replication proposed in other distributed operating systems such as LOCUS, along with the consistency-assurance problems it brings, vanishes in a diskless environment. Thus, a predominant use of the message facility is to support file access for the diskless workstations.
As in most distributed operating systems, a process is uniquely identified over the network, which allows an efficient mapping from process id to network address. Communication primitives are provided so that each process can communicate with the file server. Some of these primitives, such as ReceiveWithSegment and ReplyWithSegment, eliminate the need for extra messages and thus give better performance. The amount of data per transfer that gives the best performance was found to be one packet's worth: when larger chunks of data are transferred, more packets are needed, and more time is spent constructing them.
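To make the idea concrete, here is a minimal sketch of how a reply that carries a data segment saves a message. The names and packet size are illustrative assumptions, not the V kernel's actual API:

```python
# Illustrative sketch, NOT the V kernel's real interface: a reply that
# piggybacks a data segment lets a file read complete in one
# request/reply exchange instead of needing a separate data message.

PACKET_SIZE = 1024  # assumed maximum segment that fits in one packet

def reply_with_segment(reply_code, data, offset):
    """Build a reply carrying up to one packet's worth of file data."""
    segment = data[offset:offset + PACKET_SIZE]
    return {"code": reply_code, "segment": segment, "offset": offset}

def client_read(server_data, offset):
    # A single Send() whose reply already contains the bytes: no extra
    # round trip is needed to fetch the data.
    reply = reply_with_segment("OK", server_data, offset)
    return reply["segment"]

data = bytes(2048)              # 2048 bytes of fake file data
print(len(client_read(data, 0)))  # one packet's worth: 1024
```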
To address scalability concerns, alien processes are introduced to handle send requests received from other processes. A reply-pending state makes the originating process stop retransmissions and wait until the reply is received. In case an error occurs in the message, retransmission continues from the last correctly received data packet. This gives better performance, saves transmission-link bandwidth, and allows more workstations to be connected to the file server.
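The retransmission behavior described above can be sketched as follows. This is a toy simulation under assumed names, not the kernel's implementation:

```python
# Toy simulation (illustrative only): on a packet loss, the receiver
# reports how far it got, and retransmission resumes from the last
# correctly received packet instead of restarting the whole transfer.

def transfer(packets, drop_at=None):
    """Deliver packets in order; simulate a loss at index drop_at.
    Returns (packets received intact, index of next packet needed)."""
    received = []
    for i, p in enumerate(packets):
        if i == drop_at:
            return received, i
        received.append(p)
    return received, len(packets)

packets = [b"p0", b"p1", b"p2", b"p3"]
got, next_needed = transfer(packets, drop_at=2)
# Resume from packet 2, not from packet 0:
rest, _ = transfer(packets[next_needed:])
print(got + rest)  # the full sequence, with only packets 2-3 resent
```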
The performance of the V kernel is evaluated by comparing the cost of executing a process with its storage held locally versus remotely. When the needed storage resides on a different machine (namely the file server), a lower bound on the cost of communication is used. The network penalty depends on the speed of the processor, the network, and the network interface, and on the number of bytes transmitted.
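A rough way to think about the network penalty is a fixed per-packet overhead plus a per-byte transmission cost. The numbers below are made-up placeholders for illustration, not measurements from the paper:

```python
# Rough cost model of the "network penalty" (illustrative constants,
# NOT figures from the paper): per-packet processing overhead plus
# per-byte transmission time.

def network_penalty(n_bytes, packet_size=1024,
                    per_packet_us=500, per_byte_us=0.8):
    packets = -(-n_bytes // packet_size)   # ceiling division
    return packets * per_packet_us + n_bytes * per_byte_us

print(network_penalty(1024))   # one packet
print(network_penalty(4096))   # four packets: per-packet cost grows too
```

This makes visible why transfers larger than a packet pay more than a proportional price: each extra packet adds its own fixed construction and processing overhead.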
In general, the paper explained the reasons that led the designers to choose one implementation over another. I could not follow all of the performance discussion, mainly because I found it hard to see the results as good ones.

The Sprite Network Operating System

Sprite is a high-performance, transparent, distributed operating system developed to exploit advances in networks, memories, and multiprocessors. Sprite's main facilities that ease resource sharing are a transparent file system, shared address spaces, and process migration.
The file system in Sprite is presented as a single hierarchy, as in Unix. A domain represents a sub-tree of the hierarchy, and domains are distributed across different servers. A prefix table is used to locate a file, i.e., to determine which server stores it. Using caches reduces file access time and network traffic. As in any distributed system that uses caches, a consistency problem arises; an efficient solution was implemented by disabling client caching of a file under concurrent write-sharing. Still, the solution allows inconsistency windows of up to 30 seconds before the normal update of a file occurs.
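The prefix-table idea can be sketched as a longest-prefix match from path to server. The table contents and helper below are hypothetical, meant only to show the lookup mechanism:

```python
# Hypothetical sketch of a Sprite-style prefix table: the longest
# prefix of an absolute path that appears in the table names the
# server holding that domain. Entries are made up for illustration.

prefix_table = {
    "/":          "server-a",
    "/users":     "server-b",
    "/users/src": "server-c",
}

def lookup(path):
    """Return (server, path relative to the domain) for the longest
    matching prefix in the table."""
    best = max((p for p in prefix_table
                if path == p or path.startswith(p.rstrip("/") + "/")),
               key=len)
    rest = path[len(best):].lstrip("/")
    return prefix_table[best], rest

print(lookup("/users/src/main.c"))  # ('server-c', 'main.c')
print(lookup("/etc/passwd"))        # ('server-a', 'etc/passwd')
```

Because the client resolves only the prefix locally and ships the remainder of the path to the chosen server, domains can move between servers by updating the table rather than every client's notion of the whole tree.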
Process migration is used to relieve workstations that are under heavy load. The migration mechanism differs from those in other operating systems: in Sprite, the old machine pages out the dirty pages and transfers information about them to the new machine. Furthermore, each process is assigned a home node, so kernel calls whose results are machine-dependent can be forwarded to the home node for execution (which guarantees the same behavior whether or not the process has migrated).
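The home-node forwarding rule can be shown with a small sketch. The call names and dispatch logic are illustrative assumptions, not Sprite's actual kernel interface:

```python
# Illustrative only (not Sprite's real kernel code): calls whose
# results depend on the machine are forwarded to the process's home
# node, so a migrated process observes the same answers as before.

MACHINE_DEPENDENT = {"gethostname", "gettimeofday"}  # assumed set

def kernel_call(name, current_node, home_node):
    node = home_node if name in MACHINE_DEPENDENT else current_node
    return f"{name} executed on {node}"

print(kernel_call("gethostname", "nodeB", "nodeA"))  # forwarded home
print(kernel_call("read",        "nodeB", "nodeA"))  # runs locally
```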
The Sprite kernel supports multiprocessor and network operation using multithreading and remote procedure calls. Multithreading allows more than one process to execute in the kernel; it suits multiprocessor workstations and gives better performance despite its added complexity. The RPC facility consists of stubs and transport routines, with each remote call instantiating one of each on both sides, client and server. To speed up transfers and avoid the high overhead of bulk transfers, the RPC implementation fragments data at the sender and reassembles it at the receiver.
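Sender-side fragmentation and receiver-side reassembly can be sketched as below. The fragment size is an assumed constant, not the value Sprite uses:

```python
# Minimal sketch of RPC fragmentation/reassembly (assumed fragment
# size, not Sprite's actual implementation): a bulk payload is split
# into fragments for transmission and joined back at the receiver.

FRAG_SIZE = 1024  # assumed

def fragment(data):
    return [data[i:i + FRAG_SIZE] for i in range(0, len(data), FRAG_SIZE)]

def reassemble(frags):
    return b"".join(frags)

payload = b"x" * 3000
frags = fragment(payload)
print(len(frags))                       # 3 fragments
print(reassemble(frags) == payload)     # True
```

The point is that one RPC round trip covers the whole bulk payload; only the fragments travel individually, rather than each one paying the full cost of a separate remote call.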

The paper in general is great; however, I would like to comment on three different aspects that the authors raised:
1) A client reads pages from the server's cache faster than from a local disk (with the authors believing that network and CPU speeds will grow faster than disk speeds in the future): a very good observation, solving a system performance bottleneck with clever intuition.
2) The idea of process migration: since this approach is feasible only on a trustworthy network, with each processor trusting the others, I do not see the need to implement it except on a dedicated, isolated network whose users trust each other.
3) A user's ability to evict migrated processes from his workstation when he returns: why spend so much time choosing which workstation to migrate to, only to give the user the ability to kick you out? Timesharing should be permitted instead, with a reasonable quantum for each process.

Sobeeh Almukhaizim