Paper Evaluations: 05-17-2000

Flavio P JUNQUEIRA (flavio@cs.ucsd.edu)
Wed, 17 May 2000 22:08:34 -0700 (PDT)

Title: "Lightweight Remote Procedure Calls"

This paper proposes a design for remote procedure calls optimized for
communication between domains on the same machine. The optimizations
are based on measurements of existing systems, which show that most
calls are to the same node. Moreover, actual parameters are often
small and of fixed size, indicating that simple byte copying is
sufficient in most cases. The performance of cross-domain RPC in some
systems is poor due to the following overheads: stub overhead, message
buffer overhead, access validation in the call and return, message
transfer, thread scheduling, context switches, and thread dispatch in
the server domain.

LRPC improves the performance of calls either by eliminating or by
reducing each of those overheads. Before calling procedures exported
by a server, the client binds to the server's interface by way of a
kernel trap. The kernel is responsible for allocating argument stacks
(A-stacks) that are mapped in both the client and server domains.
A-stacks are used by client stubs to pass arguments and also to
receive results. Since the A-stacks are also mapped in the server
domain, the arguments are copied only once, onto the A-stack.
In the server domain, it is
not necessary to dispatch a new thread, because the client thread is
dispatched to execute in the server domain. The virtual memory
registers must, however, be reloaded with those of the server domain.
With this approach, the number of context switches decreases, and it
is necessary neither to schedule other threads nor to dispatch a new
thread in the server domain. For access validation, the kernel
provides a binding object, which the client presents on every call to
an interface procedure. To return the results, there is no need to
verify the returning thread's right to transfer back to the calling
domain, since that right was granted at call time.
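
As a rough illustration of the call path described above, here is a toy
Python model. It is only a sketch under loud assumptions: the names
Kernel, bind, call, client_add and server_add are hypothetical and do
not come from the paper, a bytearray stands in for the shared A-stack,
and a plain function call stands in for the domain transfer (no real
protection domains or virtual memory registers are involved).

```python
import secrets
import struct


class Kernel:
    """Toy stand-in for the kernel; every name here is hypothetical."""

    def __init__(self):
        self.bindings = {}  # binding object -> (procedure table, A-stack)

    def bind(self, procedures, astack_size=64):
        # At bind time the kernel allocates an A-stack visible to both
        # domains and hands the client a hard-to-guess binding object.
        binding = secrets.token_hex(8)
        astack = bytearray(astack_size)
        self.bindings[binding] = (procedures, astack)
        return binding, astack

    def call(self, binding, proc_name):
        # Access validation happens on the call path only.
        if binding not in self.bindings:
            raise PermissionError("invalid binding object")
        procedures, astack = self.bindings[binding]
        # The caller's own thread runs the server procedure: no new
        # thread is dispatched and no message buffer is copied.
        procedures[proc_name](astack)
        # Return path: no second validation is performed.


def server_add(astack):
    # Server stub: read the arguments in place, write the result in place.
    a, b = struct.unpack_from("<ii", astack, 0)
    struct.pack_into("<i", astack, 0, a + b)


def client_add(kernel, binding, astack, a, b):
    # Client stub: the single copy of the arguments, onto the A-stack.
    struct.pack_into("<ii", astack, 0, a, b)
    kernel.call(binding, "add")
    (result,) = struct.unpack_from("<i", astack, 0)
    return result


kernel = Kernel()
binding, astack = kernel.bind({"add": server_add})
print(client_add(kernel, binding, astack, 2, 3))  # prints 5
```

The point of the sketch is that the arguments are written once by the
client stub and read in place by the server, and that the binding
object is checked when the call is made, not when it returns.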

In my opinion, this research work is very good. The authors present a
motivation using experiments, a design for LRPCs based on the previous
results, and then a performance evaluation. An important feature of a
system using the LRPC mechanism is that it has the same communication
abstraction for cross-domain and cross-machine procedure calls, while
still providing optimizations for the local case. It would be
interesting to observe the impact of A-stack allocation on the
system, since it is not clear how much memory would be used for this
purpose.

----------------------------------------------------------------------------

Title: "Active Messages: a Mechanism for Integrated Communication and
Computation"

Active messages provide a mechanism for efficient communication in
large-scale multiprocessors. Every active message carries the address
of a user-level handler, which is executed at the destination in order
to extract the message from the network interface and insert the
received data into the ongoing computation. When a message arrives,
the network interface generates an interrupt that invokes the
appropriate handler.
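
The dispatch described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the names
HANDLERS, send, on_arrival and accumulate are hypothetical, a table
index stands in for the handler address carried in the message, a
deque stands in for the network interface, and an explicit call to
on_arrival stands in for the receive interrupt.

```python
from collections import deque

# Hypothetical per-node state: the "ongoing computation" is a running sum.
state = {"total": 0, "received": 0}


def accumulate(payload):
    # Handler: fold the data into the computation and return at once.
    state["total"] += payload
    state["received"] += 1


# Table standing in for handler addresses in the node's code.
HANDLERS = {0: accumulate}

# Stand-in for the network interface on the receiving node.
network = deque()


def send(handler_id, payload):
    # The sender names the handler in the message header.
    network.append((handler_id, payload))


def on_arrival():
    # Stand-in for the receive interrupt: pull each message off the
    # interface and run the handler it names.
    while network:
        handler_id, payload = network.popleft()
        HANDLERS[handler_id](payload)


for value in (1, 2, 3):
    send(0, value)
on_arrival()
print(state)  # prints {'total': 6, 'received': 3}
```

Note that the receiver never issues a matching receive call; the
message itself selects the code that consumes it.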

The basic idea of the mechanism is to permit overlapping of
communication and computation. Hence, messages are received
asynchronously with respect to the computation in the processor.
Furthermore, active messages require neither buffering nor the
blocking send/receive operations implemented in some systems. Because
the handlers interact directly with the network interface, the
overhead imposed by the operating system is avoided. Handlers must
execute quickly, only inserting the received data into the ongoing
computation and not performing arbitrary computation.

The efficient communication mechanism provided by active messages is
important for parallel applications. It improves the performance of
applications requiring intensive communication between several tasks
executing in different nodes of a multiprocessor. This mechanism,
however, is not intended for arbitrary computer communication. It can
also be used in local area networks to build clusters, but it was
not designed for communication in wide-area networks.

--Flavio Junqueira

------------------------------------------------------------
Flavio Paiva Junqueira, PhD Student
Department of Computer Science and Engineering
University of California, San Diego

Office location: AP&M 4349 e-mail: flavio@cs.ucsd.edu
------------------------------------------------------------