Jamison Collins (indecent_and_obscene@yahoo.com)
Thu, 18 May 2000 01:19:46 -0700 (PDT)

Active Messages

Existing message passing primitives fail to achieve
the necessary high performance. This is attributable
to two factors--either excessive cost due to buffer
management (in the case of asynchronous messages) or
possibly stalled processors (in the case of
synchronous messaging). The active message approach
avoids these pitfalls by requiring the embedding of
the address of a message handling routine in the
header of each message. In this way, we are able to
overlap computation and communication because the
receiving process does not have to perform a
synchronous receive, and buffering is eliminated
because the message is handled immediately upon
receipt via the handling routine.
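
To make the mechanism concrete, here is a minimal sketch in C of
a message whose header carries the address of its handler. The
names am_msg, am_handler, add_to_sum and am_deliver are my own
illustrations, not taken from the paper, and am_deliver only
stands in for the network arrival path within a single process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message layout: the header holds the handler's address. */
typedef struct am_msg am_msg;
typedef void (*am_handler)(const am_msg *msg, void *state);

struct am_msg {
    am_handler handler;  /* address of the handling routine */
    int32_t    payload;  /* small inline argument */
};

/* Example handler: folds the payload directly into the ongoing computation. */
static void add_to_sum(const am_msg *msg, void *state) {
    *(int64_t *)state += msg->payload;
}

/* Stand-in for message arrival: the handler runs immediately, so there is
 * no buffering of the message and no matching synchronous receive. */
static void am_deliver(const am_msg *msg, void *state) {
    assert(msg != NULL && msg->handler != NULL);
    msg->handler(msg, state);
}
```

Delivering a message is then just a jump through the header,
which is also where the safety concern comes from: the receiver
trusts whatever address the sender supplied.
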
This paper seeks to achieve two primary
goals--demonstrate that this approach is amenable to
existing hardware, and show that the performance gain
from using active messages is high. The authors also
introduce the Split-C programming model to take
advantage of the features of active messages in
programs. The performance of active messages is
evaluated through a series of microbenchmarks, which
show an order of magnitude reduction in message
passing overhead.
This was a very interesting paper which seems to have
achieved its goal dramatically. One problem, however,
that I see with this research is that it is very
unsafe, but adding any additional layers to the
communication primitives in order to achieve safety
would increase the message passing overhead.

Lightweight Remote Procedure Call
Despite the name, RPCs generally occur between
protection domains on the same machine, and with
simple parameters. However, the overhead to perform
this kind of RPC is very high. One possible solution
to this is to group unrelated OS subsystems into the
same protection domain in order to reduce the cost of
RPCs between them, but this reduces the safety of the
OS. The LRPC attempts to improve the performance of
this common case through a variety of optimizations. For
example, arguments and return values are transferred
without the kernel having to copy memory by having
both caller and callee make use of a shared argument
stack. Many other optimizations were also
The protection methods are also greatly changed from
the standard RPC case in order to optimize protection
checking while still maintaining a sufficient level of
The goal of the LRPC is to achieve the fastest
possible time for the common case of RPC call, while
still preserving all protections. In the end, quite a
large performance improvement occurred, with the cost
of an LRPC reduced by a factor of 3. This was
verified with microbenchmarks.
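
The shared argument stack mentioned above can be sketched
roughly as follows. This is only a single-process illustration in
C: the astack type and the functions server_add and lrpc_add are
hypothetical names of mine, and the direct function call stands
in for the kernel-mediated switch between protection domains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical argument stack, assumed to be mapped into both domains. */
typedef struct {
    int32_t args[2];
    int32_t result;
} astack;

/* Callee: reads the arguments in place and writes the result in place. */
static void server_add(astack *s) {
    s->result = s->args[0] + s->args[1];
}

/* Caller stub: writes the arguments once, transfers control, then reads
 * the result. No intermediate copy of the arguments is made. */
static int32_t lrpc_add(astack *s, int32_t a, int32_t b) {
    assert(s != NULL);
    s->args[0] = a;
    s->args[1] = b;
    server_add(s);  /* stands in for the domain switch */
    return s->result;
}
```
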
This paper makes a strong statement regarding
Amdahl's law--the common case must not be ignored.
Again, I liked this paper, but I like almost anything
dealing with performance :P.

