Future Directions of KeLP

Here is a look at the future of KeLP, including a few directions that might inspire student projects. Let us know if you are interested.

    Data-dependent communication. In traditional KeLP applications, each process can infer its communication requirements by looking at a local table of meta-data. In some cases, however, the meta-data alone doesn't tell the full story: data residing off-processor are needed to determine communication requirements. This occurs, for example, in particle methods and in moving-boundary methods for adaptive mesh refinement. We are developing a plug-in for Abstract KeLP to treat such cases. If you are interested in this capability, let us know.
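
    The meta-data-driven case can be illustrated with a small sketch. This is not KeLP's API: `Box`, `intersect`, `grow`, and `incoming_messages` are hypothetical names, and the table is assumed to map each process rank to the single rectangular block it owns. Under those assumptions, a process can work out what it must receive purely from the table, without looking at any remote data.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Box:
    """A 2D index region with inclusive bounds (hypothetical, not a KeLP class)."""
    lo: Tuple[int, int]
    hi: Tuple[int, int]


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the overlap of two boxes, or None if they do not overlap."""
    lo = (max(a.lo[0], b.lo[0]), max(a.lo[1], b.lo[1]))
    hi = (min(a.hi[0], b.hi[0]), min(a.hi[1], b.hi[1]))
    if lo[0] > hi[0] or lo[1] > hi[1]:
        return None
    return Box(lo, hi)


def grow(box: Box, width: int) -> Box:
    """Pad a box by `width` cells on every side (its ghost-cell halo)."""
    return Box((box.lo[0] - width, box.lo[1] - width),
               (box.hi[0] + width, box.hi[1] + width))


def incoming_messages(table: Dict[int, Box], me: int,
                      ghost: int) -> List[Tuple[int, Box]]:
    """List (owner, region) pairs that rank `me` must receive to fill its halo."""
    padded = grow(table[me], ghost)
    messages = []
    for rank, owned in sorted(table.items()):
        if rank == me:
            continue
        overlap = intersect(padded, owned)
        if overlap is not None:
            messages.append((rank, overlap))
    return messages


# Assumed layout: two side-by-side 4x4 blocks with a halo one cell wide.
table = {0: Box((0, 0), (3, 3)), 1: Box((4, 0), (7, 3))}
plan = incoming_messages(table, me=0, ghost=1)
```

    In the data-dependent case no such table suffices, because the regions to exchange depend on values held by other processes, for example particle positions.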

    Multi-tier programming. Multi-tier computers are hierarchically constructed multicomputers, with multiple levels of parallelism and locality. An interesting variety is the dual-tier design, which employs symmetric multiprocessor (SMP) nodes. Such designs include SMP clusters, the various ASCI machines, and the NPACI Tflops machines. These machines are a challenge to program. A "flat" MPI model may not deliver adequate performance, and in some cases an effective implementation must rely on hybrid, mixed-mode programming that combines MPI with either threads or OpenMP. An additional challenge is to tolerate the high cost of communication, which is amplified by multiprocessing at the node. We are currently experimenting with a prototype of KeLP, called KeLP2, which supports two levels of parallelism and data motion. KeLP2 can run a proxy on a spare SMP processor to overlap communication with computation. (MPI implementations tend not to support true overlap using asynchronous, non-blocking communication.) For more information about this experimental version of KeLP, see http://www-cse.ucsd.edu/groups/hpcl/scg/Research/MT.html. KeLP2 was originally written by former student Stephen J. Fink (PhD '98) and has been ported to NPACI Blue Horizon and ASCI Blue Pacific. Current work with KeLP2 involves Gregory Balls (UCSD/SDSC) and Daniel Shalit.
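
    The idea of overlapping communication with computation through a proxy can be sketched roughly as below. This is only a structural illustration and not KeLP2 code: the single worker thread stands in for the proxy, `exchange_ghost_cells` is a hypothetical placeholder for a real data transfer, and the computation is assumed to be a three-point average over a 1D strip. Note also that CPython threads do not run compute-bound code in parallel, so the sketch shows the ordering of the steps rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def exchange_ghost_cells(left_edge, right_edge):
    """Hypothetical stand-in for a transfer of neighbor edge values."""
    return left_edge, right_edge


def step_with_overlap(values, left_edge, right_edge):
    """One three-point averaging step on a strip of at least two values."""
    with ThreadPoolExecutor(max_workers=1) as proxy:
        # Hand the communication to the proxy and continue immediately.
        pending = proxy.submit(exchange_ghost_cells, left_edge, right_edge)

        # Interior points need no ghost data, so they are updated while
        # the transfer is still in flight.
        interior = [(values[i - 1] + values[i] + values[i + 1]) / 3.0
                    for i in range(1, len(values) - 1)]

        # Block only when the ghost values are actually required.
        left, right = pending.result()

    # The two boundary points depend on the received ghost values.
    first = (left + values[0] + values[1]) / 3.0
    last = (values[-2] + values[-1] + right) / 3.0
    return [first] + interior + [last]


result = step_with_overlap([3.0, 6.0, 9.0], left_edge=0.0, right_edge=12.0)
```

    The design point is the split into interior work, which can proceed at once, and boundary work, which waits on the proxy.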

    Memory locality optimization. In KeLP2, we express parallelism within an SMP node using controlled operations over subspaces of the computational domain assigned to a node. This model may also be used to manage cache locality, e.g. through strip mining or tiling. We are currently examining techniques for extending the two-level KeLP2 model to express additional levels of latent parallelism that can be used to manage cache locality. This work involves Gregory Balls (UCSD/SDSC) and Paul Kelly at Imperial College, London, UK.
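
    As a reminder of what tiling looks like, here is a minimal sketch that sweeps a 2D domain one subspace (tile) at a time. The names `tiles` and `tiled_transpose` are hypothetical and are not part of KeLP2, and the transpose is just an assumed example of an operation with poor locality when done row by row. In Python the cache effect itself is not observable; the sketch only shows the loop structure.

```python
from typing import Iterator, List, Tuple


def tiles(rows: int, cols: int, tile_rows: int,
          tile_cols: int) -> Iterator[Tuple[range, range]]:
    """Yield (row_range, col_range) pairs that cover a rows x cols domain."""
    for r0 in range(0, rows, tile_rows):
        for c0 in range(0, cols, tile_cols):
            yield (range(r0, min(r0 + tile_rows, rows)),
                   range(c0, min(c0 + tile_cols, cols)))


def tiled_transpose(a: List[List[int]], tile: int = 2) -> List[List[int]]:
    """Transpose a rectangular matrix, visiting one tile at a time."""
    rows, cols = len(a), len(a[0])
    out = [[0] * rows for _ in range(cols)]
    for row_range, col_range in tiles(rows, cols, tile, tile):
        # All reads and writes in this inner nest stay within one tile.
        for i in row_range:
            for j in col_range:
                out[j][i] = a[i][j]
    return out


transposed = tiled_transpose([[1, 2, 3], [4, 5, 6]])
```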

    An MPI-less KeLP. Decoupling KeLP from MPI will enable certain optimizations on architectures supporting single-sided communication or shared memory. In addition, an MPI-less KeLP could be installed on single-processor workstations where MPI hasn't been installed. There is potential here for a student project.

    A checkpointer for KeLP. A checkpointer saves the state of a running program to stable storage, allowing the program to be restarted later in case of a system crash or other failure. Checkpointing is a difficult task in general, and becomes even more challenging on the multicomputers and clusters that KeLP applications usually run on. The availability of global knowledge about communication under KeLP enables the level of program introspection necessary to construct an efficient checkpointer for KeLP applications. The KeLP checkpointer uses XML as its data description language, which makes the checkpoint files usable by other programs, e.g. a visualization tool. The KeLP checkpointer is currently under development by M.S. student F. David Sacerdoti (UCSD/CSE) and should be available in a future release of KeLP1.x. This work involves Phil Papadopoulos' group at the San Diego Supercomputer Center, and Bill Gropp and Rusty Lusk at Argonne National Laboratory.
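
    To give a feel for why an XML description makes checkpoints readable by other tools, here is a small round-trip sketch. The element and attribute names (`checkpoint`, `block`, `owner`, `lo`, `hi`) are invented for illustration; the actual format of the KeLP checkpointer is not described here, and `checkpoint_to_xml` and `restore_from_xml` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET


def checkpoint_to_xml(step, blocks):
    """Serialize {owner: (lo, hi, values)} into an XML string (assumed schema)."""
    root = ET.Element("checkpoint", {"step": str(step)})
    for owner, (lo, hi, values) in sorted(blocks.items()):
        block = ET.SubElement(root, "block", {
            "owner": str(owner),
            "lo": " ".join(str(x) for x in lo),
            "hi": " ".join(str(x) for x in hi),
        })
        # repr() keeps enough digits for floats to round-trip exactly.
        block.text = " ".join(repr(v) for v in values)
    return ET.tostring(root, encoding="unicode")


def restore_from_xml(text):
    """Rebuild (step, blocks) from a string written by checkpoint_to_xml."""
    root = ET.fromstring(text)
    blocks = {}
    for block in root.findall("block"):
        lo = tuple(int(x) for x in block.get("lo").split())
        hi = tuple(int(x) for x in block.get("hi").split())
        values = [float(x) for x in (block.text or "").split()]
        blocks[int(block.get("owner"))] = (lo, hi, values)
    return int(root.get("step")), blocks


state = {
    0: ((0, 0), (1, 1), [1.0, 2.0, 3.0, 4.0]),
    1: ((2, 0), (3, 1), [0.5, 0.25, 0.125, 0.1]),
}
xml_text = checkpoint_to_xml(step=42, blocks=state)
restored_step, restored_blocks = restore_from_xml(xml_text)
```

    Because the block bounds and owners are plain attributes, a separate program such as a visualization tool could read them with any XML parser, without linking against the application.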

    Program coupling under Garlik. KeLP provides a simple model for managing communication among structured blocks of data. We are using KeLP's communication model to devise a new programming paradigm that enables separate applications to be coupled together in a multi-physics simulation. This new paradigm will be implemented as a library that defines an interface to the KeLP communication model, but without requiring the applications to be written in KeLP (they could be written in MPI).

    Heterogeneous data distribution. Computing environments often contain a mix of resources with varying capabilities. For example, clusters may employ nodes with different numbers of processors, and the nodes may have been purchased at different times and hence run at different speeds. We are investigating techniques that enable a KeLP application to treat a heterogeneous machine almost as if it were homogeneous. This work was done by graduate student Sean Peiser (MS, 2000).
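
    One simple way to think about distributing data on such a machine is to give each node a share of the work in proportion to its relative speed. The sketch below does this for the rows of a grid. It is an assumed illustration rather than the technique developed in this work, and `proportional_partition` and the speed figures are hypothetical.

```python
from typing import List, Sequence


def proportional_partition(n_rows: int, speeds: Sequence[float]) -> List[int]:
    """Split n_rows among nodes in proportion to their relative speeds."""
    total = sum(speeds)
    ideal = [n_rows * s / total for s in speeds]
    counts = [int(x) for x in ideal]  # round every share down first

    # Hand the rows lost to rounding to the nodes with the largest
    # fractional remainders, so the counts always sum to n_rows.
    leftover = n_rows - sum(counts)
    by_remainder = sorted(range(len(speeds)),
                          key=lambda i: ideal[i] - counts[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


# Assumed example: the third node is twice as fast as the other two.
shares = proportional_partition(100, [1.0, 1.0, 2.0])
```

    With equal speeds this reduces to the usual uniform block distribution, which is the sense in which the machine can then be treated almost as if it were homogeneous.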


Last updated by  Scott B. Baden on 02/23/01 10:25 PM