Ron Brightwell, Sandia
Coordinating PI
Thomas Sterling, IU
Chief Scientist
Kelsey Shephard, IU
Project Manager
XPRESS
Software
APEX
- We have developed a software prototype of APEX with
initial versions of OpenX. It is engineered to work with both HPX-3
and HPX-5, and integrates RCR Toolkit and TAU. Both RCR Toolkit and
TAU were updated to work with APEX and support certain capabilities
that APEX required. APEX has been ported to several platforms,
including the Edison system at NERSC. APEX is included in the XPRESS
software releases. Ongoing work with APEX will interface with XPI,
enhance LXK introspection, and implement more sophisticated policies,
especially for multi-objective control. APEX is currently being
evaluated with XPRESS project applications.
HPX-3
- HPX-3 is a general purpose C++ runtime system for
parallel and distributed applications of any scale. The most recent
version can be downloaded here.
HPX-5
- The HPX-5 runtime system directly reflects the semantic
constructs of the underlying ParalleX execution model, comprising
an active global address space, lightweight user thread
management, parcels and message-driven computation, distributed
processes, local control objects, and continuations. HPX-5 has
focused on understanding low-level details of efficient and scalable
exascale runtime implementation, including interfaces and
interactions with operating systems and with hardware
architecture. The initial implementation of HPX-5 relied on a
two-sided, parcel-based model of communication. However, a fast,
low-latency, one-sided communication mode is the current
default. HPX-5 supports dynamic, distributed processes with
termination detection and a flat, byte-addressable, global address
space using native RDMA transport. Our implementation also supports
a software-based active global address space (AGAS) that allows
explicit movement of blocks within the global address space.
HPX-5 can be downloaded here.
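The idea of explicitly moving blocks within a global address space can be illustrated with a small toy model. The sketch below is not HPX-5 code and uses none of its API: ToyAgas, GlobalBlockId, and LocalityId are hypothetical names invented for this illustration, and the "localities" are simply in-process tables rather than networked nodes. It only shows the core mechanism assumed here, namely that a global block id stays stable while an ownership table records where the block currently lives.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical names for illustration only; none of them come from HPX-5.
using GlobalBlockId = std::uint64_t;
using LocalityId = std::size_t;

// Toy software-managed global address space: every block has a stable
// global id, and an ownership table maps that id to its current locality.
class ToyAgas {
public:
    explicit ToyAgas(std::size_t localities) : stores_(localities) {}

    // Allocate a zero-filled block on locality `home`; return its global id.
    GlobalBlockId alloc(LocalityId home, std::size_t bytes) {
        check(home);
        const GlobalBlockId id = next_id_++;
        stores_[home].emplace(id, std::vector<std::uint8_t>(bytes, 0));
        owner_[id] = home;
        return id;
    }

    // Resolve a global id to the locality that currently holds the block.
    // Throws std::out_of_range for an unknown id.
    LocalityId owner(GlobalBlockId id) const { return owner_.at(id); }

    // Byte-level access through the global address (block id, offset).
    void put(GlobalBlockId id, std::size_t offset, std::uint8_t value) {
        stores_[owner(id)].at(id).at(offset) = value;
    }
    std::uint8_t get(GlobalBlockId id, std::size_t offset) const {
        return stores_[owner(id)].at(id).at(offset);
    }

    // Explicitly move a block to `dest`. The global id does not change,
    // so existing references to the block remain valid after the move.
    void move(GlobalBlockId id, LocalityId dest) {
        check(dest);
        const LocalityId src = owner(id);
        if (src == dest) return;
        auto it = stores_[src].find(id);
        stores_[dest].emplace(id, std::move(it->second));
        stores_[src].erase(it);
        owner_[id] = dest;
    }

private:
    void check(LocalityId loc) const {
        if (loc >= stores_.size()) throw std::out_of_range("unknown locality");
    }

    // One block table per locality, plus the global ownership table.
    std::vector<std::unordered_map<GlobalBlockId, std::vector<std::uint8_t>>> stores_;
    std::unordered_map<GlobalBlockId, LocalityId> owner_;
    GlobalBlockId next_id_ = 1;
};
```

In this model, a caller could allocate a block on locality 0, write a byte with put, call move to relocate the block to locality 1, and then read the same byte back with get using the unchanged global id. A real runtime would additionally have to handle concurrent access and network transfer, which this sketch deliberately leaves out.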
Kitten Lightweight Kernel
- The Kitten lightweight kernel [1] is the
basis for the development of the Lightweight eXascale Kernel (LXK) in
the XPRESS project. Kitten continues to be actively developed as part
of the XPRESS and Hobbes projects. As a research prototype, no formal
releases of Kitten are made. The latest Kitten source can be
obtained from GitHub here.
Open MPI
- The University of Houston research team members are long-time
contributors to the Open MPI software project. For the XPRESS project,
they have added support to use HPX as a runtime layer for Open MPI. It
currently uses the HPX runtime environment for startup and
communication through the TCP/IP transport module, although some
details with respect to the integration of memory allocators and the
internal utilization of threads in Open MPI still have to be worked
out. Support for other network transports, specifically InfiniBand and
shared memory, is currently under active development. Open MPI is
also able to use the native SLURM startup through HPX. Open MPI can
be downloaded here.
OMPTX
- The University of Houston team developed OMPTX, an
OpenMP runtime based on HPX which shares the same ABI as the newly
open-sourced Intel OpenMP runtime. This runtime has been adopted in
existing compilers from Intel, GNU, PathScale, and Clang, and support
within OpenUH is underway. The ability to use OMPTX as a drop-in
replacement for the Intel OpenMP runtime facilitates experimentation
in running OpenMP programs with HPX. The Intel OpenMP Runtime
interface and a number of the compilers which use it support OpenMP
4.0, allowing us to experiment with implementation of newer OpenMP
features within the proposed software infrastructure.
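As an illustration of the kind of program this makes possible, the sketch below is an ordinary OpenMP reduction written in C++. It contains nothing specific to OMPTX or HPX; the function name dot is our own, and we assume only that a standard OpenMP program like this one is a candidate for running on a drop-in replacement runtime. How the replacement runtime is selected at link or load time is not described here and is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dot product of two vectors using a standard OpenMP parallel reduction.
// Assumes a and b have the same length. When OpenMP is not enabled, the
// compiler ignores the pragma and the loop runs serially.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    double sum = 0.0;
    const long n = static_cast<long>(a.size());
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        sum += a[static_cast<std::size_t>(i)] * b[static_cast<std::size_t>(i)];
    }
    return sum;
}
```

Because the source is plain OpenMP, the same file can be built against any compatible runtime without modification, which is what makes side-by-side experiments practical.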
Profugus
- The Profugus mini-application (mini-app) [2] is a set of
computational kernels extracted from the Exnihilo particle transport
code package at Oak Ridge National Laboratory (ORNL) [3]. Exnihilo
provides advanced transport solvers and associated front-ends that are
applicable to a wide range of nuclear technology applications
including reactor neutronics, shielding, criticality safety,
dosimetry, fusion component analysis, and others. The resulting
problem-specific complexity in the various front-ends, the dependence
on a wide range of third-party libraries (TPLs), and, most importantly,
export control make Exnihilo a difficult platform to share among
computer scientists and hardware engineers. To overcome these
limitations and to provide a means to collaborate across fields, we
have developed the Profugus mini-app. The goal of Profugus is to
provide open-source kernels that effectively capture the algorithmic
features of the full Exnihilo applications. In this manner, more
effective connections can be established between application
developers and XPRESS researchers.
TAUdb
- We use the TAUdb performance database to store
performance results from experiments. However, the databases are
primarily for internal use. TAUdb is part of the TAU project,
which can be downloaded here.