|
-
TOP-C distinguishes itself as a package for easily parallelizing
existing sequential applications. An application programmer can
frequently convert a sequential application to a parallel one in
less than a day, with no prior experience with TOP-C.
TOP-C is both small and self-contained, even including its own MPI
subset. Yet it also works well with your own debugger, your own MPI
implementation, etc.
-
The design goals of TOP-C are that
it should follow a simple, easy-to-use programmer's model, and that
it should have high latency tolerance. This allows for economical
parallel computing on commodity hardware, as well as high performance
on the latest supercomputers.
TOP-C hides the details of parallel programming and presents the
application programmer with a simple task-oriented interface, for which
the application programmer need only define four callback functions.
Yet this simple model has been shown to adapt readily to a wide
variety of algorithmic requirements.
-
TOP-C runs on most variants of UNIX/Linux.
The source code is layered, making it easy to modify and easy to
add a new communication module for a new hardware architecture.
Current communication modules include one for distributed
memory using sockets (e.g. networks of workstations), one
for shared memory using POSIX threads, and one for a single CPU (useful
for debugging). The same TOP-C application code runs with any of
the three communication modules.
-
TOP-C has three fundamental concepts:
the task (specified by the master and executed by a slave process),
the global shared data,
and the action chosen after a task is completed.
Communication between processes occurs only through
these three mechanisms. A TOP-C application is built around a
single library call, TOPC_master_slave().
It takes four arguments, each an application-defined callback
procedure:
|
GenerateTaskInput() -> input
DoTask(input) -> output
CheckTaskResult(input, output) -> action
UpdateSharedData(input, output)
    (executed only if the UPDATE action is returned)
|
Upon receiving the output of a task, the master decides on
one of four actions:
|
NO_ACTION
UPDATE (update the shared data)
REDO (in case the result of some other task has altered the shared data)
CONTINUATION(new_input_for_same_slave) |
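The four callbacks and the action returned by the master can be pictured with a single-process emulation, in the spirit of the single-CPU module mentioned above. The following is only a sketch and is not the TOP-C API: `callbacks_t`, `run_sequential`, the `ACT_*` constants and the sum-of-squares callbacks are all hypothetical names chosen for illustration, tasks are plain `int` values, and the CONTINUATION action is left out for brevity.

```c
#include <assert.h>

/* Hypothetical action codes; CONTINUATION is omitted in this sketch. */
typedef enum { ACT_NO_ACTION, ACT_UPDATE, ACT_REDO } action_t;

/* Hypothetical callback table mirroring the four roles in the text. */
typedef struct {
    int      (*generate_task_input)(int *input);   /* returns 0 when no tasks remain */
    int      (*do_task)(int input);
    action_t (*check_task_result)(int input, int output);
    void     (*update_shared_data)(int input, int output);
} callbacks_t;

/* Runs every task in one process and returns the number of task
   executions, counting re-executions caused by REDO. */
int run_sequential(const callbacks_t *cb)
{
    int input, output, executed = 0;
    action_t act;

    while (cb->generate_task_input(&input)) {
        do {
            output = cb->do_task(input);
            executed++;
            act = cb->check_task_result(input, output);
            if (act == ACT_UPDATE)
                cb->update_shared_data(input, output);
        } while (act == ACT_REDO);
    }
    return executed;
}

/* Example application (an assumption for illustration): sum the squares of 1..5. */
static int  next_input = 1;
static long shared_sum = 0;   /* stands in for the global shared data */

static int gen(int *input)
{
    if (next_input > 5)
        return 0;
    *input = next_input++;
    return 1;
}

static int square(int input) { return input * input; }

static action_t check(int input, int output)
{
    (void)input; (void)output;
    return ACT_UPDATE;        /* every result is folded into the shared data */
}

static void update(int input, int output)
{
    (void)input;
    shared_sum += output;
}

long sum_of_squares_demo(void)
{
    callbacks_t cb = { gen, square, check, update };
    int executed;

    next_input = 1;
    shared_sum = 0;
    executed = run_sequential(&cb);
    assert(executed == 5);    /* one execution per task, no REDO here */
    (void)executed;
    return shared_sum;
}
```

In this emulation the loop body plays both roles: `do_task` stands for the slave side, while `check_task_result` and `update_shared_data` stand for the master side. In a real parallel run these would execute in different processes or threads.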
A task re-executed due to a REDO action can run more quickly,
because it is executed in the same process as the original task.
Hence any information from the previous task computation and previous
update can be saved in a global variable and used to accelerate the
re-computation of the task under the new task input.
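One way to picture this reuse is a task function that remembers how far it got. The sketch below is not TOP-C code: it assumes, purely for illustration, that the shared data is an append-only array and that a task computes a sum over all of its entries. The names `append_shared`, `do_task`, `redo_demo` and the cache variables are hypothetical. When the same input is executed again after the shared data has grown, only the new entries are scanned.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SHARED 64

/* Assumed shape of the shared data: an append-only array. */
static int    shared[MAX_SHARED];
static size_t shared_len = 0;

/* Cache kept in file-scope variables so that it survives between
   executions of the task in the same process. */
static int    cache_valid  = 0;
static int    cached_input = 0;
static size_t cached_len   = 0;
static long   cached_sum   = 0;

/* Instrumentation only: counts how many shared entries were scanned. */
static size_t entries_scanned = 0;

void append_shared(int value)
{
    assert(shared_len < MAX_SHARED);
    shared[shared_len++] = value;
}

/* Computes input * (sum of all shared entries), resuming from the
   cached partial sum when the same input is executed again. */
long do_task(int input)
{
    size_t i, start = 0;
    long sum = 0;

    if (cache_valid && cached_input == input && cached_len <= shared_len) {
        start = cached_len;   /* skip the entries already accounted for */
        sum = cached_sum;
    }
    for (i = start; i < shared_len; i++) {
        sum += (long)input * shared[i];
        entries_scanned++;
    }
    cache_valid  = 1;
    cached_input = input;
    cached_len   = shared_len;
    cached_sum   = sum;
    return sum;
}

/* Simulates: first execution, an update from another task, then a REDO.
   Stores the number of entries scanned during the REDO. */
long redo_demo(size_t *scanned_on_redo)
{
    size_t before;
    long result;

    shared_len = 0;
    cache_valid = 0;
    entries_scanned = 0;

    append_shared(1);
    append_shared(2);
    append_shared(3);
    (void)do_task(10);        /* first execution scans three entries */

    append_shared(4);         /* another task's update changes the shared data */
    before = entries_scanned;
    result = do_task(10);     /* re-execution scans only the new entry */
    *scanned_on_redo = entries_scanned - before;
    return result;
}
```

The cache is only safe here because the assumed shared data never changes existing entries. An application whose updates can modify earlier entries would have to invalidate the cache instead.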
This simple parallel model
turns out to be surprisingly adaptable for parallelizing existing
sequential software,
as demonstrated by the parallelization of Geant4 at CERN, a
1,000,000-line C++ program for the simulation of particle-matter
interaction, with applications to physics, engineering and
biomedicine. A version of TOP-C (called ParGAP) has also
been used to parallelize GAP
(Groups, Algorithms and Programming), and is distributed from
the GAP site.
|