Planguages
The Planguages: an Approach to Parallel Technical Computing
http://www.hpc.uh.edu/planguages/
Group of L. Ridgway Scott
A. Overview
The Planguages are parallel programming languages which extend Fortran and C for scientific and engineering applications. The aim is to
- accelerate parallel software development,
- improve the readability of parallel programs, and
- achieve performance with independence from platform- and network-specific details.
The model consists of explicit parallelism with distributed memory.
The Planguages provide concise mathematical expressions for accessing
off-process data in explicitly parallel codes. The translators for Pfortran
and PC compile the notation into Fortran77 and C along with calls to the
target data movement system, such as MPI and PVM. In this way code is
portable and many low-level, error-prone details, such as managing
message-passing tags, are automatically handled by the translators.
Porting Planguage codes often amounts to little more than a recompilation
with the Pfortran and PC translators, followed by a compilation with the
native Fortran and C compilers. Legacy codes pose little difficulty for
the Planguages because only low-impact modifications are required.
Pfortran and PC are supersets of Fortran77 and C whose additional
semantics have been carefully chosen, making for a slim but expressive
set of functionality. The additional operators are @ and {}. For example,
a=b@q assigns b at process q to a at each local process. The global
reduction a=+{b} sums b across all processes, assigning the result to
a at all processes, whereas a=MIN{b} assigns the minimum value of b
over all processes to a, and a=F{b} performs the reduction with a
user-defined function or subroutine F. In this way, conglomerates of send
and receive subroutine calls, taking MPI as an example, can be concisely
expressed as a statement that is closer to the logical intent of the program.
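The meaning of these operators can be sketched in plain C by letting
slot p of an array stand for the copy of a variable held by process p.
This is only a sequential illustration under that assumption, not the
code the translators generate; the names pl_fetch, pl_reduce, op_sum and
op_min are hypothetical.

```c
#include <assert.h>

/* Hypothetical combining operations for the {} reduction. */
double op_sum(double x, double y) { return x + y; }
double op_min(double x, double y) { return x < y ? x : y; }

/* a = b@q : every process receives the value of b held by process q.
   Slot p of each array stands for process p's local copy. */
void pl_fetch(double a[], const double b[], int nproc, int q)
{
    assert(nproc > 0 && q >= 0 && q < nproc);
    for (int p = 0; p < nproc; ++p)
        a[p] = b[q];
}

/* a = F{b} : combine b over all processes with F and give every
   process the result. a = +{b} and a = MIN{b} are the cases
   F = op_sum and F = op_min. */
void pl_reduce(double a[], const double b[], int nproc,
               double (*f)(double, double))
{
    assert(nproc > 0);
    double acc = b[0];
    for (int p = 1; p < nproc; ++p)
        acc = f(acc, b[p]);
    for (int p = 0; p < nproc; ++p)
        a[p] = acc;
}
```

With b = {3.0, 1.0, 4.0, 1.5} on four processes, pl_fetch with q = 2
leaves 4.0 in every slot of a, pl_reduce with op_sum leaves 9.5, and
pl_reduce with op_min leaves 1.0. On an MPI target, these effects
correspond to what MPI_Bcast (with root q) and MPI_Allreduce (with
MPI_SUM or MPI_MIN) provide; whether the translators emit those
particular calls is not stated here.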
Several computational chemistry programs parallelized with Pfortran
are now in production use on massively parallel computers. These
are described in section C.
B. Components
The translators for Pfortran and PC, pfc and pcc, have undergone a
period of hardening during 1998. The base language definition described
in the Pfortran Reference Manual (available from the Planguage web site)
is fully implemented; the translators are robust and have been aggressively
tested.
B.1. The Language and Translators
The Planguages take a minimal approach to extending the sequential
languages they are based on. This has resulted in a well-thought-out
set of primitives. The project continues in this vein, weighing
new syntax and functionality carefully. Currently, all language
features are supported in the translators.
We are currently considering language extensions in the same parsimonious
spirit. We have encountered applications in image processing and in
computational chemistry where a mixed memory model is useful, that is,
a model supporting both non-replicated shared memory and distributed memory.
(The current Planguage model is distributed memory.) At the moment we
are considering syntactic avenues for expressing shared-memory data
structures, while exploring the associated implementation issues.
Some Planguage features can benefit from data- and
control-flow analysis of the program. For example, the Planguages
permit the use of the locally defined variable myProc, the logical
process identifier, as source and destination in off-process data
access statements. At present, the translators require that the
literal text "myProc" appear explicitly on the @-sign line. For example,
a@myProc = b@f(myProc)
is supported, whereas the following is not:
p = myProc
a@p = b@f(myProc)
More generally, variables that are not uniformly defined across processes
cannot be used in control flow that governs interprocess data movement
statements (uniform variables necessarily have the same value at all
processes).
The condition described above rarely limits the expression of
algorithms; however, erroneous programs can arise when programmers
simply overlook the uniformity requirement. On the other hand, for
algorithms with a task-parallel flavor, the ability to conduct sound
communication among processes executing in portions of code not executed
by all is sometimes useful (a scenario where non-uniformity can arise).
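The distinction between uniform and non-uniform variables can be
illustrated with a sequential C sketch in which slot p of an array stands
for process p's copy of a variable. The helper names (ring_left, pl_shift,
is_uniform) and the ring-shaped choice of f are hypothetical, and reading
a@myProc = b@f(myProc) as "process p obtains b from process f(p)" is an
assumption made for the illustration.

```c
#include <assert.h>

/* Hypothetical source map f: the left neighbour on a ring. */
int ring_left(int p, int nproc) { return (p + nproc - 1) % nproc; }

/* a@myProc = b@f(myProc), read here as: process p obtains b from f(p).
   The arrays a and b must not overlap. */
void pl_shift(int a[], const int b[], int nproc, int (*f)(int, int))
{
    for (int p = 0; p < nproc; ++p) {
        int src = f(p, nproc);
        assert(src >= 0 && src < nproc);
        a[p] = b[src];
    }
}

/* Returns 1 if all processes hold the same value of v, else 0. */
int is_uniform(const int v[], int nproc)
{
    for (int p = 1; p < nproc; ++p)
        if (v[p] != v[0])
            return 0;
    return 1;
}
```

On four processes myProc takes the values 0, 1, 2, 3 and so is not
uniform, whereas a problem size replicated on every process is. A branch
on myProc placed around a data-movement statement is the kind of control
flow the uniformity requirement rules out, since not every process would
reach the statement.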
Code generation optimizations and more liberal placement of data-movement
expressions can be accomplished with data- and control-flow analysis. We
consider this the next major development phase of the translators and
regard it as a stage 5 (technology transfer) milestone (see section F).
In summary, the Planguages consist of a functional set of features which
has been found quite adequate for expressing the parallelism in a number of
production parallel codes. These have been implemented by the present
translators with the caveats above regarding variable uniformity. The
project is investigating additional language features which are likely
to be useful in parallel code development: non-replicated shared-memory
data structures are one area requiring language support; another is input
and output. We are actively researching both of these.
B.2. The Runtime Libraries and Data Movement Algorithms
The Planguage translators process source code that specifies off-process
data accesses and generate the corresponding data-movement algorithms.
The generated code can range from the full data-exchange algorithm
emitted inline to a single call to a library function. To a large extent
a data-movement algorithm can be developed as a kernel that performs an
exchange, possibly together with a combine step, as in a reduction
algorithm. The algorithms currently generated are robust and general.
In cases where the user would like to implement specialized exchange
algorithms, the Planguage API to the runtime library can be used, but
this is generally discouraged.
Thus, where the Planguage primitives are not adequate to express the
algorithm (unusual, but possible) it is straightforward to escape from
the Planguage notation, but with a potential portability penalty.
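An exchange-plus-combine kernel of the kind described above can be
sketched as a recursive-doubling (butterfly) reduction. The text does not
say which algorithms the translators generate, so this is a standard
technique chosen for illustration, simulated sequentially in C with slot p
standing for process p. The names butterfly_allreduce, combine_fn,
combine_sum and MAX_PROC are hypothetical, and the sketch assumes a
power-of-two process count and an associative, commutative combine
function.

```c
#include <assert.h>

#define MAX_PROC 64   /* hypothetical upper bound for this sketch */

typedef double (*combine_fn)(double, double);

double combine_sum(double x, double y) { return x + y; }

/* Recursive-doubling all-reduce over val[0..nproc-1].
   In round d = 1, 2, 4, ... process p exchanges its partial result
   with partner p XOR d, then combines the two values. After log2(nproc)
   rounds every slot holds the combination of all original values. */
void butterfly_allreduce(double val[], int nproc, combine_fn f)
{
    double recv[MAX_PROC];

    assert(nproc > 0 && nproc <= MAX_PROC);
    assert((nproc & (nproc - 1)) == 0);   /* power of two only */

    for (int d = 1; d < nproc; d <<= 1) {
        /* Exchange step: each process receives its partner's value. */
        for (int p = 0; p < nproc; ++p)
            recv[p] = val[p ^ d];
        /* Combine step: fold the received value into the local one. */
        for (int p = 0; p < nproc; ++p)
            val[p] = f(val[p], recv[p]);
    }
}
```

Starting from the values 1 through 8 on eight processes, three rounds
leave 36 in every slot. Separating the exchange from the combine mirrors
the kernel structure described in the text: a different combine function
changes the reduction without touching the data-movement pattern.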
To take the runtime library into the next stage (see section F), we
are in the process of increasing the breadth of algorithms generated
by the Planguages to perform collective data movement and combines.
C. Selected Applications Developed with Pfortran and PC
C.1. Quantum Classical Molecular Dynamics
Chemical reactions involving bond formation and breaking
are outside the purview of classical molecular dynamics simulations.
Yet, to model all atoms quantum mechanically in large, biochemical
systems is computationally prohibitive. The Quantum Classical
Molecular Dynamics Code addresses this problem by treating a part of the
modeled system quantum mechanically and the rest using classical molecular
dynamics. The principles behind the QCMD code and an overview of the
parallelization strategy using Pfortran are discussed in the references
cited below.
REFERENCES
P. Bala, P. Grochowski, B. Lesyng, and J. A. McCammon, "Quantum-Classical
Molecular Dynamics and Its Computer Implementation," Computers & Chemistry,
1995.
P. Bala, T. Clark, P. Grochowski, B. Lesyng, K. Nowinski and J. A. McCammon,
"Advanced simulations and visualization of enzymatic reactions using a
combined Quantum-Classical Molecular Dynamics code," Proceedings of
the Applied Parallel Computing, 4th International Workshop, PARA'98, 1998,
in Recent Advances in Parallel Virtual Machine and Message Passing Interface,
Lecture Notes in Computer Science, volume 1541, edited by
B. Kaagstrom, J. Dongarra, E. Elmroth and J. Waniewski, pages 409-416,
Springer-Verlag Berlin.
C.2. Molecular Dynamics
EulerGROMOS, a spatial-decomposition version of the molecular dynamics
program GROMOS, was developed using Pfortran. EulerGROMOS was released in
Spring 1994. Since then, the program has been used in the simulation of the
acetylcholinesterase dimer in water with approximately 130,000 atoms,
which at the time represented the largest ever molecular dynamics simulation
of a biological system. The computational chemistry work and computer
science aspects of the project are summarized in the following references.
REFERENCES
Terry Clark, Reinhard v. Hanxleden, J. Andrew McCammon and L. Ridgway Scott,
"Parallelization using decomposition for Molecular Dynamics,"
Proceedings of the Scalable High-Performance Computing Conference,
Knoxville, Tennessee, May, 1994, pages 95-102, published by the
IEEE Computer Society Press.
Stanislaw Wlodek, Terry Clark, L. Ridgway Scott, and J. Andrew McCammon,
"Molecular Dynamics of Acetylcholinesterase Dimer Complexed with Tacrine,"
Journal of the American Chemical Society, volume 119, pages 9513-9522,
1997.
C.3. Brownian Dynamics
Reaction kinetics of diffusing substrates with enzymes can be
modeled with a combination of Brownian dynamics and electrostatics.
The electrostatics calculation typically solves Poisson's equation for
the charge distribution of the molecular assembly, with the solvent
(usually water) modeled implicitly. The University of Houston
Brownian Dynamics program (UHBD) takes this approach. UHBD
was developed by the J. Andrew McCammon group. This program was
subsequently parallelized for distributed memory computers using Pfortran,
and for the Kendall Square Research KSR1 using compiler directives.
REFERENCES
B. Bagheri, A. Ilin and L. R. Scott, "Parallelizing UHBD for the iPSC-860,"
Proceedings of the Intel Supercomputer Users' Group 1993 Annual Users'
Conference, pages 295-299, St. Louis, MO, October, 1993.
B. Bagheri, A. Ilin and L. R. Scott, "A Comparison of Shared and Distributed
Memory Scalable Parallel Processors: 1. KSR Shared Memory," in the
Proceedings of the Scalable High-Performance Computing Conference,
pages 9-16, May, 1994. Knoxville, Tennessee, published by
IEEE Computer Society Press.
D. Sites where Planguage Translators are Installed
The Planguage translators are installed at
- Copernicus University, Torun, Poland
- Department of Computer Science, University of Chicago
- High Performance Computing Center, University of Houston
- SDSC, University of California, San Diego
- Wright Patterson Air Force Base, Material Science Laboratory, Dayton, Ohio
E. Documentation
Several documents are available for Pfortran and PC:
- The PC Reference Manual (14 pages)
- The Pfortran Reference Manual (94 pages)
- The Pfortran Users Guide (23 pages)
These can be found on the web site at http://www.hpc.uh.edu/planguages/ and with the Planguage distribution.
In addition, Scott, Clark and Bagheri are completing a book on
parallel computing which uses the Planguage model for algorithm
development. The book is going to publishers in 1999.
F. Proposed Production Stages
Given the NPACI stage definitions, it appears the Planguage
project is in stage 2, and moving into stage 3.
Stage 1: See installation sites and applications above.
Stage 2: Early Deployment
- This is the currently established stage.
Stage 3: Pre-production
- Milestones indicative of this stage:
- a. Feedback from NPACI installation and users.
- b. Further tune communication algorithms for NPACI platforms
(section B.2).
Stage 4: Production
- Milestone indicative of this stage: further distribution.
Stage 5: Technology Transfer
- Milestones for reaching this stage consist of the following
functionality:
- a. Translators with data- and control-flow analysis.
- b. Local & global variable analysis, or uniformity (section B.1).
- c. Incorporate non-replicated shared-memory (section B.1).
- d. Basic I/O support.
- e. Subgroup support (section B.1).
Last modified: Wed Jan 20 16:37:48 1999