Performance Portability of an Intermediate-Complexity
Atmospheric Research Model in Coarray Fortran
Author/Presenters
Event Type
Workshop
Applications
Effective Application of HPC
Parallel Programming Languages, Libraries, Models and Notations
Performance
Programming Systems
SIGHPC Workshop
Scientific Computing
Time
Monday, November 13th, 12pm - 12:30pm
Location
702
Description
We present results on the scalability and performance
of an open-source, Coarray Fortran (CAF)
mini-application (mini-app) that implements several
parallel numerical algorithms known to dominate the
execution of the Intermediate Complexity Atmospheric
Research (ICAR) model developed at the National Center
for Atmospheric Research (NCAR). The solver employs
standard Fortran 2008 features and includes several
Fortran 2008 implementations of the collective
subroutines that are defined in the Committee Draft of the
upcoming Fortran 2015 standard. The
ability of CAF to run atop various communication layers and the increasing compiler support for CAF facilitated initial evaluations of several compiler/runtime/hardware combinations. Results are presented for the GNU, Intel, and Cray compilers, each of which offers different parallel runtime libraries employing one or more communication layers, including MPI, OpenSHMEM, and proprietary alternatives. We studied the performance on both multi- and many-core processors running on distributed-memory systems. The results of our initial investigations suggest promising scaling behavior across a range of hardware, compiler, and runtime choices on platforms ranging up to 100,000 cores.




