Implicit Low-Order Unstructured Finite-Element Multiple
Simulation Enhanced by Dense Computation Using OpenACC
Event Type: Workshop
Accelerators
Applications
Compiler Analysis and Optimization
Compilers
Parallel Programming Languages, Libraries, Models and Notations
Runtime Systems
Time: Monday, November 13th, 11:30am - 12pm
Location: 712
Description: In this paper, we develop a low-order three-dimensional
finite-element solver for fast multiple-case crust
deformation analysis on GPU-based systems. Based on a
high-performance solver designed for massively parallel
CPU-based systems, we modify the algorithm to reduce
random data access, and then insert OpenACC directives.
The developed solver on ten Reedbush-H nodes (20 P100
GPUs) attained a speedup of 14.2 times over 20 K computer
nodes, which is high considering the peak memory
bandwidth ratio of 11.4 between the two systems. On the
newest Volta-generation V100 GPUs, the solver attained a
further 2.45 times speedup over the P100 GPUs. As a
demonstrative example, we computed 368 cases of crustal
deformation analyses of northeast Japan with 400 million
degrees of freedom. Algorithm modification and porting
together took only two weeks, so a large performance
improvement was achieved at low development cost. With the developed
solver, we expect many-case analyses on a wide range of
GPU-based systems to improve the reliability of
crust-deformation analyses.
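
The idea of reducing random data access by solving several cases at once can be illustrated with a small sketch. It is not the authors' solver: it assumes a matrix stored in CSR form and a hypothetical function name, spmv_multi_case, chosen only for illustration. The values of all cases for one node are stored contiguously, so each indirect lookup through col_idx is followed by ncase consecutive (dense) loads. The OpenACC directives are ignored by a compiler without OpenACC support, in which case the loops run serially on the CPU.

```c
#include <assert.h>

/* Hypothetical helper (not from the paper): y = A * x for ncase
 * right-hand sides at once. A is an nrow-by-ncol matrix in CSR form
 * (row_ptr, col_idx, val). x and y keep the ncase values of each
 * node contiguous: x[node * ncase + c]. */
void spmv_multi_case(int nrow, int ncol, int ncase,
                     const int *row_ptr, const int *col_idx,
                     const double *val, const double *x, double *y)
{
    assert(nrow >= 0 && ncol >= 0 && ncase > 0);
    int nnz = row_ptr[nrow];  /* number of stored nonzeros */

    /* One gang per row; the case index is the vector dimension, so
     * neighbouring threads read neighbouring memory locations. */
    #pragma acc parallel loop gang \
        copyin(row_ptr[0:nrow+1], col_idx[0:nnz], val[0:nnz], x[0:ncol*ncase]) \
        copyout(y[0:nrow*ncase])
    for (int i = 0; i < nrow; ++i) {
        #pragma acc loop vector
        for (int c = 0; c < ncase; ++c) {
            double sum = 0.0;
            /* The only random access is col_idx[k]; it is shared by
             * all ncase cases of this row. */
            #pragma acc loop seq
            for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
                sum += val[k] * x[col_idx[k] * ncase + c];
            }
            y[i * ncase + c] = sum;
        }
    }
}
```

With ncase = 1 this reduces to an ordinary sparse matrix-vector product. Increasing ncase keeps the number of indirect lookups per row unchanged while the arithmetic and contiguous memory traffic grow, which is one plausible reading of "dense computation" in the title.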




