Presentation

ACM Student Research Competition

Poster

Reception

P08: Performance Optimization of Matrix-free Finite-Element Algorithms within deal.II

Session: Poster Reception

Authors

Martin Kronbichler

Karl Ljungkvist

Momme Allalen

Martin Ohlerich

Igor Pasichnyk

Wolfgang A. Wall

Time: Tuesday, November 14th, 5:15pm - 7pm

Location: Four Seasons Ballroom

Description: We present a performance comparison of highly tuned matrix-free finite element kernels from the deal.II finite element library on three contemporary computer architectures: an NVIDIA P100 GPU, an Intel Knights Landing (KNL) Xeon Phi, and two multi-core Intel CPUs (Haswell and Broadwell). The algorithms are based on fast integration on hexahedra using sum factorization techniques. On Cartesian meshes, where the kernel has a relatively high arithmetic intensity, the four processors provide a surprisingly similar computational throughput. On curved meshes, the kernel is heavily memory-bandwidth limited, which reveals distinct differences between the architectures: the P100 is twice as fast as KNL and almost four times as fast as the Haswell and Broadwell CPUs, effectively leveraging the higher memory bandwidth and the favorable shared memory programming model on the GPU.
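The sum factorization idea mentioned in the description can be illustrated with a minimal sketch. It is not the deal.II implementation; the function names and the 1D matrix S are hypothetical placeholders. The sketch assumes a tensor-product basis on a hexahedron, so that evaluating nodal values at all quadrature points reduces to applying one small 1D matrix along each of the three coordinate directions. With n points per direction, this costs on the order of 3 n^4 operations instead of n^6 for the naive triple sum.

```python
def apply_along(S, u, shape, axis):
    """Contract the 1D matrix S (rows x shape[axis]) with one axis of the
    3D array u, which is stored as a flat list in row-major order."""
    n0, n1, n2 = shape
    new_shape = list(shape)
    new_shape[axis] = len(S)
    m0, m1, m2 = new_shape
    out = [0.0] * (m0 * m1 * m2)
    for i in range(m0):
        for j in range(m1):
            for k in range(m2):
                idx = [i, j, k]
                q = idx[axis]          # output index along the contracted axis
                acc = 0.0
                for a in range(shape[axis]):
                    idx[axis] = a      # input index along the contracted axis
                    acc += S[q][a] * u[(idx[0] * n1 + idx[1]) * n2 + idx[2]]
                out[(i * m1 + j) * m2 + k] = acc
    return out, tuple(new_shape)


def evaluate_sum_factorized(S, u, n):
    """Apply S along each of the three directions in turn (sum factorization)."""
    shape = (n, n, n)
    for axis in range(3):
        u, shape = apply_along(S, u, shape, axis)
    return u


def evaluate_naive(S, u, n):
    """Reference: apply the full Kronecker product S x S x S in one triple sum."""
    nq = len(S)
    out = [0.0] * (nq ** 3)
    for q0 in range(nq):
        for q1 in range(nq):
            for q2 in range(nq):
                acc = 0.0
                for a0 in range(n):
                    for a1 in range(n):
                        for a2 in range(n):
                            acc += (S[q0][a0] * S[q1][a1] * S[q2][a2]
                                    * u[(a0 * n + a1) * n + a2])
                out[(q0 * nq + q1) * nq + q2] = acc
    return out


if __name__ == "__main__":
    n = 4
    # Placeholder 1D matrix standing in for shape-function values at
    # quadrature points (an assumption for illustration only).
    S = [[1.0 / (1 + q + a) for a in range(n)] for q in range(n)]
    u = [float(i % 7) for i in range(n ** 3)]
    fast = evaluate_sum_factorized(S, u, n)
    slow = evaluate_naive(S, u, n)
    print("max difference:", max(abs(x - y) for x, y in zip(fast, slow)))
```

Both routines compute the same values up to rounding. The factorized variant does far less arithmetic per degree of freedom, which is one reason such kernels tend to become limited by memory bandwidth once additional per-point geometry data has to be loaded, as on the curved meshes described above.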

Authors

Martin Kronbichler

Technical University Munich