P02: Strassen's Algorithm for Tensor Contraction
SessionPoster Reception
Event Type
ACM Student Research Competition
Poster
Reception
TimeTuesday, November 14th5:15pm -
7pm
LocationFour Seasons Ballroom
DescriptionTensor contraction(TC) is an important computational
kernel widely used in numerous applications. It is a
multi-dimensional generalization of matrix
multiplication(GEMM). While Strassen's algorithm for
GEMM is well studied in theory and practice, extending
it to accelerate TC has not been previously pursued.
Thus, we believe this to be the first work to
demonstrate how one can in practice speed up tensor
contraction with Strassen's algorithm. By adopting a
Block-Scatter-Matrix format, a novel matrix-centric
tensor layout, we can conceptually view TC as GEMM for
general stride storage, with an implicit
tensor-to-matrix transformation. This insight enables us
to tailor a recent state-of-the-art implementation of
Strassen's algorithm to TC, avoiding explicit
transpositions(permutations) and extra workspace, and
reducing the overhead of memory movement that is
incurred. Performance benefits are demonstrated with a
performance model as well as in practice on modern
single core, multicore, and distributed memory parallel
architectures, achieving up to 1.3× speedup.




