P47: Understanding Congestion on Omni-Path Fabrics
Session: Poster Reception
Event Type: ACM Student Research Competition, Poster, Reception
Time: Tuesday, November 14th, 5:15pm - 7pm
Location: Four Seasons Ballroom
Description: High-performance computing systems require high-speed
interconnects, such as InfiniBand (IB), to efficiently
transmit data. Intel’s Omni-Path Architecture (OPA) is a
new interconnect similar to IB that is implemented on
some of Los Alamos National Laboratory’s recent
clusters. Under heavy network traffic, both interconnects
discard packets and performance degrades. Unlike IB,
however, OPA explicitly reports these drops through a
performance counter, congestion discards. Owing to the relative
immaturity of the OPA fabric technology, the correlation
between performance degradation and congestion discards
has not been fully evaluated to date. This research aims
to improve understanding of how congestion affects
cluster performance by injecting enough data into the
OPA fabric to induce performance degradation, so that
the cause of the degradation can be evaluated.
LA-UR-17-26341




