Characterizing Faults, Errors, and Failures in
Extreme-Scale Systems
Session Leader
Additional Session Leader
Event Type
Birds of a Feather
HPC Center Planning and Operations
Reliability
Resiliency
Time
Wednesday, November 15th, 12:15pm - 1:15pm
Location
503-504
Description
This session brings together a group of international
experts from the Accelerated Data Analytics and
Computing Institute to present their efforts in
characterizing faults, errors, and failures in
extreme-scale systems and to discuss practical
experiences with software tools and infrastructures,
including operational aspects. The ADAC Institute is a
collaboration between Oak Ridge National Laboratory, the
Swiss Federal Institute of Technology Zurich, Tokyo
Institute of Technology, Lawrence Livermore National
Laboratory, Juelich Research Centre, the University of
Tokyo, Cray, Nvidia, and Intel. The session includes
short presentations and a discussion that focuses on
future research and development, collaboration
opportunities, and vendor interactions.




