Biography
Nathan conducts resilience and fault-tolerance research
with research scientists, students, and visiting
professors. His interests include hardware reliability,
algorithmic design, and computing on (perhaps extremely)
unreliable hardware. Nathan is project lead of the
Fine-Grained Soft Error Fault Injection (F-SEFI)
framework. F-SEFI is a tool for exploring how
real applications running on real systems tolerate
emulated soft errors. The tool injects soft errors with
extreme precision at specific points in a running
application on real hardware, with real OS kernels, and
real middleware. F-SEFI builds on an open-source virtual
machine and processor emulator to emulate faulty hardware,
but confines the emulated faults to the application of
interest, making it more tractable to study how
applications respond to specific types of soft errors.
Additionally, Nathan conducts field studies of memory and
processor resilience on DOE supercomputers, including
correctable faults and uncorrectable errors.
Presentations
ACM Student Research Competition
Poster
Reception




