Browsing by Subject "Fault tolerance, resilience, failure analysis, high performance computing."
Analyzing a Five-year Failure Record of a Leadership-class Supercomputer
(Institute of Electrical and Electronics Engineers, Incorporated (IEEE), 2019-10-18)
Extreme-scale computing systems are required to solve some of the grand challenges in science and technology. From astrophysics to molecular biology, supercomputers are an essential tool to accelerate scientific discovery. ...