    • Evaluating resilience of deep learning models 

      Rojas, Elvis; Nicolae, Bogdan; Meneses, Esteban (Instituto Tecnológico de Costa Rica, 2020)
      Deep learning applications have become a valuable tool to solve complex problems in many critical areas. It is important to provide reliability on the outputs of those applications, even if failures occur during execution. ...

    • Understanding failures through the lifetime of a top-level supercomputer

      Rojas, Elvis; Meneses, Esteban; Jones, Terry; Maxwell, Don (Academic Press Inc., 2021-04-20)
      High performance computing systems are required to solve grand challenges in many scientific disciplines. These systems assemble many components to be powerful enough for solving extremely complex problems. An inherent ...