Browsing Ponencias by Subject "RESILIENCE"
Towards a model to estimate the reliability of large-scale hybrid supercomputers
(Springer Nature, 2020-08-18) Supercomputers stand as a fundamental tool for developing our understanding of the universe. State-of-the-art scientific simulations, big data analyses, and machine learning executions require high performance computing ...
Understanding soft error sensitivity of deep learning models and frameworks through checkpoint alteration
(Institute of Electrical and Electronics Engineers (IEEE), 2021-10-13) The convergence of artificial intelligence, high-performance computing (HPC), and data science brings unique opportunities for markedly advancing discoveries that leverage synergies across scientific domains. Recently, deep ...