Physics-Based Checksums for Silent-Error Detection in PDE Solvers
[1] Shuaiwen Song, et al. New-Sum: A Novel Online ABFT Scheme For General Iterative Methods, 2016, HPDC.
[2] Franck Cappello, et al. MACORD: Online Adaptive Machine Learning Framework for Silent Error Detection, 2017, 2017 IEEE International Conference on Cluster Computing (CLUSTER).
[3] John Daly. A Model for Predicting the Optimum Checkpoint Interval for Restart Dumps, 2003, International Conference on Computational Science.
[4] Martin C. Rinard. Parallel Synchronization-Free Approximate Data Structure Construction, 2013, HotPar.
[5] Manish Parashar, et al. Local recovery and failure masking for stencil-based applications at extreme scales, 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.
[6] Christian Engelmann, et al. Resilience Design Patterns: A Structured Approach to Resilience at Extreme Scale, 2016, Supercomput. Front. Innov.
[7] Franck Cappello, et al. Toward Exascale Resilience: 2014 update, 2014, Supercomput. Front. Innov.
[8] Robert C. Armstrong, et al. In-Situ Mitigation of Silent Data Corruption in PDE Solvers, 2016, FTXS@HPDC.
[9] Kurt B. Ferreira, et al. Fault-tolerant linear solvers via selective reliability, 2012, ArXiv.
[10] Vivek Sarkar, et al. ASC CSSE Level 2 Milestone #6362: Resilient Asynchronous Many Task Programming Model, 2018.
[11] Sandia Report, et al. Improving Performance via Mini-applications, 2009.
[12] Yves Robert, et al. Which Verification for Soft Error Detection?, 2015, 2015 IEEE 22nd International Conference on High Performance Computing (HiPC).
[13] Austin R. Benson, et al. Silent error detection in numerical time-stepping schemes, 2015, Int. J. High Perform. Comput. Appl.