Application-Based Fault Tolerance Techniques for Fully Protecting Sparse Matrix Solvers
[1] Simon McIntosh-Smith, et al. Software-level Fault Tolerant Framework for Task-based Applications, 2016, HiPC 2016.
[2] F. Mueller, et al. Quantifying the Impact of Single Bit Flips on Floating Point Arithmetic, 2013.
[3] J. Ziegler, et al. Effect of Cosmic Rays on Computer Memories, 1979, Science.
[4] Vilas Sridharan, et al. A study of DRAM failures in the field, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[5] Eduardo Pinheiro, et al. DRAM errors in the wild: a large-scale field study, 2009, SIGMETRICS '09.
[6] Sudhanva Gurumurthi, et al. Feng Shui of supercomputer memory: positional effects in DRAM and SRAM faults, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[7] Luigi Carro, et al. GPGPUs: How to combine high computational power with high reliability, 2014, 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[8] Philip Koopman, et al. 32-bit cyclic redundancy codes for Internet applications, 2002, Proceedings International Conference on Dependable Systems and Networks.
[9] John Shalf, et al. Memory Errors in Modern Systems: The Good, The Bad, and The Ugly, 2015, ASPLOS.
[10] Richard W. Hamming, et al. Error detecting and error correcting codes, 1950.
[11] Timothy J. Dell, et al. A white paper on the benefits of chipkill-correct ECC for PC server main memory, 1997.
[12] Unsal Osman, et al. Unprotected Computing: A Large-Scale Study of DRAM Raw Error Rate on a Supercomputer, 2016.
[13] Luigi Carro, et al. Understanding GPU errors on large-scale HPC systems and the implications for system design and operation, 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[14] Simon McIntosh-Smith, et al. Application-based fault tolerance techniques for sparse matrix solvers, 2018, Int. J. High Perform. Comput. Appl.
[15] Doe Hyun Yoon, et al. Virtualized and flexible ECC for main memory, 2010, ASPLOS XV.
[16] L. Borucki, et al. Comparison of accelerated DRAM soft error rates measured at component and system level, 2008, 2008 IEEE International Reliability Physics Symposium.
[17] Bin Nie, et al. A large-scale study of soft-errors on GPUs in the field, 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).