Fault Modeling of Extreme Scale Applications Using Machine Learning
[1] Sudhanva Gurumurthi,et al. Feng Shui of supercomputer memory: positional effects in DRAM and SRAM faults , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[2] David E. Bernholdt,et al. High performance computational chemistry: An overview of NWChem, a distributed parallel application , 2000 .
[3] David R. Kaeli,et al. Quantifying software vulnerability , 2008, WREFT '08.
[4] Zizhong Chen,et al. Correcting soft errors online in LU factorization , 2013, HPDC '13.
[5] Amith R. Mamidala,et al. Automatic Path Migration over InfiniBand: Early Experiences , 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[6] Abhinav Vishnu,et al. A Software Based Approach for Providing Network Fault Tolerance in Clusters with uDAPL interface: MPI Level Design and Performance Evaluation , 2006, ACM/IEEE SC 2006 Conference (SC'06).
[7] Ian Karlin,et al. LULESH 2.0 Updates and Changes , 2013 .
[8] Amith R. Mamidala,et al. Topology agnostic hot‐spot avoidance with InfiniBand , 2009, Concurr. Comput. Pract. Exp..
[10] Song Fu,et al. F-SEFI: A Fine-Grained Soft Error Fault Injection Tool for Profiling Application Vulnerability , 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[11] Gaël Varoquaux,et al. Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..
[12] Dong Li,et al. Classifying soft error vulnerabilities in extreme-Scale scientific applications using a binary instrumentation tool , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[13] Harish Patil,et al. PinADX: an interface for customizable debugging with dynamic instrumentation , 2012, CGO '12.
[14] Dong Li,et al. Quantitatively Modeling Application Resilience with the Data Vulnerability Factor , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[15] Martin Schulz,et al. Fault resilience of the algebraic multi-grid solver , 2012, ICS '12.
[16] Shuaiwen Song,et al. Fault-tolerant communication runtime support for data-centric programming models , 2010, 2010 International Conference on High Performance Computing.
[17] Karthik Pattabiraman,et al. Soft-LLFI: A Comprehensive Framework for Software Fault Injection , 2014, 2014 IEEE International Symposium on Software Reliability Engineering Workshops.
[18] Shuaiwen Song,et al. Designing energy efficient communication runtime systems: a view from PGAS models , 2013, The Journal of Supercomputing.
[19] Balázs Kégl,et al. The Higgs boson machine learning challenge , 2014, HEPML@NIPS.
[20] Shubhendu S. Mukherjee,et al. Measuring Architectural Vulnerability Factors , 2003, IEEE Micro.
[21] Bianca Schroeder,et al. A Large-Scale Study of Failures in High-Performance Computing Systems , 2010, IEEE Trans. Dependable Secur. Comput..
[22] Bronis R. de Supinski,et al. Design, Modeling, and Evaluation of a Scalable Multi-level Checkpointing System , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[23] Robert J. Harrison,et al. Liquid water: obtaining the right answer for the right reasons , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[24] Shuaiwen Song,et al. Designing Energy Efficient Communication Runtime Systems for Data Centric Programming Models , 2010, 2010 IEEE/ACM Int'l Conference on Green Computing and Communications & Int'l Conference on Cyber, Physical and Social Computing.
[25] William Gropp,et al. Fault Tolerance in Message Passing Interface Programs , 2004, Int. J. High Perform. Comput. Appl..
[26] Abhinav Vishnu,et al. A Case for Soft Error Detection and Correction in Computational Chemistry. , 2013, Journal of chemical theory and computation.
[27] Bronis R. de Supinski,et al. Soft error vulnerability of iterative linear algebra methods , 2007, ICS '08.
[28] Padma Raghavan,et al. Characterizing the impact of soft errors on iterative methods in scientific computing , 2011, ICS '11.
[29] Vilas Sridharan,et al. A study of DRAM failures in the field , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[30] Greg Bronevetsky,et al. Proceedings of the 2008 workshop on Radiation effects and fault tolerance in nanometer technologies , 2008 .
[31] Abhinav Vishnu,et al. Designing a Scalable Fault Tolerance Model for High Performance Computational Chemistry: A Case Study with Coupled Cluster Perturbative Triples. , 2011, Journal of chemical theory and computation.