Operating system support for redundant multithreading
[1] Richard D. Schlichting,et al. Fail-stop processors: an approach to designing fault-tolerant computing systems , 1983, TOCS.
[2] Xin Li,et al. A Memory Soft Error Measurement on Production Systems , 2007, USENIX Annual Technical Conference.
[3] Hsien-Hsin S. Lee,et al. 3D-MAPS: 3D Massively parallel processor with stacked memory , 2012, 2012 IEEE International Solid-State Circuits Conference.
[4] Fred B. Schneider,et al. Implementing fault-tolerant services using the state machine approach: a tutorial , 1990, CSUR.
[5] George Candea,et al. Microreboot - A Technique for Cheap Recovery , 2004, OSDI.
[6] Christof Fetzer,et al. ANB- and ANBDmem-Encoding: Detecting Hardware Errors in Software , 2010, SAFECOMP.
[7] Todd M. Austin,et al. DIVA: a reliable substrate for deep submicron microarchitecture design , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[8] Doug Lea,et al. Concurrent Programming In Java , 1996 .
[9] Dean M. Tullsen,et al. Simultaneous multithreading: Maximizing on-chip parallelism , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[10] Jacob A. Abraham,et al. Algorithm-Based Fault Tolerance for Matrix Operations , 1984, IEEE Transactions on Computers.
[11] Scott A. Mahlke,et al. Runtime asynchronous fault tolerance via speculation , 2012, CGO '12.
[12] Edward A. Lee. The problem with threads , 2006, Computer.
[13] Michael Norrish,et al. seL4: formal verification of an OS kernel , 2009, SOSP '09.
[14] Puneet Gupta,et al. Hardware Variability-Aware Duty Cycling for Embedded Sensors , 2013, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[15] Stefan Götz,et al. Unmodified Device Driver Reuse and Improved System Dependability via Virtual Machines , 2004, OSDI.
[16] D. B. Davis,et al. Intel Corp. , 1993 .
[17] Takeshi Yoshimura,et al. Is Linux Kernel Oops Useful or Not? , 2012, HotDep.
[18] Anoop Gupta,et al. Memory consistency and event ordering in scalable shared-memory multiprocessors , 1990, ISCA '90.
[19] John P. Hayes,et al. Low-cost on-line fault detection using control flow assertions , 2003, 9th IEEE On-Line Testing Symposium, 2003. IOLTS 2003..
[20] John R. Levine. Linkers and Loaders , 1999 .
[21] Jeffrey Overbey,et al. A type and effect system for deterministic parallel Java , 2009, OOPSLA 2009.
[22] Quinn Jacobson,et al. ERSA: error resilient system architecture for probabilistic applications , 2010, DATE 2010.
[23] Gerald J. Popek,et al. Formal requirements for virtualizable third generation architectures , 1974, SOSP '73.
[24] Karthik Pattabiraman,et al. Towards understanding the effects of intermittent hardware faults on programs , 2010, 2010 International Conference on Dependable Systems and Networks Workshops (DSN-W).
[25] Michael Engel,et al. Investigating the Limitations of PVF for Realistic Program Vulnerability Assessment , 2012 .
[26] Andrew M. Tyrrell. Recovery blocks and algorithm-based fault tolerance , 1996, Proceedings of EUROMICRO 96. 22nd Euromicro Conference. Beyond 2000: Hardware and Software Design Strategies.
[27] Christophe Calvès,et al. Faults in linux: ten years later , 2011, ASPLOS XVI.
[28] Y. C. Yeh,et al. Triple-triple redundant 777 primary flight computer , 1996, 1996 IEEE Aerospace Applications Conference. Proceedings.
[29] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[30] Ravishankar K. Iyer,et al. Active replication of multithreaded applications , 2006, IEEE Transactions on Parallel and Distributed Systems.
[31] Udo Steinberg,et al. NOVA: a microhypervisor-based secure virtualization architecture , 2010, EuroSys '10.
[32] Samuel T. King,et al. Recovery domains: an organizing principle for recoverable operating systems , 2009, ASPLOS.
[33] Tipp Moseley,et al. Using Process-Level Redundancy to Exploit Multiple Cores for Transient Fault Tolerance , 2007, 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07).
[34] Edward J. McCluskey,et al. Control-flow checking by software signatures , 2002, IEEE Trans. Reliab..
[35] James Reinders,et al. Intel Xeon Phi Coprocessor High Performance Programming , 2013 .
[36] Alan Wood,et al. The impact of new technology on soft error rates , 2011, 2011 International Reliability Physics Symposium.
[37] Bogdan M. Wilamowski,et al. The Transmission Control Protocol , 2005, The Industrial Information Technology Handbook.
[38] David I. August,et al. SWIFT: software implemented fault tolerance , 2005, International Symposium on Code Generation and Optimization.
[39] Matt Davis. Creating a vDSO: the colonel's other chicken , 2011 .
[40] Narayanan Ganapathy,et al. General Purpose Operating System Support for Multiple Page Sizes , 1998, USENIX Annual Technical Conference.
[41] Christof Fetzer,et al. AN-Encoding Compiler: Building Safety-Critical Systems with Commodity Hardware , 2009, SAFECOMP.
[42] David García,et al. NonStop® advanced architecture , 2005, 2005 International Conference on Dependable Systems and Networks (DSN'05).
[43] Tryggve Fossum,et al. Cache scrubbing in microprocessors: myth or necessity? , 2004, 10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings..
[44] Ralph Johnson,et al. Design Patterns: Elements of Reusable Object-Oriented Software , 2019 .
[45] Leonid Ryzhyk,et al. Automatic device driver synthesis with termite , 2009, SOSP '09.
[46] Asim Kadav,et al. Tolerating hardware device failures in software , 2009, SOSP '09.
[47] Hermann Härtig,et al. Position summary: a streaming interface for real-time interprocess communication , 2001, Proceedings Eighth Workshop on Hot Topics in Operating Systems.
[48] Raphael R. Some,et al. Experimental evaluation of a COTS system for space applications , 2002, Proceedings International Conference on Dependable Systems and Networks.
[49] A. Taber,et al. Single event upset in avionics , 1993 .
[50] Yang Wang,et al. All about Eve: Execute-Verify Replication for Multi-Core Servers , 2012, OSDI.
[51] Hermann Härtig,et al. Where Have all the Cycles Gone? - Investigating Runtime Overheads of OS-Assisted Replication , 2013, GI-Jahrestagung.
[52] Dan Grossman,et al. CoreDet: a compiler and runtime system for deterministic multithreaded execution , 2010, ASPLOS 2010.
[53] Michael N. Lovellette,et al. Strategies for fault-tolerant, space-based computing: Lessons learned from the ARGOS testbed , 2002, Proceedings, IEEE Aerospace Conference.
[54] Michael S. Floyd,et al. Fault-tolerant design of the IBM POWER6™ microprocessor , 2007, 2007 IEEE Hot Chips 19 Symposium (HCS).
[55] Rolf Riesen,et al. Detection and Correction of Silent Data Corruption for Large-Scale High-Performance Computing , 2012, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[56] Marek Olszewski,et al. Kendo: efficient deterministic multithreading in software , 2009, ASPLOS.
[57] Hermann Härtig,et al. Who Watches the Watchmen? Protecting Operating System Reliability Mechanisms , 2012, HotDep.
[58] Junfeng Yang,et al. Stable Deterministic Multithreading through Schedule Memoization , 2010, OSDI.
[59] Shubhendu S. Mukherjee,et al. Transient fault detection via simultaneous multithreading , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[60] William J. Bolosky,et al. Mach: A New Kernel Foundation for UNIX Development , 1986, USENIX Summer.
[61] Todd M. Austin,et al. A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor , 2003, MICRO.
[62] Sarita V. Adve,et al. Low-cost program-level detectors for reducing silent data corruptions , 2012, IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2012).
[63] Dong Zhou,et al. Rex: replication at the speed of multi-core , 2014, EuroSys '14.
[64] Sani R. Nassif. The light at the end of the CMOS tunnel , 2010, ASAP.
[65] Saibal Mukhopadhyay,et al. Leakage current mechanisms and leakage reduction techniques in deep-submicrometer CMOS circuits , 2003, Proc. IEEE.
[66] Thomas F. Knight,et al. A Minimal Trusted Computing Base for Dynamically Ensuring Secure Information Flow , 2001 .
[67] Shubu Mukherjee,et al. Architecture Design for Soft Errors , 2008 .
[68] Sarita V. Adve,et al. Relyzer: exploiting application-level fault equivalence to analyze application resiliency to transient faults , 2012, ASPLOS XVII.
[69] Jean Arlat,et al. Dependability of COTS Microkernel-Based Systems , 2002, IEEE Trans. Computers.
[70] Christof Fetzer,et al. Software-Implemented Hardware Error Detection: Costs and Gains , 2010, 2010 Third International Conference on Dependability.
[71] Rolf Ernst,et al. Designing an Analyzable and Resilient Embedded Operating System , 2012, GI-Jahrestagung.
[72] Mateo Valero,et al. FIMSIM: A fault injection infrastructure for microarchitectural simulators , 2011, 2011 IEEE 29th International Conference on Computer Design (ICCD).
[73] Ravishankar K. Iyer,et al. Error sensitivity of the Linux kernel executing on PowerPC G4 and Pentium 4 processors , 2004, International Conference on Dependable Systems and Networks, 2004.
[74] Virendra J. Marathe,et al. Callisto: co-scheduling parallel runtime systems , 2014, EuroSys '14.
[75] J. Ziegler,et al. Effect of Cosmic Rays on Computer Memories , 1979, Science.
[76] Roy H. Campbell,et al. CuriOS: Improving Reliability through Operating System Structure , 2008, OSDI.
[77] Muhammad Shafique,et al. Reliable software for unreliable hardware: Embedded code generation aiming at reliability , 2011, 2011 Proceedings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).
[78] David R. Kaeli,et al. Quantifying software vulnerability , 2008, WREFT '08.
[79] Robert Baumann,et al. Soft errors in advanced computer systems , 2005, IEEE Design & Test of Computers.
[80] Harish Patil,et al. Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.
[81] Sarita V. Adve,et al. Understanding the propagation of hard errors to software and implications for resilient system design , 2008, ASPLOS.
[82] Olaf Spinczyk,et al. Protecting the Dynamic Dispatch in C++ by Dependability Aspects , 2012, GI-Jahrestagung.
[83] Joel Emer,et al. A systematic methodology to compute the architectural vulnerability factors for a high-performance microprocessor , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..
[84] Hermann Härtig,et al. Can we put concurrency back into redundant multithreading? , 2014, 2014 International Conference on Embedded Software (EMSOFT).
[85] Michael Stumm,et al. FlexSC: Flexible System Call Scheduling with Exception-Less System Calls , 2010, OSDI.
[86] Olaf Spinczyk,et al. Generative software-based memory error detection and correction for operating system data structures , 2013, 2013 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN).
[87] Bryan Ford,et al. Deterministic OpenMP for Race-Free Parallelism , 2011, HotPar.
[88] Trevor Mudge,et al. Razor: a low-power pipeline based on circuit-level timing speculation , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..
[89] J. Maiz,et al. Characterization of multi-bit soft error events in advanced SRAMs , 2003, IEEE International Electron Devices Meeting 2003.
[90] John Paul Shen,et al. Continuous signature monitoring: low-cost concurrent detection of processor control errors , 1990, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..
[91] Josep Torrellas,et al. Light64: Lightweight hardware support for data race detection during Systematic Testing of parallel programs , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[92] Jacob A. Abraham,et al. Quantitative evaluation of soft error injection techniques for robust system design , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).
[93] Babak Falsafi,et al. Fingerprinting: Bounding Soft-Error-Detection Latency and Bandwidth , 2004, IEEE Micro.
[94] Herbert Bos,et al. Keep net working - on a dependable and fast networking stack , 2012, IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2012).
[95] Jochen Liedtke,et al. Improving IPC by kernel design , 1994, SOSP '93.
[96] Jakob Engblom,et al. The worst-case execution-time problem—overview of methods and survey of tools , 2008, TECS.
[97] Maurice Herlihy,et al. The art of multiprocessor programming , 2020, PODC '06.
[98] Rolf Ernst,et al. Response-Time Analysis of Parallel Fork-Join Workloads with Real-Time Constraints , 2013, 2013 25th Euromicro Conference on Real-Time Systems.
[99] Cristiano Giuffrida,et al. We Crashed, Now What? , 2010, HotDep.
[100] Fred B. Schneider,et al. Hypervisor-based fault tolerance , 1996, TOCS.
[101] Junfeng Yang,et al. Parrot: a practical runtime for deterministic, stable, and reliable threads , 2013, SOSP.
[102] Randy H. Katz,et al. A case for redundant arrays of inexpensive disks (RAID) , 1988, SIGMOD '88.
[103] Qin Zhao,et al. Practical memory checking with Dr. Memory , 2011, International Symposium on Code Generation and Optimization (CGO 2011).
[104] Luis Ceze,et al. Deterministic Process Groups in dOS , 2010, OSDI.
[105] Michael Engel,et al. Fast and Low-Cost Instruction-Aware Fault Injection , 2013, GI-Jahrestagung.
[106] Babak Falsafi,et al. Fingerprinting: bounding soft-error-detection latency and bandwidth , 2004, IEEE Micro.
[107] Edsger W. Dijkstra,et al. A note on two problems in connexion with graphs , 1959, Numerische Mathematik.
[108] Rolf Ernst,et al. IDAMC: A Many-Core Platform with Run-Time Monitoring for Mixed-Criticality , 2012, 2012 IEEE 14th International Symposium on High-Assurance Systems Engineering.
[109] Rami G. Melhem,et al. The effects of energy management on reliability in real-time embedded systems , 2004, IEEE/ACM International Conference on Computer Aided Design, 2004. ICCAD-2004..
[110] Jakob Eriksson,et al. Conversion: multi-version concurrency control for main memory segments , 2013, EuroSys '13.
[111] Joel F. Bartlett,et al. A NonStop kernel , 1981, SOSP.
[112] Norbert Wehn,et al. Reliable on-chip systems in the nano-era: Lessons learnt and future trends , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).
[113] Brian Randell,et al. Reliability Issues in Computing System Design , 1978, CSUR.
[114] Sorav Bansal,et al. Fast dynamic binary translation for the kernel , 2013, SOSP.
[115] Doug Lea,et al. Concurrent programming in Java - design principles and patterns , 1996, Java series.
[116] Muhammad Shafique,et al. Instruction scheduling for reliability-aware compilation , 2012, DAC Design Automation Conference 2012.
[117] Tipp Moseley,et al. PLR: A Software Approach to Transient Fault Tolerance for Multicore Architectures , 2009, IEEE Transactions on Dependable and Secure Computing.
[118] Jim Gray,et al. Why Do Computers Stop and What Can Be Done About It? , 1986, Symposium on Reliability in Distributed Software and Database Systems.
[119] Albert Meixner,et al. Detouring: Translating software to circumvent hard faults in simple cores , 2008, 2008 IEEE International Conference on Dependable Systems and Networks With FTCS and DCC (DSN).
[120] Julian Stecklina. Shrinking the hypervisor one subsystem at a time: a userspace packet switch for virtual machines , 2014, VEE '14.
[121] J. N. Herder,et al. Building a Dependable Operating System: Fault Tolerance in MINIX 3 , 2005 .
[122] Maurice Herlihy,et al. A methodology for implementing highly concurrent data objects , 1993, TOPL.
[123] L. Dagum,et al. OpenMP: an industry standard API for shared-memory programming , 1998 .
[124] Adam Lackorzynski,et al. L4Linux Porting Optimizations , 2004 .
[125] Calton Pu,et al. Buffer overflows: attacks and defenses for the vulnerability of the decade , 2000, Proceedings DARPA Information Survivability Conference and Exposition. DISCEX'00.
[126] David Thomas,et al. The Art in Computer Programming , 2001 .
[127] Gene Cooperman,et al. DMTCP: Transparent checkpointing for cluster computations and the desktop , 2007, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[128] Dong Li,et al. Rethinking algorithm-based fault tolerance with a cooperative software-hardware approach , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[129] James Hendricks,et al. Secure bootstrap is not enough: shoring up the trusted computing base , 2004, EW 11.
[130] Steve McConnell,et al. Code complete - a practical handbook of software construction, 2nd Edition , 1993 .
[131] G. Gasiot,et al. Altitude and underground real-time SER characterization of CMOS 65nm SRAM , 2008, 2008 European Conference on Radiation and Its Effects on Components and Systems.
[132] Lingamneni Avinash,et al. Sustaining moore's law in embedded computing through probabilistic and approximate design: retrospects and prospects , 2009, CASES '09.
[133] Neal H. Walfield,et al. Viengoos: A Framework for Stakeholder-Directed Resource Allocation , 2009 .
[134] Frank Bellosa,et al. XLH: More Effective Memory Deduplication Scanners Through Cross-layer Hints , 2013, USENIX Annual Technical Conference.
[135] Carl E. Landwehr,et al. Basic concepts and taxonomy of dependable and secure computing , 2004, IEEE Transactions on Dependable and Secure Computing.
[136] Jeffrey Overbey,et al. A type and effect system for deterministic parallel Java , 2009, OOPSLA '09.
[137] Yun Zhang,et al. DAFT: decoupled acyclic fault tolerance , 2010, PACT '10.
[138] Edward J. McCluskey,et al. Executable assertions and flight software , 1984 .
[139] Donald E. Knuth,et al. The Art of Computer Programming: Volume 3: Sorting and Searching , 1998 .
[140] Calton Pu,et al. Reducing TCB complexity for security-sensitive applications , 2006 .
[141] Leslie Lamport,et al. How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs , 2016, IEEE Transactions on Computers.
[142] Dawson R. Engler,et al. Bugs as deviant behavior: a general approach to inferring errors in systems code , 2001, SOSP.
[143] Nicholas Nethercote,et al. Valgrind: a framework for heavyweight dynamic binary instrumentation , 2007, PLDI '07.
[144] Dirk Vogt,et al. Stay strong, stay safe: Enhancing Reliability of a Secure Operating System , 2010 .
[145] Leonid Ryzhyk,et al. Dingo: taming device drivers , 2009, EuroSys '09.
[146] P. Roche,et al. Altitude and Underground Real-Time SER Characterization of CMOS 65 nm SRAM , 2008, IEEE Transactions on Nuclear Science.
[147] Norbert Wehn,et al. A Cross-Layer Technology-Based Study of How Memory Errors Impact System Resilience , 2013, IEEE Micro.
[148] Trent Jaeger,et al. The SawMill framework for virtual memory diversity , 2001, Proceedings 6th Australasian Computer Systems Architecture Conference. ACSAC 2001.
[149] Bryan Ford,et al. Workspace Consistency: A Programming Model for Shared Memory Parallelism , 2011 .
[150] Richard W. Hamming,et al. Error detecting and error correcting codes , 1950 .
[151] Y. Taur,et al. The incredible shrinking transistor , 1999, IEEE Spectrum.
[152] Dieter K. Schroder,et al. Negative bias temperature instability: What do we understand? , 2007, Microelectron. Reliab..
[153] Emery D. Berger,et al. Dthreads: efficient deterministic multithreading , 2011, SOSP.
[154] A. Asenov,et al. Analysis of Threshold Voltage Distribution Due to Random Dopants: A 100 000-Sample 3-D Simulation Study , 2009, IEEE Transactions on Electron Devices.
[155] Sen Hu,et al. Efficient system-enforced deterministic parallelism , 2010, OSDI.
[156] Julien Delange,et al. POK, an ARINC653-compliant operating system released under the BSD license , 2011 .
[157] Cheng Wang,et al. Compiler-Managed Software-based Redundant Multi-Threading for Transient Fault Detection , 2007, International Symposium on Code Generation and Optimization (CGO'07).
[158] Rüdiger Kapitza,et al. Fail∗: Towards a versatile fault-injection experiment framework , 2012, ARCS 2012.
[159] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1969 .
[160] James L. Walsh,et al. IBM experiments in soft fails in computer electronics (1978-1994) , 1996, IBM J. Res. Dev..
[161] William G. Brown,et al. Improvement of Electronic-Computer Reliability through the Use of Redundancy , 1961, IRE Trans. Electron. Comput..
[162] Rolf Ernst,et al. Failure analysis of a network-on-chip for real-time mixed-critical systems , 2014, 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[163] Bianca Schroeder,et al. Cosmic rays don't strike twice: understanding the nature of DRAM errors and the implications for system design , 2012, ASPLOS XVII.
[164] Satish Narayanasamy,et al. Respec: Efficient Online Multiprocessor Replay via Speculation and External Determinism , 2010, ASPLOS 2010.
[165] Mark S. Miller,et al. Capability Myths Demolished , 2003 .
[166] Calton Pu,et al. Reducing TCB complexity for security-sensitive applications: three case studies , 2006, EuroSys.
[167] A. Kivity,et al. kvm: the Linux Virtual Machine Monitor , 2007 .
[168] L. Sterpone,et al. An Analysis of SEU Effects in Embedded Operating Systems for Real-Time Applications , 2007, 2007 IEEE International Symposium on Industrial Electronics.
[169] Steven K. Reinhardt,et al. Transient fault detection via simultaneous multithreading , 2000 .
[170] Brian N. Bershad,et al. Recovering device drivers , 2004, TOCS.
[171] Timothy J. Slegel,et al. IBM's S/390 G5 microprocessor design , 1999, IEEE Micro.
[172] Martin Kriegel. Bounding Error Detection Latencies for Replicated Execution , 2013 .
[173] Adam Lackorzynski,et al. Taming subsystems: capabilities as universal resource access control in L4 , 2009, IIES '09.
[174] Vilas Sridharan,et al. A study of DRAM failures in the field , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[175] Sanjay J. Patel,et al. Y-branches: when you come to a fork in the road, take it , 2003, 2003 12th International Conference on Parallel Architectures and Compilation Techniques.
[176] Gernot Heiser,et al. From L3 to seL4 what have we learnt in 20 years of L4 microkernels? , 2013, SOSP.
[177] Tobias Distler,et al. Storyboard: Optimistic Deterministic Multithreading , 2010, HotDep.
[178] Paul D. Ezhilchelvan,et al. Implementing Fail-Silent Nodes for Distributed Systems , 1996, IEEE Trans. Computers.
[179] Matteo Frigo,et al. The implementation of the Cilk-5 multithreaded language , 1998, PLDI.
[180] Zaid Al-Ars,et al. Efficient software-based fault tolerance approach on multicore platforms , 2013, 2013 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[181] D. Varghese,et al. A comprehensive model for PMOS NBTI degradation: Recent progress , 2007, Microelectron. Reliab..
[182] Miguel Miranda. When every atom counts , 2012, IEEE Spectrum.
[183] J. Black,et al. Electromigration—A brief survey and some recent results , 1969 .
[184] Konstantin Serebryany,et al. ThreadSanitizer: data race detection in practice , 2009, WBIA '09.
[185] Emery D. Berger,et al. Grace: safe multithreaded programming for C/C++ , 2009, OOPSLA 2009.
[186] Timothy G. Mattson,et al. Light-weight communications on Intel's single-chip cloud computer processor , 2011, OPSR.
[187] J. Liou,et al. A model for MOS failure prediction due to hot-carriers injection , 1996, Proceedings 1996 IEEE Hong Kong Electron Devices Meeting.
[188] Timothy J. Dell,et al. A white paper on the benefits of chipkill-correct ecc for pc server main memory , 1997 .
[189] G.E. Moore,et al. Cramming More Components Onto Integrated Circuits , 1998, Proceedings of the IEEE.
[190] Eduardo Pinheiro,et al. DRAM errors in the wild: a large-scale field study , 2009, SIGMETRICS '09.
[191] Shekhar Y. Borkar,et al. Designing reliable systems from unreliable components: the challenges of transistor variability and degradation , 2005, IEEE Micro.
[192] J. Keane,et al. An odometer for CPUs , 2011, IEEE Spectrum.
[193] Philip Koopman,et al. 32-bit cyclic redundancy codes for Internet applications , 2002, Proceedings International Conference on Dependable Systems and Networks.
[194] Anoop Gupta,et al. The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.
[195] Michael Engel,et al. The Reliable Computing Base - A Paradigm for Software-based Reliability , 2012, GI-Jahrestagung.
[196] Ravishankar K. Iyer,et al. An experimental study of soft errors in microprocessors , 2005, IEEE Micro.
[197] Carsten Weinhold. jVPFS: Adding Robustness to a Secure Stacked File System with Untrusted Local Storage Components , 2011, USENIX Annual Technical Conference.
[198] Edward J. McCluskey,et al. Error detection by duplicated instructions in super-scalar processors , 2002, IEEE Trans. Reliab..
[199] Karthikeyan Sankaralingam,et al. Dark silicon and the end of multicore scaling , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[200] Trevor Mudge,et al. MiBench: A free, commercially representative embedded benchmark suite , 2001 .
[201] Michael Stumm,et al. Otherworld: giving applications a chance to survive OS kernel crashes , 2010, EuroSys '10.