Comprehensive and Efficient Runtime Checking in System Software through Watchdogs
[1] Sam Toueg, et al. Unreliable failure detectors for reliable distributed systems, 1996, JACM.
[2] Marcos K. Aguilera, et al. Improving Availability in Distributed Systems with Failure Informers, 2013, NSDI.
[3] Marcos K. Aguilera, et al. Taming uncertainty in distributed systems with help from the network, 2015, EuroSys.
[4] Andrea C. Arpaci-Dusseau, et al. Fail-stutter fault tolerance, 2001, Proceedings Eighth Workshop on Hot Topics in Operating Systems.
[5] George Candea, et al. Failure sketching: a technique for automated root cause diagnosis of in-production failures, 2015, SOSP.
[6] Tanakorn Leesatapornwongsa, et al. Limplock: understanding the impact of limpware on scale-out cloud systems, 2013, SoCC.
[7] Martín Abadi, et al. Control-flow integrity, 2005, CCS '05.
[8] Arnold Berger, et al. Embedded Systems Design: An Introduction to Processes, Tools, and Techniques, 2001.
[9] Ding Yuan, et al. Pensieve: Non-Intrusive Failure Reproduction for Distributed Systems using the Event Chaining Approach, 2017, SOSP.
[10] Marcos K. Aguilera, et al. Detecting failures in distributed systems with the Falcon spy network, 2011, SOSP.
[11] Peng Huang, et al. 13th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2018, Carlsbad, CA, USA, October 8-10, 2018, 2018, OSDI.
[12] Miguel Correia, et al. Practical Hardening of Crash-Tolerant Systems, 2012, USENIX Annual Technical Conference.
[13] Ashutosh Gupta, et al. InvGen: An Efficient Invariant Generator, 2009, CAV.
[14] Laurie Hendren, et al. Soot: a Java bytecode optimization framework, 2010, CASCON.
[15] Edward J. McCluskey, et al. Concurrent Error Detection Using Watchdog Processors - A Survey, 1988, IEEE Trans. Computers.
[16] David W. Binkley, et al. Program slicing, 2008, 2008 Frontiers of Software Maintenance.
[17] Niall Murphy, et al. Site Reliability Engineering: How Google Runs Production Systems, 2016.
[18] Yuanyuan Zhou, et al. Early Detection of Configuration Errors to Reduce Failure Damage, 2016, USENIX Annual Technical Conference.
[19] William G. Griswold, et al. Dynamically discovering likely program invariants to support program evolution, 1999, Proceedings of the 1999 International Conference on Software Engineering (IEEE Cat. No.99CB37002).
[20] George Candea, et al. Microreboot - A Technique for Cheap Recovery, 2004, OSDI.
[21] Peng Huang, et al. Gray Failure: The Achilles' Heel of Cloud-Scale Systems, 2017, HotOS.
[22] Robert B. Ross, et al. Fail-Slow at Scale, 2018, ACM Trans. Storage.
[23] Marcos K. Aguilera, et al. No Time for Asynchrony, 2009, HotOS.
[24] Jeffrey C. Mogul, et al. Thinking about Availability in Large Service Infrastructures, 2017, HotOS.
[25] Andrea C. Arpaci-Dusseau, et al. IRON file systems, 2005, SOSP '05.