Detecting failures in distributed systems with the Falcon spy network
Marcos K. Aguilera | Hao Wu | Michael Walfish | Wei-Lun Hung | Joshua B. Leners
[1] Yair Amir,et al. Paxos for System Builders: an overview , 2008, LADIS '08.
[2] Naohiro Hayashibara,et al. The φ Accrual Failure Detector , 2004 .
[3] Liu Feng,et al. Design and Implementation of an Event Tracing Mechanism for the Kernel-based Virtual Machine , 2008 .
[4] Marcos K. Aguilera,et al. No Time for Asynchrony , 2009, HotOS.
[5] Leslie Lamport,et al. The part-time parliament , 1998, TOCS.
[6] George Candea,et al. Microreboot - A Technique for Cheap Recovery , 2004, OSDI.
[7] Wilson C. Hsieh,et al. Bigtable: A Distributed Storage System for Structured Data , 2006, TOCS.
[8] Robert Tappan Morris,et al. Flexible, Wide-Area Storage for Distributed Systems with WheelFS , 2009, NSDI.
[9] Muli Ben-Yehuda,et al. The Turtles Project: Design and Implementation of Nested Virtualization , 2010, OSDI.
[10] Butler W. Lampson,et al. The ABCD's of Paxos , 2001, PODC '01.
[11] Robbert van Renesse,et al. Chain Replication for Supporting High Throughput and Availability , 2004, OSDI.
[12] Dutch T. Meyer,et al. Remus: High Availability via Asynchronous Virtual Machine Replication. (Best Paper) , 2008, NSDI.
[13] Leslie Lamport,et al. Paxos Made Simple , 2001 .
[14] David Mazières. Paxos Made Practical , 2007 .
[15] Nancy A. Lynch,et al. Consensus in the presence of partial synchrony , 1988, JACM.
[16] Mike Burrows. The Chubby lock service for loosely-coupled distributed systems , 2006, OSDI '06.
[17] Keith Marzullo,et al. Mencius: Building Efficient Replicated State Machine for WANs , 2008, OSDI.
[18] Chandramohan A. Thekkath,et al. Petal: distributed virtual disks , 1996, ASPLOS VII.
[19] Paulo Veríssimo. Uncertainty and predictability: can they be reconciled? , 2003 .
[20] Sanjay Ghemawat,et al. MapReduce: simplified data processing on large clusters , 2008, CACM.
[21] Christof Fetzer,et al. Perfect Failure Detection in Timed Asynchronous Systems , 2003, IEEE Trans. Computers.
[22] Antonio Casimiro,et al. The timely computing base: Timely actions in the presence of uncertain timeliness , 2000, Proceeding International Conference on Dependable Systems and Networks. DSN 2000.
[23] J. D. Day,et al. A principle for resilient sharing of distributed resources , 1976, ICSE '76.
[24] Marcos K. Aguilera,et al. On the quality of service of failure detectors , 2000, Proceeding International Conference on Dependable Systems and Networks. DSN 2000.
[25] Jerome H. Saltzer,et al. End-to-end arguments in system design , 1984, TOCS.
[26] Robert Griesemer,et al. Paxos made live: an engineering perspective , 2007, PODC '07.
[27] Lorenzo Alvisi,et al. The Paxos Register , 2007, 2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007).
[28] Mahadev Konar,et al. ZooKeeper: Wait-free Coordination for Internet-scale Systems , 2010, USENIX ATC.
[29] Antonio Casimiro,et al. The Timely Computing Base Model and Architecture , 2002, IEEE Trans. Computers.
[30] Sanjay Ghemawat,et al. The Google file system , 2003 .
[31] Sam Toueg,et al. Unreliable failure detectors for reliable distributed systems , 1996, JACM.
[32] Nancy A. Lynch,et al. Impossibility of distributed consensus with one faulty process , 1983, PODS '83.
[33] Robbert van Renesse,et al. A Gossip-Style Failure Detection Service , 2009 .
[34] Arun Venkataramani,et al. Consensus Routing: The Internet as a Distributed System. (Best Paper) , 2008, NSDI.
[35] Werner Vogels,et al. Dynamo: amazon's highly available key-value store , 2007, SOSP.
[36] Kenneth P. Birman,et al. Exploiting virtual synchrony in distributed systems , 1987, SOSP '87.
[37] Marcos K. Aguilera,et al. On the Impact of Fast Failure Detectors on Real-Time Fault-Tolerant Systems , 2002, DISC.
[38] Peng Li,et al. Paxos Replicated State Machines as the Basis of a High-Performance Data Store , 2011, NSDI.
[39] Mikel Larrea,et al. On the impossibility of implementing perpetual failure detectors in partially synchronous systems , 2002, Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing.
[40] Marc Najork,et al. Boxwood: Abstractions as the Foundation for Storage Infrastructure , 2004, OSDI.
[41] Paulo Veríssimo,et al. Uncertainty and Predictability: Can They Be Reconciled? , 2003, Future Directions in Distributed Computing.
[42] Nancy A. Lynch,et al. Revisiting the PAXOS algorithm , 1997, Theor. Comput. Sci..
[43] Yuan Yu,et al. Dryad: distributed data-parallel programs from sequential building blocks , 2007, EuroSys '07.
[44] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[45] Pierre Sens,et al. Implementation and performance evaluation of an adaptable failure detector , 2002, Proceedings International Conference on Dependable Systems and Networks.
[46] George Candea,et al. Improving availability with recursive microreboots: a soft-state system case study , 2004, Perform. Evaluation.