Fingerpointing correlated failures in replicated systems
[1] Miguel Oom Temudo de Castro, et al. Practical Byzantine fault tolerance, 1999, OSDI '99.
[2] Marcos K. Aguilera, et al. Performance debugging for distributed systems of black boxes, 2003, SOSP '03.
[3] Armando Fox, et al. Detecting application-level failures in component-based Internet services, 2005, IEEE Transactions on Neural Networks.
[4] Fred B. Schneider, et al. Implementing fault-tolerant services using the state machine approach: a tutorial, 1990, CSUR.
[5] Armando Fox, et al. Capturing, indexing, clustering, and retrieving system history, 2005, SOSP '05.
[6] Frank Feather, et al. A case study of Ethernet anomalies in a distributed computing environment, 1990.
[7] Group Communication: Helping or Obscuring Failure Diagnosis?, 2006.
[8] Yair Amir, et al. A low latency, loss tolerant architecture and protocol for wide area group communication, 2000, Proceedings of the International Conference on Dependable Systems and Networks (DSN 2000).
[9] Idit Keidar, et al. Group communication specifications: a comprehensive study, 2001, CSUR.
[10] Mike Hibler, et al. An integrated experimental environment for distributed systems and networks, 2002, OSDI '02.
[11] Isabelle Guyon, et al. A Stability Based Method for Discovering Structure in Clustered Data, 2001, Pacific Symposium on Biocomputing.
[12] Helen J. Wang, et al. Automatic Misconfiguration Troubleshooting with PeerPressure, 2004, OSDI.
[13] Amin Vahdat, et al. Pip: Detecting the Unexpected in Distributed Systems, 2006, NSDI.