Scalable Error Isolation for Distributed Systems
Christof Fetzer | Sergei Arnautov | Flavio Paiva Junqueira | Marco Serafini | Diogo Behrens
[1] Fan Yang, et al. Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing, 2014, Proc. VLDB Endow.
[2] Song Jiang, et al. Workload analysis of a large-scale key-value store, 2012, SIGMETRICS '12.
[3] Amin Ansari, et al. Shoestring: probabilistic soft error reliability on the cheap, 2010, ASPLOS 2010.
[4] Eduardo Pinheiro, et al. DRAM errors in the wild: a large-scale field study, 2009, SIGMETRICS '09.
[5] Shekhar Y. Borkar, et al. Designing reliable systems from unreliable components: the challenges of transistor variability and degradation, 2005, IEEE Micro.
[6] Miguel Castro, et al. Practical Byzantine fault tolerance and proactive recovery, 2002, TOCS.
[7] Leslie Lamport, et al. The part-time parliament, 1998, TOCS.
[8] Lorenzo Alvisi, et al. Modeling the effect of technology trends on the soft error rate of combinational logic, 2002, Proceedings International Conference on Dependable Systems and Networks.
[9] Miguel Correia, et al. Practical Hardening of Crash-Tolerant Systems, 2012, USENIX Annual Technical Conference.
[10] Priya Narasimhan, et al. Thema: Byzantine-fault-tolerant middleware for Web-service applications, 2005, 24th IEEE Symposium on Reliable Distributed Systems (SRDS'05).
[11] Bianca Schroeder, et al. Cosmic rays don't strike twice: understanding the nature of DRAM errors and the implications for system design, 2012, ASPLOS XVII.
[12] Brett D. Fleisch, et al. The Chubby lock service for loosely-coupled distributed systems, 2006, OSDI '06.
[13] Mark Bickford, et al. Nysiad: Practical Protocol Transformation to Tolerate Byzantine Failures, 2008, NSDI.
[14] Tony Tung, et al. Scaling Memcache at Facebook, 2013, NSDI.
[15] Edward J. McCluskey, et al. Error detection by duplicated instructions in super-scalar processors, 2002, IEEE Trans. Reliab.
[16] Robert Griesemer, et al. Paxos made live: an engineering perspective, 2007, PODC '07.
[17] Christof Fetzer, et al. HardPaxos: Replication Hardened against Hardware Errors, 2014, 2014 IEEE 33rd International Symposium on Reliable Distributed Systems.
[18] Leslie Lamport, et al. The Byzantine Generals Problem, 1982, TOPLAS.
[19] Yang Wang, et al. All about Eve: Execute-Verify Replication for Multi-Core Servers, 2012, OSDI.
[20] Christopher Frost, et al. Spanner: Google's Globally-Distributed Database, 2012, OSDI.
[21] Shekhar Y. Borkar, et al. Microarchitecture and Design Challenges for Gigascale Integration, 2004, MICRO.
[22] John R. Douceur, et al. Cycles, cells and platters: an empirical analysis of hardware failures on a million consumer PCs, 2011, EuroSys '11.
[23] Cristian Constantinescu, et al. Trends and Challenges in VLSI Circuit Reliability, 2003, IEEE Micro.
[24] Ramakrishna Kotla, et al. Zyzzyva: speculative Byzantine fault tolerance, 2007, TOCS.
[25] Kunle Olukotun, et al. The Future of Microprocessors, 2005, ACM Queue.
[26] Amin Ansari, et al. Shoestring: probabilistic soft error reliability on the cheap, 2010, ASPLOS XV.
[27] Pramod Bhatotia, et al. Reliable data-center scale computations, 2010, LADIS '10.
[28] Mahadev Konar, et al. ZooKeeper: Wait-free Coordination for Internet-scale Systems, 2010, USENIX ATC.
[29] David Walker, et al. Fault-tolerant typed assembly language, 2007, PLDI '07.
[30] Wouter Joosen, et al. Bitsquatting: exploiting bit-flips for fun, or profit?, 2013, WWW.
[31] Yawei Li, et al. Megastore: Providing Scalable, Highly Available Storage for Interactive Services, 2011, CIDR.
[32] Hui Ding, et al. TAO: how Facebook serves the social graph, 2012, SIGMOD Conference.
[33] Miguel Castro, et al. Using abstraction to improve fault tolerance, 2001, Proceedings Eighth Workshop on Hot Topics in Operating Systems.
[34] Miguel Castro, et al. BASE: using abstraction to improve fault tolerance, 2001, SOSP.
[35] David I. August, et al. SWIFT: software implemented fault tolerance, 2005, International Symposium on Code Generation and Optimization.
[36] Christof Fetzer, et al. Automatically Tolerating Arbitrary Faults in Non-malicious Settings, 2013, 2013 Sixth Latin-American Symposium on Dependable Computing.
[37] Ramakrishna Kotla, et al. Zyzzyva, 2007, SOSP.
[38] Ravishankar K. Iyer, et al. Group communication protocols under errors, 2003, 22nd International Symposium on Reliable Distributed Systems, 2003. Proceedings.
[39] Marc Hamilton, et al. Software Development: Building Reliable Systems, 1999.
[40] Christof Fetzer, et al. Towards transparent hardening of distributed systems, 2013, HotDep.
[41] Lisa Spainhower, et al. Commercial fault tolerance: a tale of two systems, 2004, IEEE Transactions on Dependable and Secure Computing.
[42] Harish Patil, et al. Pin: building customized program analysis tools with dynamic instrumentation, 2005, PLDI '05.