An empirical study on crash recovery bugs in large-scale distributed systems
Dong Wang | Jun Wei | Li Zhou | Feng Qin | Wensheng Dou | Chushu Gao | Yu Gao | Yongming Wu | Ruirui Huang
[1] Carlo Curino,et al. Apache Hadoop YARN: yet another resource negotiator , 2013, SoCC.
[2] Xuezheng Liu,et al. D3S: Debugging Deployed Distributed Systems , 2008, NSDI.
[3] Andrea C. Arpaci-Dusseau,et al. All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications , 2014, OSDI.
[4] Adam Chlipala,et al. Using Crash Hoare logic for certifying the FSCQ file system , 2015, USENIX Annual Technical Conference.
[5] Mike Burrows. The Chubby lock service for loosely-coupled distributed systems , 2006, OSDI '06.
[6] Sanjay Ghemawat,et al. The Google file system , 2003 .
[7] Julian Stanley. The two biggest NQT challenges , 2017 .
[8] Junfeng Yang,et al. Using model checking to find serious file system errors , 2004, TOCS.
[9] Feng Li,et al. CloudRaid: hunting concurrency bugs in the cloud via log-mining , 2018, ESEC/SIGSOFT FSE.
[10] Wei Xu,et al. What Can We Learn from Four Years of Data Center Hardware Failures? , 2017, 2017 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN).
[11] Shan Lu,et al. Understanding Real-World Timeout Problems in Cloud Server Systems , 2018, 2018 IEEE International Conference on Cloud Engineering (IC2E).
[12] Yu Luo,et al. Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems , 2014, OSDI.
[13] Haoxiang Lin,et al. MODIST: Transparent Model Checking of Unmodified Distributed Systems , 2009, NSDI.
[14] Srinath T. V. Setty,et al. IronFleet: proving practical distributed systems correct , 2015, SOSP.
[15] Adam Chlipala,et al. Chapar: certified causally consistent distributed key-value stores , 2016, POPL.
[16] Shan Lu,et al. FCatch: Automatically Detecting Time-of-fault Bugs in Cloud Systems , 2018, ASPLOS.
[17] Xi Wang,et al. Verdi: a framework for implementing and formally verifying distributed systems , 2015, PLDI.
[18] Mahadev Konar,et al. ZooKeeper: Wait-free Coordination for Internet-scale Systems , 2010, USENIX ATC.
[19] George C. Necula,et al. Minimizing Faulty Executions of Distributed Systems , 2016, NSDI.
[20] Wilson C. Hsieh,et al. Bigtable: A Distributed Storage System for Structured Data , 2006, TOCS.
[21] Andrea C. Arpaci-Dusseau,et al. SQCK: A Declarative File System Checker , 2008, OSDI.
[22] Patrice Godefroid,et al. Dynamic partial-order reduction for model checking software , 2005, POPL '05.
[23] Yingwei Luo,et al. Failure Recovery: When the Cure Is Worse Than the Disease , 2013, HotOS.
[24] Junfeng Yang,et al. Reducing crash recoverability to reachability , 2016, POPL.
[25] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[26] Randy H. Katz,et al. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center , 2011, NSDI.
[27] Andreas Zeller,et al. Simplifying and Isolating Failure-Inducing Input , 2002, IEEE Trans. Software Eng..
[28] Andrea C. Arpaci-Dusseau,et al. Redundancy Does Not Imply Fault Tolerance: Analysis of Distributed Storage Reactions to Single Errors and Corruptions , 2017, FAST.
[29] Koushik Sen,et al. PREFAIL: a programmable tool for multiple-failure injection , 2011, OOPSLA '11.
[30] Pallavi Joshi,et al. SAMC: Semantic-Aware Model Checking for Fast Discovery of Deep Bugs in Cloud Systems , 2014, OSDI.
[31] Adam Silberstein,et al. Benchmarking cloud serving systems with YCSB , 2010, SoCC '10.
[32] Naohiro Hayashibara,et al. The φ Accrual Failure Detector , 2004 .
[33] Pallavi Joshi,et al. SETSUDŌ: perturbation-based testing framework for scalable distributed systems , 2013, TRIOS@SOSP.
[34] Joseph M. Hellerstein,et al. Lineage-driven Fault Injection , 2015, SIGMOD Conference.
[35] Werner Vogels,et al. Dynamo: amazon's highly available key-value store , 2007, SOSP.
[36] Shan Lu,et al. TaxDC: A Taxonomy of Non-Deterministic Concurrency Bugs in Datacenter Distributed Systems , 2016, ASPLOS.
[37] Mark Lillibridge,et al. Torturing Databases for Fun and Profit , 2014, OSDI.
[38] Junfeng Yang,et al. Practical software model checking via dynamic interface reduction , 2011, SOSP.
[39] Tanakorn Leesatapornwongsa,et al. What Bugs Live in the Cloud? A Study of 3000+ Issues in Cloud Systems , 2014, SoCC.
[40] Michael I. Jordan,et al. Detecting large-scale system problems by mining console logs , 2009, SOSP '09.
[41] Xi Wang,et al. An Empirical Study on the Correctness of Formally Verified Distributed Systems , 2017, EuroSys.
[42] Andrea C. Arpaci-Dusseau,et al. Correlated Crash Vulnerabilities , 2016, OSDI.
[43] Koushik Sen,et al. Automated Systematic Testing of Open Distributed Programs , 2006, FASE.
[44] Shan Lu,et al. DCatch: Automatically Detecting Distributed Concurrency Bugs in Cloud Systems , 2017, ASPLOS.
[45] Andrea C. Arpaci-Dusseau,et al. FATE and DESTINI: A Framework for Cloud Recovery Testing , 2011, NSDI.