Modeling and Verifying Google File System

Google File System (GFS) is a distributed file system developed by Google for massive data-intensive applications. It delivers high aggregate performance to many clients while running on inexpensive commodity hardware, which has allowed GFS to meet massive storage needs and to be widely used in industry. In this paper, we first present a formal model of Google File System in terms of Communicating Sequential Processes (CSP#), which precisely describes the underlying read/write behaviors of GFS. On that basis, both the relaxed consistency and the eventual consistency guaranteed by GFS can be revealed in our framework. Furthermore, the proposed CSP# model is encoded in the Process Analysis Toolkit (PAT), so that several properties, such as starvation freedom and deadlock freedom, can be automatically checked and verified in the framework of formal methods.
