Formalization and Analysis of Haystack Architecture from Process Algebra Perspective

Haystack is a storage system architecture optimized for Facebook’s photo application. Compared with the previous design, it offers four main advantages: high throughput and low latency, fault tolerance, cost-effectiveness, and simplicity. Given its widespread use, its validity and the other major properties abstracted from the architecture need to be analyzed in a formal framework. However, to the best of our knowledge, almost no research has been conducted to describe the communications and properties of Haystack. In this paper, we focus on the internal design of serving and uploading a photo in the Haystack architecture and apply Communicating Sequential Processes (CSP) to formalize both in detail. By feeding the models into the model checker Process Analysis Toolkit (PAT), we verify several crucial properties: one basic property and five supplementary properties. The basic property is deadlock freedom. The supplementary properties are synchronous concurrent access, asynchronous concurrent access, synchronous concurrent access with the same client, synchronous concurrent upload, and synchronous concurrent upload with the same client. According to the verification results, we conclude that, from the CSP perspective, the properties of the Haystack architecture are valid, which means that it meets the requirements stated in Facebook’s documents.
