FogStore: A Geo-Distributed Key-Value Store Guaranteeing Low Latency for Strongly Consistent Access

We design FogStore, a key-value store for event-based systems that exploits the concept of relevance to guarantee low-latency access to relevant data with strong consistency, while tolerating geographically correlated failures. Distributed event-based processing pipelines are envisioned to use the resources of densely geo-distributed infrastructures to deliver low-latency responses, enabling real-time applications. The increasing complexity of such applications results in greater dependence on state, which has driven the incorporation of state management as a core functionality of contemporary stream processing engines such as Apache Flink and Samza. Processing components executing under the same context (such as location) often produce information that may be relevant to others, which necessitates shared state and an out-of-band, globally accessible data store. Efficient access to application state is critical for overall performance, so centralized data stores are not a viable option because of the high latency of network traversals. On the other hand, a highly geo-distributed, low-latency data store built on current key-value stores would have to weaken the consistency that clients can expect, as per the PACELC theorem. In this paper we exploit the notion of contextual relevance of events (data) in situation-awareness applications and offer differential consistency guarantees to clients based on their context. We highlight important systems concerns that may arise in a highly geo-distributed system and show how FogStore's design tackles them. We present, in detail, a prototype implementation of FogStore's mechanisms on Apache Cassandra, together with a performance evaluation. Our evaluations show that FogStore achieves the throughput of eventually consistent configurations while serving data with strong consistency to contextually relevant clients.
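The idea of differential consistency based on context can be illustrated with a minimal sketch. It assumes that context is geographic location and that a client counts as contextually relevant when it lies within a fixed radius of the place a data item refers to. The function names, the radius value, and the mapping to the Cassandra-style level names `QUORUM` and `ONE` are hypothetical choices made for illustration; they are not taken from the FogStore prototype.

```python
import math

# Hypothetical threshold: clients within this distance of the data item's
# location are treated as contextually relevant. Not a value from the paper.
RELEVANCE_RADIUS_KM = 5.0

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def choose_consistency(client_loc, data_loc, radius_km=RELEVANCE_RADIUS_KM):
    """Pick a consistency level name for one request.

    Assumption: "QUORUM" stands in for a strongly consistent level and
    "ONE" for an eventually consistent one, mirroring the names of
    Cassandra's tunable consistency levels.
    """
    if haversine_km(client_loc, data_loc) <= radius_km:
        return "QUORUM"  # relevant client: pay coordination cost nearby
    return "ONE"         # remote client: favor latency and throughput


if __name__ == "__main__":
    sensor = (33.7756, -84.3963)   # location the data item refers to
    nearby = (33.7490, -84.3880)   # client a few kilometres away
    remote = (40.7128, -74.0060)   # client in a different region
    print(choose_consistency(nearby, sensor))
    print(choose_consistency(remote, sensor))
```

In this sketch the nearby client is served at the stronger level and the remote client at the weaker one. A real deployment would also need replica placement that keeps a quorum close to the relevant clients, which the sketch does not model.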
