TorqueDB: Distributed Querying of Time-Series Data from Edge-local Storage

The rapid growth in edge computing devices as part of the Internet of Things (IoT) allows real-time access to time-series data from thousands of sensors. Such observations are often queried to optimize the health of the infrastructure. Recently, edge storage systems have made it possible to retain data on the edge rather than moving it centrally to the cloud. However, such systems do not support flexible querying over data spread across tens to hundreds of devices. There is also a lack of distributed time-series databases that can run on edge devices. Here, we propose TorqueDB, a distributed query engine over time-series data that operates on edge and fog resources. TorqueDB leverages our prior work on ElfStore, a distributed edge-local file store, and InfluxDB, a time-series database, to enable temporal queries to be decomposed and executed across multiple fog and edge devices. Interestingly, we move data into InfluxDB on demand while retaining the durable data within ElfStore for use by other applications. We also design a cost model that maximizes parallel movement and execution of the queries across resources, and utilizes caching. Our experiments on a real edge, fog and cloud deployment show that TorqueDB performs comparably to InfluxDB on a cloud VM for a smart city query workload, but without the associated monetary costs.
