Approximate Range Emptiness in Constant Time for IoT Data Streams over Sliding Windows

Supporting real-time queries over massive IoT data streams is increasingly important because it can significantly improve the performance of real-time network services. Let $\delta=e_1,e_2,\cdots,e_t,\cdots$ denote an IoT data stream, where each element $e_t$ arrives at time point $t$. In this paper, we consider how to support fast range emptiness queries over an IoT data stream $\delta$ in the sliding window model with a space-efficient data structure, and we call this the \textbf{$(\varepsilon,L)$-$\text{ARE}$-problem}. More formally, subject to the constraint of a one-pass scan of stream $\delta$, the main task of the \textbf{$(\varepsilon,L)$-$\text{ARE}$-problem} is to design a space-efficient data structure that always represents $W(t,n)$, the $n$ latest elements of stream $\delta$ up to time point $t$ (i.e., $W(t,n)=e_{\max\{1,t-n+1\}},\cdots,e_{t-1},e_t$), and quickly answers an emptiness query of the form "$W(t,n)\cap I=\emptyset?$", with a false positive rate no larger than $\varepsilon$, for any query interval $I$ of length up to $L$. We design a space-efficient data structure \textbf{D} to solve the \textbf{$(\varepsilon,L)$-$\text{ARE}$-problem} and prove that \textbf{D} takes constant time to query an interval, insert a stream element, and evict outdated elements. Extensive simulation results also demonstrate its efficiency.
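
To make the query semantics concrete, the following is a minimal Python sketch of a naive baseline for sliding-window range emptiness. It is not the data structure \textbf{D} from the paper and makes no claim about its space bound or its guarantee on $\varepsilon$. It assumes integer stream elements and introduces a hypothetical tuning parameter `bucket_width`; the class and method names are illustrative only. The sketch shows only the one-sided error the problem allows: an answer of "empty" is always correct, while an answer of "non-empty" may be a false positive.

```python
from collections import Counter, deque


class BucketedWindow:
    """Naive sliding-window range-emptiness baseline (NOT the paper's D)."""

    def __init__(self, n, bucket_width):
        self.n = n                 # window size: keep the n latest elements
        self.w = bucket_width      # hypothetical parameter, not from the paper
        self.window = deque()      # bucket ids of window elements, oldest first
        self.counts = Counter()    # bucket id -> number of window elements in it

    def insert(self, e):
        """Insert integer element e and evict the outdated element, if any."""
        b = e // self.w
        self.window.append(b)
        self.counts[b] += 1
        if len(self.window) > self.n:
            old = self.window.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def maybe_nonempty(self, lo, hi):
        """Return False only if the window certainly has no element in [lo, hi].

        True may be a false positive: some window element shares a bucket
        with the interval without lying inside it.
        """
        return any(b in self.counts
                   for b in range(lo // self.w, hi // self.w + 1))


s = BucketedWindow(n=3, bucket_width=10)
for e in [5, 42, 77, 120]:
    s.insert(e)                    # window is now 42, 77, 120 (5 was evicted)

print(s.maybe_nonempty(0, 9))      # 5 has been evicted, so the interval is empty
print(s.maybe_nonempty(75, 80))    # 77 lies in the interval
print(s.maybe_nonempty(40, 41))    # false positive: 42 shares bucket 4
```

When `bucket_width` is comparable to the maximum interval length, a query touches only a constant number of buckets, so insertion, eviction, and querying each take expected constant time. The baseline still stores one bucket id per window element and offers no tunable false positive rate, which is the gap a dedicated structure for this problem has to close.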
