Scaling Up IoT Stream Processing

Users create large numbers of IoT stream queries over data streams generated by various IoT devices. Current stream processing systems such as Storm and Flink cannot support such large numbers of IoT stream queries efficiently, because their execution models cause a flurry of cache misses while processing the queries' events. To solve this problem, we present a new group-aware execution model, which processes the events of IoT stream queries in a way that exploits the locality of data and code references, reducing cache misses and improving system performance. The group-aware execution model leverages two facts: users create groups of queries according to their interests or location contexts, and queries in the same group can share the same data and code. We realize the group-aware execution model on MIST, a new stream processing system tailored for processing many IoT stream queries efficiently, to scale up the number of IoT queries that can be processed on a single machine. Our preliminary evaluation shows that the group-aware execution model increases the number of queries that can be processed on a single machine by up to 3.18x compared to the Flink-based execution model.
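
The core scheduling idea can be illustrated with a minimal sketch. This is not MIST's implementation; the class `GroupAwareScheduler` and its methods are hypothetical names introduced only for illustration. The sketch assumes each query belongs to exactly one group, queues incoming events per group, and drains one group's queue completely before moving to the next, so that queries sharing data and code run back to back. Python cannot demonstrate the cache effects themselves; only the processing order is shown.

```python
from collections import defaultdict, deque


class GroupAwareScheduler:
    """Toy scheduler (hypothetical): events are queued per query group
    and drained one group at a time."""

    def __init__(self):
        self.query_group = {}              # query id -> group id
        self.operators = {}                # query id -> callable(event)
        self.pending = defaultdict(deque)  # group id -> queued (query id, event)

    def register(self, query_id, group_id, operator):
        """Attach a query to a group; the operator stands in for the query's code."""
        self.query_group[query_id] = group_id
        self.operators[query_id] = operator

    def submit(self, query_id, event):
        """Queue an event under the group of the query it targets."""
        group_id = self.query_group[query_id]
        self.pending[group_id].append((query_id, event))

    def run_once(self):
        """Process all queued events group by group; return the processing order."""
        order = []
        for group_id in list(self.pending):
            queue = self.pending.pop(group_id)
            while queue:
                query_id, event = queue.popleft()
                self.operators[query_id](event)
                order.append((group_id, query_id))
        return order
```

For example, if queries `q1` and `q3` are registered in one group and `q2` in another, events submitted in the order `q1`, `q2`, `q3` are processed as `q1`, `q3`, `q2`: the two queries of the first group run consecutively instead of being interleaved with the other group.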
