Roda: A Flexible Framework for Real-Time On-demand Data Aggregation

Aggregating data from multiple sources is critical to real-time decision making in several fields, such as telecommunications fraud detection. However, because data sources are distributed, heterogeneous, and autonomous, it is challenging to make data aggregation real-time, on-demand, and flexible at the same time. In this paper, we propose Roda, a real-time on-demand data aggregation framework designed to be flexible enough to support the dynamic joining of new data sources, the immediate updating of aggregation rules, and quick adaptation to changes in data velocity. We implement a prototype of Roda on Kafka and Docker using the overlay network technique. To evaluate the effectiveness and performance of Roda, we conduct a series of experiments on real trace data. The experimental results show that Roda can guarantee data aggregation latency at the millisecond scale, meeting our design goals.
