Adaptive Filtering of Visual Content in Distributed Publish/Subscribe Systems

Classic event matching techniques in large-scale content-based publish/subscribe systems mostly rely on predicate indexing or tree-based mechanisms for fast subscription evaluation. In visual analytics, such techniques offer limited support for subscriptions that require expensive filtering operators over unstructured event types such as images and videos. In this work, user subscriptions over visual content are answered as conjunctions of commutative Boolean filters, where each filter is associated with a single high-level semantic concept that may be shared across multiple subscriptions. The shared-filter ordering problem has previously been studied in centralized data stream management systems, where prior work proposes approximation algorithms that achieve near-optimal cost reductions when evaluating overlapping queries. In a distributed publish/subscribe setting, however, even an optimal ordering of filter evaluations at heavily loaded brokers can create bottlenecks and waste downstream resources. We present a distributed greedy algorithm that leverages existing routing methodologies to order filters and distribute their execution across brokers on the various dissemination paths. Experiments with several pub/sub workloads show a 50% to 70% decrease in event latency and noticeable improvements in resource utilization across the overlay.
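To make the subscription model concrete, the following is a minimal single-broker sketch of evaluating conjunctive subscriptions over shared Boolean filters. It is not the distributed algorithm presented in this work, nor the near-optimal algorithms from prior work: the greedy rule (lowest cost per pending subscription that uses the filter) and all names (`evaluate_shared`, `filters`, `costs`, `subscriptions`) are illustrative assumptions. Events are modeled as plain dictionaries of precomputed concept labels, standing in for expensive classifiers over images or video.

```python
from typing import Callable, Dict, FrozenSet, Set, Tuple

Event = Dict[str, bool]


def evaluate_shared(
    event: Event,
    filters: Dict[str, Callable[[Event], bool]],
    costs: Dict[str, float],
    subscriptions: Dict[str, FrozenSet[str]],
) -> Tuple[Set[str], float]:
    """Match one event against conjunctive subscriptions.

    Each filter runs at most once per event and its result is shared by
    every subscription that references it. Returns the matched
    subscription ids and the total filter cost paid.
    """
    results: Dict[str, bool] = {}   # filter name -> outcome for this event
    alive = set(subscriptions)      # subscriptions not yet falsified
    matched: Set[str] = set()
    total_cost = 0.0

    while True:
        # Filters each live subscription still needs.
        pending = {
            sid: {f for f in subscriptions[sid] if f not in results}
            for sid in alive
        }
        # A live subscription with nothing left to check has matched.
        for sid, rest in list(pending.items()):
            if not rest:
                matched.add(sid)
                alive.discard(sid)
                del pending[sid]
        if not pending:
            break

        # Assumed greedy rule: cheapest filter per pending subscription
        # that uses it (ties broken by name for determinism).
        candidates = set().union(*pending.values())
        best = min(
            candidates,
            key=lambda f: (costs[f] / sum(f in r for r in pending.values()), f),
        )
        results[best] = filters[best](event)
        total_cost += costs[best]

        # Conjunctions short-circuit: a false filter kills every
        # subscription that contains it.
        if not results[best]:
            alive -= {sid for sid, rest in pending.items() if best in rest}

    return matched, total_cost


# Hypothetical concepts, costs, and subscriptions for illustration only.
FILTERS = {c: (lambda e, c=c: e[c]) for c in ("person", "car", "dog")}
COSTS = {"person": 1.0, "car": 5.0, "dog": 2.0}
SUBSCRIPTIONS = {
    "s1": frozenset({"person", "car"}),
    "s2": frozenset({"person", "dog"}),
    "s3": frozenset({"car"}),
}
```

In this sketch the shared `person` filter is evaluated first because its cost is amortized over two subscriptions; when it returns false, both `s1` and `s2` are discarded and the `dog` filter is never run. In a distributed overlay, where a filter runs matters as well as when, which is the aspect this sketch leaves out.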
