Location-Aware Top-k Term Publish/Subscribe

Massive amounts of data containing spatial, textual, and temporal information are being generated at a high rate. These spatio-temporal documents cover a wide range of topics in local areas. Users are interested in receiving locally popular terms from spatio-temporal documents published within a specified region. We consider the Top-k Spatial-Temporal Term (ST2) Subscription. Given an ST2 subscription, we continuously maintain the up-to-date top-k most popular terms over a stream of spatio-temporal documents. The ST2 subscription takes into account both the frequency and the recency of a term in the spatio-temporal document stream when evaluating its popularity. We propose an efficient solution for processing a large number of ST2 subscriptions over a stream of spatio-temporal documents. We study the performance of processing ST2 subscriptions in extensive experiments on two real spatio-temporal datasets.
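
To make the notion of popularity concrete, the following is a minimal sketch of how a single ST2 subscription could be maintained naively. It is not the solution proposed in the paper: the exponential-decay scoring, the rectangular region, and all names (`Subscription`, `observe`, `top_k`, `decay`) are assumptions introduced here for illustration only. Each term occurrence contributes a weight of 1 that decays with age, so a term's score reflects both how often and how recently it appeared inside the region.

```python
import heapq
import math
from dataclasses import dataclass, field


@dataclass
class Subscription:
    """Hypothetical ST2 subscription: a rectangular region and a value k."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float
    k: int
    decay: float = 0.1  # assumed decay rate per time unit (not from the paper)
    # term -> (score as of last update, time of last update)
    scores: dict = field(default_factory=dict)

    def covers(self, x: float, y: float) -> bool:
        return self.min_x <= x <= self.max_x and self.min_y <= y <= self.max_y

    def _decayed(self, score: float, last_t: float, now: float) -> float:
        # Age the stored score from last_t to now.
        return score * math.exp(-self.decay * (now - last_t))

    def observe(self, terms, x: float, y: float, t: float) -> None:
        """Process one document; timestamps are assumed to be non-decreasing."""
        if not self.covers(x, y):
            return  # documents outside the region do not affect this subscription
        for term in terms:
            score, last_t = self.scores.get(term, (0.0, t))
            # Decay the old score to time t, then add 1 for the new occurrence.
            self.scores[term] = (self._decayed(score, last_t, t) + 1.0, t)

    def top_k(self, now: float):
        """Return up to k (term, score) pairs, highest score first."""
        current = [
            (term, self._decayed(score, last_t, now))
            for term, (score, last_t) in self.scores.items()
        ]
        # Ties are broken alphabetically so the result is deterministic.
        return heapq.nsmallest(self.k, current, key=lambda p: (-p[1], p[0]))


if __name__ == "__main__":
    sub = Subscription(0, 0, 10, 10, k=2, decay=math.log(2))  # half-life of 1 time unit
    sub.observe(["coffee", "rain"], x=1, y=1, t=0.0)
    sub.observe(["rain"], x=2, y=2, t=1.0)
    sub.observe(["parade"], x=50, y=50, t=1.0)  # outside the region, ignored
    sub.observe(["coffee"], x=3, y=3, t=2.0)
    print(sub.top_k(now=2.0))
```

This sketch rescans every stored term on each `top_k` call and handles one subscription at a time; serving a large number of subscriptions efficiently, which is the focus of the paper, would require shared indexing that is not shown here.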
