TourMiner: Effective and Efficient Clustering Big Mobile Social Data for Supporting Advanced Analytics Tools

Big data analytics over big mobile social data has attracted considerable attention. Such data are generated by modern social systems such as Twitter, Facebook, and Instagram. Mining them is of great interest because their analysis is critical for a wide spectrum of big data applications (e.g., smart cities). Among the approaches proposed, clustering is a well-known solution for extracting interesting and actionable knowledge from massive amounts of mobile (geo-located) social data. Following this line, this paper proposes TourMiner, an effective and efficient similarity-matrix-based algorithm for clustering big mobile social data. TourMiner specifically targets the clustering of trips extracted from tweets, in order to mine the most popular tours. Its main characteristic is that clustering is applied over a well-suited similarity matrix computed on top of the trips.
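The general idea of clustering over a trip similarity matrix can be illustrated with a minimal sketch. The abstract does not state which similarity measure or clustering procedure TourMiner uses, so everything below is an assumption made for illustration: trips are modeled as lists of visited place names, similarity is the Jaccard index over those places, and clusters are the connected components of the graph linking trips whose similarity reaches a hypothetical `threshold`. The place names and the functions `jaccard`, `similarity_matrix`, and `cluster` are likewise hypothetical.

```python
from itertools import combinations


def jaccard(trip_a, trip_b):
    """Jaccard index between the sets of places visited by two trips (assumed measure)."""
    set_a, set_b = set(trip_a), set(trip_b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def similarity_matrix(trips):
    """Build a symmetric n x n matrix of pairwise trip similarities."""
    n = len(trips)
    sim = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        sim[i][j] = sim[j][i] = jaccard(trips[i], trips[j])
    return sim


def cluster(sim, threshold):
    """Group trip indices into connected components of the graph
    that links two trips when their similarity is >= threshold."""
    n = len(sim)
    label = [-1] * n
    clusters = []
    for start in range(n):
        if label[start] != -1:
            continue
        cluster_id = len(clusters)
        label[start] = cluster_id
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(n):
                if label[j] == -1 and sim[i][j] >= threshold:
                    label[j] = cluster_id
                    stack.append(j)
        clusters.append(sorted(members))
    return clusters


# Hypothetical trips, e.g. as they might be extracted from geo-located tweets.
trips = [
    ["museum", "cathedral", "castle"],
    ["cathedral", "castle", "museum", "park"],
    ["stadium", "harbor"],
    ["harbor", "stadium", "market"],
    ["zoo"],
]

sim = similarity_matrix(trips)
clusters = cluster(sim, threshold=0.5)

# Under these assumptions, a "popular tour" corresponds to a large cluster of similar trips.
most_popular = max(clusters, key=len)
```

Computing the matrix once and clustering on it separates the choice of similarity measure from the choice of clustering procedure, so either can be swapped without touching the other. Note that this set-based measure ignores the order in which places are visited.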
