Towards mobility-based clustering

Identifying hot spots of moving vehicles in an urban area is essential to many smart city applications. Practical research on hot spots in smart cities faces several unique conditions: highly mobile environments, an extremely limited number of sample objects, and non-uniform, biased samples. These conditions raise new challenges that cause traditional density-based clustering algorithms to miss the real clustering properties of objects, making their results less meaningful. In this paper we propose a novel, non-density-based approach called mobility-based clustering. The key idea is that sample objects serve as "sensors" that perceive vehicle crowdedness in nearby areas through their instantaneous mobility, rather than as "object representatives". The mobility of samples is thus naturally incorporated. Several key factors other than vehicle crowdedness are identified, and techniques to compensate for their effects are proposed. We evaluate the performance of mobility-based clustering on real traffic situations. Experimental results show that, using 0.3% of vehicles as samples, mobility-based clustering can accurately identify hot spots that the latest representative algorithm, UMicro, can hardly obtain.
