A Robust Missing Data-Recovering Technique for Mobility Data Mining

ABSTRACT Building users' mobility profiles from location information is the main task behind many useful systems, such as early warning systems, next-destination and route prediction, tourist guides, mobile users' behavior-aware applications, and potential friend recommendation. Mobility profile building requires frequent trajectory patterns. Trajectory building, in turn, is based on significant location extraction and prediction of the user's actual movement. Previous works have focused on extracting significant places without considering changes in the GSM (Global System for Mobile Communications) network, and they are based on complete-data analysis. Because network operators change the GSM network periodically, missing values and outliers can occur. These missing values and outliers must be addressed to capture actual mobility and to extract significant places efficiently, since these places are the basis for building users' trajectories. In this paper, we propose a methodology to convert geo-coordinates into semantic tags, and we also propose a clustering methodology for recovering missing values and detecting outliers. Experimental results demonstrate the efficiency and effectiveness of the proposed scheme.
