Big Data in Road Transport and Mobility Research

Abstract. Ubiquitous computing has changed how mobility data are acquired, owing to two factors: its high penetration rate and its ability to capture and share information continuously. This applies to geolocation information, operational mobile phone data, and crowdsourced information from social networks. In addition, as part of the Internet of Things trend, the deployment of the Connected Vehicle (car-as-a-sensor) concept, supported by advanced V2X communications, generates massive volumes of data. In all these cases, the data open up unprecedented opportunities to analyze and predict individual and aggregate mobility patterns. Big Data refers to the processing capabilities needed to handle this explosion in the amount, quality, and heterogeneity of available data. This chapter reviews the most relevant data sources, introduces the techniques underlying the Big Data paradigm, and, finally, lists some relevant applications in the transport and mobility domain.
