Driving behaviors analysis based on feature selection and statistical approach: a preliminary study

Due to the prevalence of Internet of Vehicles (IoV) technology, big data has increasingly been promoted as a revolutionary development in a variety of applications. The data received from IoV is particularly valuable for analyzing driver behavior. For instance, in fleet management, fleet administrators are interested in fine-grained information about fleet usage, which is influenced by different driver usage patterns. In the vehicle insurance market, usage-based insurance or pay-as-you-drive schemes aim to adapt the insurance premium to individual driver behavior or even to provide various value-added services to policy holders. These applications can be expected to improve individual driving styles and make them safer. Big data analysis is becoming indispensable for automatically discovering the knowledge contained in frequently occurring patterns and hidden rules, so it is essential to study how to utilize these large-scale data. For driving behavior analysis, this paper presents a preliminary study based on feature selection and a statistical approach. Feature selection is an important and frequently used technique in data preprocessing for big data mining. As a dimensionality reduction technique, it chooses a small subset of significant features from the original data by removing irrelevant or redundant features. According to the selection process, the most significant feature in the collected vehicular data is vehicle speed. The statistical approach then calculates the skewness and dispersion of the speed distribution as statistical features for driving behavior analysis. Finally, the established classification rules not only provide data-driven services and big data analytics but also offer training data samples for supervised machine learning algorithms.
To validate the feasibility of the proposed method, data from over 150 drivers and more than 200,000 trips are evaluated in the simulation. As expected, the experimental results match our observations well.
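
As an illustration of the statistical step, the following minimal sketch computes a skewness value and a dispersion value from the speed samples of a single trip. The abstract does not state which estimators are used, so the choices here are assumptions: skewness is taken as the Fisher-Pearson moment coefficient and dispersion as the coefficient of variation (standard deviation divided by mean). The function name `speed_features` and the sample values are hypothetical.

```python
import math
from statistics import fmean, pstdev


def speed_features(speeds):
    """Return (skewness, dispersion) for one trip's speed samples.

    Assumed definitions (not taken from the paper):
      skewness   = m3 / sigma^3  (Fisher-Pearson moment coefficient)
      dispersion = sigma / mean  (coefficient of variation)
    """
    if len(speeds) < 2:
        raise ValueError("need at least two speed samples")
    if any(v < 0 for v in speeds):
        raise ValueError("speed samples must be non-negative")

    mean = fmean(speeds)
    sigma = pstdev(speeds, mu=mean)  # population standard deviation
    if sigma == 0:
        # Constant speed: the distribution has no asymmetry and no spread.
        return 0.0, 0.0

    # Third central moment of the speed distribution.
    m3 = fmean([(v - mean) ** 3 for v in speeds])
    skewness = m3 / sigma ** 3
    dispersion = sigma / mean
    return skewness, dispersion


if __name__ == "__main__":
    # Hypothetical speed samples in km/h for one trip.
    trip = [32.0, 35.5, 40.0, 38.0, 41.5, 90.0, 36.0, 34.5]
    skew, disp = speed_features(trip)
    print(f"skewness={skew:.3f}, dispersion={disp:.3f}")
```

A positive skewness indicates that most samples lie below the mean with a tail of occasional high speeds, while a larger dispersion indicates less steady speed keeping. How these two values map to classification rules is not specified in the abstract and is left open here.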
