Automated Transportation Mode Detection Using Smart Phone Applications via Machine Learning: Case Study Mega City of Tehran

Over the past few decades, travel behavior has become more complex, especially in megacities such as Tehran, the capital of Iran. Decision makers require more accurate and comprehensive information to plan urban transportation. In contrast to traditional paper-based and telephone-based surveys, more efficient data collection methods built on information technology have recently been applied, such as global positioning system (GPS)-based collection, which can track passengers' trips. Using data collected with this method, this study aims to distinguish the transportation modes used by passengers by means of a novel machine learning method called random forest. The model not only classifies transportation modes (car, bus, and walking) with an accuracy of almost 96%, but also identifies the most influential attributes in the classification according to two importance indices: mean decrease in accuracy and the Gini index. The results show that instantaneous speed and the accuracy of the GPS track are the most influential attributes for classifying transportation modes. Transportation planners can use such accurate and comprehensive data on travel behavior (modes used) for policy making.
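As a rough illustration of the Gini index mentioned above, the sketch below computes the impurity reduction obtained by splitting a handful of trip records on speed. A random forest's Gini-based importance accumulates reductions of this kind over all the splits that use a given attribute. The records, the 10 km/h threshold, and the function names are hypothetical and chosen only for the example; they are not taken from the study's data or code.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(values, labels, threshold):
    """Impurity reduction from splitting records at value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted_child_impurity = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(labels) - weighted_child_impurity


# Hypothetical toy records (not from the study): speed in km/h and travel mode.
speeds = [4, 5, 3, 25, 30, 45, 50, 60]
modes = ["walk", "walk", "walk", "bus", "bus", "car", "car", "car"]

# A split at 10 km/h isolates all walking records on one side.
decrease = gini_decrease(speeds, modes, threshold=10)
print(f"Gini decrease for speed <= 10 km/h: {decrease:.5f}")
```

An attribute whose splits produce large decreases across many trees would rank high on the Gini index, which is consistent with speed being reported as an influential attribute.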
