Developing an optimised activity type annotation method based on classification accuracy and entropy indices

ABSTRACT The generation of substantial amounts of travel- and mobility-related data has ushered in the era of big data. However, these data generally lack activity-travel information such as trip purpose. This deficiency has led to the development of trip purpose inference (activity type imputation/annotation) techniques, whose performance depends on the available input data and on the number and definition of the activity type classes to be inferred. Aggregating activity types strongly increases inference accuracy, but how to aggregate is usually left to the discretion of the researcher. Because this choice is open to interpretation, it undermines the comparability of reported inference accuracies. This study developed an optimised classification methodology that identifies classes of activity types with an optimal balance between improving model accuracy and preserving the activity information in the original data set. A sensitivity analysis was performed, and several machine learning algorithms were tested. The proposed method may be applied to any study area.
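The trade-off described in the abstract can be illustrated with a minimal sketch. It assumes that the information preserved by an aggregation scheme is measured by the Shannon entropy of the activity label distribution; the abstract does not specify the exact entropy index used in the study, so this is an illustration rather than the authors' method. The activity labels, the mapping, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def aggregate(labels, mapping):
    """Relabel activities; labels absent from the mapping are kept as-is."""
    return [mapping.get(label, label) for label in labels]


# Hypothetical diary with four equally frequent activity types.
labels = ["home", "work", "shopping", "leisure"] * 2

# Hypothetical aggregation scheme: merge two types into one class.
mapping = {"shopping": "other", "leisure": "other"}

h_original = shannon_entropy(labels)
h_aggregated = shannon_entropy(aggregate(labels, mapping))

# Share of the original entropy that survives the aggregation.
retained = h_aggregated / h_original
print(f"original: {h_original:.2f} bits, aggregated: {h_aggregated:.2f} bits, "
      f"retained: {retained:.0%}")
```

Merging classes can never increase this entropy, so a candidate aggregation scheme could be scored by weighing the retained share against the classification accuracy obtained with the merged classes. How the two quantities are combined into a single criterion is a design choice that this sketch leaves open.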
