Understanding the effects of trip patterns on spatially aggregated crashes with large-scale taxi GPS data.

The primary objective of this study was to investigate how trip pattern variables extracted from large-scale taxi GPS data contribute to spatially aggregated crashes in urban areas. Five types of data were collected: crash data, large-scale taxi GPS data, road network attributes, land use features and socio-demographic data. A data-driven modeling approach based on Latent Dirichlet Allocation (LDA) was proposed for discovering hidden trip patterns in the taxi GPS dataset, and a total of fifty trip patterns were identified. The collected data and the identified trip patterns were then aggregated into 167 ZIP Code Tabulation Areas (ZCTAs). A random forest technique was used to identify the factors that contributed to total, property-damage-only (PDO) and fatal-plus-injury crashes in the selected ZCTAs during the study period. Geographically weighted Poisson regression (GWPR) models were then developed to relate the crash counts to the contributing factors selected by the random forest technique. Comparative analyses were conducted on the performance of GWPR models that considered traditional traffic exposure variables only, trip pattern variables only, and both traditional exposure and trip pattern variables. The model specification results suggest that the trip pattern variables significantly affected the crash counts in the selected ZCTAs, and that the models considering both the traditional traffic exposure and the trip pattern variables had the best goodness-of-fit, with the lowest mean absolute deviation (MAD) and corrected Akaike information criterion (AICc) values.
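
The two goodness-of-fit measures named above can be illustrated with a minimal Python sketch. It assumes the generic textbook definitions: MAD as the mean absolute difference between observed and predicted counts, and AICc as the Poisson-likelihood AIC plus a small-sample correction. The crash counts, predictions and parameter count are hypothetical values chosen for illustration, not results from the study, and a GWPR implementation would normally use an effective number of parameters rather than the fixed count assumed here.

```python
import math


def mean_absolute_deviation(observed, predicted):
    """Mean absolute difference between observed and predicted counts."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def poisson_log_likelihood(observed, predicted):
    """Log-likelihood of observed counts under Poisson means `predicted`."""
    return sum(
        o * math.log(p) - p - math.lgamma(o + 1)
        for o, p in zip(observed, predicted)
    )


def aicc(log_likelihood, n_obs, n_params):
    """AIC with the small-sample correction; requires n_obs > n_params + 1."""
    aic = -2.0 * log_likelihood + 2.0 * n_params
    return aic + 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)


# Hypothetical crash counts for five zones and predictions from two models.
observed = [12, 7, 30, 4, 18]
pred_a = [10.0, 9.0, 25.0, 6.0, 20.0]
pred_b = [12.5, 7.5, 29.0, 4.5, 17.0]
n_params = 2  # assumed parameter count, identical for both models

mad_a = mean_absolute_deviation(observed, pred_a)
mad_b = mean_absolute_deviation(observed, pred_b)
aicc_a = aicc(poisson_log_likelihood(observed, pred_a), len(observed), n_params)
aicc_b = aicc(poisson_log_likelihood(observed, pred_b), len(observed), n_params)

print(f"Model A: MAD={mad_a:.2f}, AICc={aicc_a:.2f}")
print(f"Model B: MAD={mad_b:.2f}, AICc={aicc_b:.2f}")
```

Lower values of both measures indicate a better fit. Because AICc penalises additional parameters, a model that adds trip pattern variables is only preferred when the gain in likelihood outweighs that penalty.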
