An OD Flow Clustering Method Based on Vector Constraints: A Case Study for Beijing Taxi Origin-Destination Data

Origin-destination (OD) flow pattern mining is an important approach to studying urban dynamics. Within it, OD flow clustering analysis reveals the activity patterns of urban residents and mines the coupling relationships between urban subspaces and their dynamic causes. Existing flow clustering methods are limited by the spatial constraints of OD points, rely on the spatial similarity of geographical points, and lack in-depth analysis of high-dimensional flow characteristics, which makes it difficult for them to find irregular flow clusters. In this paper, we propose an OD flow clustering method based on vector constraints (ODFCVC), which defines the OD flow event point and the OD flow vector to express the spatial location relationships and geometric flow behavior characteristics of OD flows. First, the OD flow vector coordinate system is normalized through Euclidean distance-based spatial clustering of OD flow event points; then, OD flow clusters with similar flow patterns are mined through adjusted cosine similarity-based feature clustering of OD flow vectors. By constraining the vector coordinate system and the vector similarity through this two-step clustering, OD data are transformed from point set space to vector space, which simplifies the calculation of high-dimensional OD flow similarity and helps mine representative OD flow clusters in flow space. Given the properties of OD flow clusters, the k-means algorithm is selected as the basic clustering logic of the two-step method, and a sum of squared error perceptually important points algorithm that considers silhouette coefficients (SSEPIP) is adopted to extract the optimal number of clusters automatically, without requiring any user-defined parameters.
In tests on origin-destination flow data from Beijing, China, the ODFCVC method yields new traffic flow communities based on traffic hubs and finds irregular traffic flow clusters (including cluster mode, divergence mode, and convergence mode) with representative travel trends.
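
The two-step idea described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the event point of a flow is taken to be its origin, the flow vector is the destination expressed relative to the step-one cluster center, the "adjustment" in the cosine similarity is the subtraction of that center, and the function names `kmeans`, `adjusted_cosine`, and `two_step_cluster` are hypothetical. The cluster counts are passed in by hand here, whereas the paper selects them automatically with SSEPIP.

```python
import math


def _assign(points, centers):
    """Label each point with the index of its nearest center (Euclidean)."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]


def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means with a deterministic farthest-first initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        labels = _assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old center if a cluster happens to be empty.
            new_centers.append([sum(col) / len(members) for col in zip(*members)]
                               if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return _assign(points, centers), centers


def adjusted_cosine(u, v, mean):
    """Cosine similarity of u and v after subtracting a shared mean vector.

    Assumption: the mean is the step-one cluster center; the paper may define
    the adjustment differently.
    """
    a = [x - m for x, m in zip(u, mean)]
    b = [x - m for x, m in zip(v, mean)]
    na, nb = math.hypot(*a), math.hypot(*b)
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def two_step_cluster(flows, k_points, k_vectors):
    """flows: list of ((ox, oy), (dx, dy)) pairs. Returns (point_labels, vector_labels)."""
    # Step 1: Euclidean k-means on the event points (assumed here: the origins).
    origins = [o for o, _ in flows]
    point_labels, centers = kmeans(origins, k_points)
    # Step 2: express each destination relative to its cluster center and scale
    # it to unit length. For unit vectors, squared Euclidean distance equals
    # 2 - 2*cos, so k-means on them approximately groups flows by the
    # center-adjusted cosine similarity.
    units = []
    for (_, dest), label in zip(flows, point_labels):
        vec = [d - c for d, c in zip(dest, centers[label])]
        norm = math.hypot(*vec)
        units.append([x / norm for x in vec] if norm else vec)
    vector_labels, _ = kmeans(units, k_vectors)
    return point_labels, vector_labels
```

Calling `two_step_cluster(flows, k_points=2, k_vectors=3)` on a small list of coordinate pairs returns one label per flow for each step; flows that leave the same hub with a similar heading share both labels. Running k-means on unit vectors is only a stand-in for a true cosine-based clustering, since the averaged centers are not themselves unit length.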
