A Data Mining Approach to Creating Fundamental Traffic Flow Diagram

Abstract: This paper investigates the application of clustering techniques to partitioning traffic flow data into congested and free-flow regimes. Clustering techniques identify the similarities and dissimilarities between data points and classify the data into groups with similar characteristics. Such techniques have been used successfully in market research, astronomy, psychiatry, and transportation. A framework is proposed for clustering traffic data based on fundamental traffic flow variables. Three types of clustering techniques are investigated: 1) connectivity-based clustering, 2) centroid-based clustering, and 3) distribution-based clustering. Specifically, hierarchical clustering, K-means clustering, and the general mixture model (GMM) were investigated. Traffic sensor data from three freeway bottleneck locations in two major U.S. metropolitan areas (St. Louis, Missouri, and the Twin Cities, Minnesota) were used in the study. Various combinations of traffic variables were investigated for all three clustering techniques. The results indicated that clustering is an effective way to partition traffic data into free-flow and congested-flow regimes. Partitioned traffic data can be used to create fundamental traffic flow diagrams and macroscopic traffic stream models. Using speeds alone, or both speeds and occupancies, as input variables produced the best clustering results. The performance of the K-means and hierarchical clustering techniques was comparable, and both outperformed GMM clustering.
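As a minimal illustration of the centroid-based partitioning described in the abstract, the following Python sketch applies a plain two-cluster K-means to (speed, occupancy) observations. Everything in it is an assumption made for illustration and is not the procedure used in the study: the synthetic data and its units (mph, percent), the function names `make_synthetic_traffic` and `kmeans_two_regimes`, the z-score standardization, and the deterministic initialization from the fastest and slowest observations.

```python
import random


def make_synthetic_traffic(n_per_regime, seed=0):
    """Generate hypothetical (speed_mph, occupancy_pct) rows with known regimes.

    Returns (rows, truth) where truth is 0 for free flow and 1 for congested.
    The distributions are invented for illustration only.
    """
    rng = random.Random(seed)
    rows, truth = [], []
    for _ in range(n_per_regime):  # free flow: high speed, low occupancy
        rows.append((rng.gauss(62, 4), max(0.0, rng.gauss(8, 2))))
        truth.append(0)
    for _ in range(n_per_regime):  # congested: low speed, high occupancy
        rows.append((max(0.0, rng.gauss(25, 6)), max(0.0, rng.gauss(30, 5))))
        truth.append(1)
    return rows, truth


def standardize(rows):
    """Scale each column to zero mean and unit variance (z-scores)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(r, means, stds)) for r in rows]


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_two_regimes(rows, max_iter=100):
    """Two-cluster K-means; label 0 starts at the fastest point, 1 at the slowest."""
    pts = standardize(rows)
    fastest = max(range(len(rows)), key=lambda i: rows[i][0])
    slowest = min(range(len(rows)), key=lambda i: rows[i][0])
    centroids = [pts[fastest], pts[slowest]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every observation.
        new_labels = [
            0 if squared_distance(p, centroids[0]) <= squared_distance(p, centroids[1]) else 1
            for p in pts
        ]
        if new_labels == labels:  # converged: assignments no longer change
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(pts, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


if __name__ == "__main__":
    rows, truth = make_synthetic_traffic(200, seed=0)
    labels = kmeans_two_regimes(rows)
    print("free-flow points:", labels.count(0), "congested points:", labels.count(1))
```

Once each observation carries a regime label, the two groups could be fitted separately to build the free-flow and congested branches of a fundamental diagram. Real detector data would likely need cleaning and a more careful initialization than this sketch assumes.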
