A Fast Clustering Approach for Identifying Traffic Congestions

Density-based clustering, for instance DBSCAN, is an important approach in pattern recognition and data mining and has been widely used in many applications. In large-scale streaming data environments, however, DBSCAN suffers from a heavy computational cost because it examines the distances between points multiple times. In transportation-related applications, which usually require calculating road network distance instead of Euclidean distance, DBSCAN has difficulty meeting real-time computation requirements. Focusing on the fast identification of linear events, this paper exploits linear features to improve the efficiency of clustering by introducing a linear referencing system (LRS). LRS has long been used to manage linear features; it simplifies shortest-path computation into a 1-dimensional relative distance calculation, and can thus significantly reduce computational complexity and cost and meet the real-time analysis requirements of streaming data. Using vehicle GPS trajectories as an example, this study designs an LRS and its associated dynamic segmentation method for identifying traffic congestion. Experimental results demonstrated the flexibility and efficiency of the proposed LRS-based clustering approach in identifying traffic congestion.
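
To illustrate why a 1-dimensional reference simplifies density-based clustering, the following is a minimal sketch, not the method of this paper. It assumes each GPS point has already been map-matched and expressed as a measure (for example, meters from the start of a single route); the function name `cluster_measures` and the parameters `eps` and `min_pts` are hypothetical. Once the measures are sorted, neighborhood queries reduce to binary searches, and clusters are contiguous runs of core points, so no shortest-path computation is needed.

```python
from bisect import bisect_left, bisect_right


def cluster_measures(measures, eps, min_pts):
    """DBSCAN-style clustering of 1-D measures along one route.

    Hypothetical helper: returns one label per input point, -1 meaning noise.
    """
    # Sort once; remember original positions so labels can be mapped back.
    order = sorted(range(len(measures)), key=lambda i: measures[i])
    xs = [measures[i] for i in order]
    n = len(xs)

    # A point is a core point if at least min_pts points (itself included)
    # lie within eps of it. In sorted 1-D data this is two binary searches.
    core = [
        bisect_right(xs, x + eps) - bisect_left(xs, x - eps) >= min_pts
        for x in xs
    ]

    # Consecutive core points no more than eps apart share a cluster.
    sorted_labels = [-1] * n
    cluster = -1
    prev_core = None
    for k in range(n):
        if not core[k]:
            continue
        if prev_core is None or xs[k] - xs[prev_core] > eps:
            cluster += 1
        sorted_labels[k] = cluster
        prev_core = k

    # Non-core points within eps of a core point join the nearest one's cluster.
    core_pos = [k for k in range(n) if core[k]]
    core_xs = [xs[k] for k in core_pos]
    for k in range(n):
        if core[k]:
            continue
        j = bisect_left(core_xs, xs[k])
        best = None
        for c in (j - 1, j):
            if 0 <= c < len(core_xs) and abs(core_xs[c] - xs[k]) <= eps:
                if best is None or abs(core_xs[c] - xs[k]) < abs(core_xs[best] - xs[k]):
                    best = c
        if best is not None:
            sorted_labels[k] = sorted_labels[core_pos[best]]

    # Map labels back to the original input order.
    labels = [-1] * n
    for pos, i in enumerate(order):
        labels[i] = sorted_labels[pos]
    return labels
```

For example, `cluster_measures([100, 105, 110, 112, 500, 900, 905, 908, 915], eps=10, min_pts=3)` groups the points near measures 100 and 900 into two separate clusters and marks the isolated point at 500 as noise. The cost is dominated by the sort, O(n log n), whereas a network-distance DBSCAN would need a shortest-path query per point pair examined.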