Paper information - Q value-based Dynamic Programming with SARSA Learning for real time route guidance in large scale road networks

This paper proposes a distributed dynamic traffic management model for guiding vehicles. The model aims to minimize computation time, make full use of real-time traffic information, and thereby improve the efficiency of the traffic system. To make the model work, we propose a new dynamic route determination method in which Q value-based Dynamic Programming and SARSA Learning are combined to calculate the approximate optimal traveling time from each section to the destinations in the road network. The proposed traffic management model is applied to the large-scale microscopic simulator SOUND/4U, using the real-world road network of Kurosaki, Kitakyushu, Japan. The simulation results show that, compared with the conventional method, the proposed method can reduce traffic congestion and improve the efficiency of the traffic system in this real-world road network.

Shingo Mabu | Kotaro Hirasawa | Jing Zhou | Bing Li | Shanqing Yu
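
The abstract does not give the update rules, the action-selection scheme, or the way the two techniques are combined, so the following is only a generic sketch of the idea rather than the authors' method. It assumes a hypothetical four-section toy network (`ROADS`), a table `q[(section, next_section)]` holding the estimated travel time to the destination, dynamic-programming sweeps on free-flow times to initialize the table, and textbook SARSA updates with epsilon-greedy selection to adapt it to observed travel times. All names, travel times, and parameters are illustrative assumptions.

```python
import random

# Hypothetical toy network: section -> {next section: free-flow travel time}.
ROADS = {
    "A": {"B": 4.0, "C": 2.0},
    "B": {"D": 5.0},
    "C": {"B": 1.0, "D": 8.0},
    "D": {},  # destination, no outgoing sections
}
DEST = "D"


def best_time(q, roads, s, dest):
    """Smallest estimated time from section s to the destination."""
    if s == dest:
        return 0.0
    return min(q[(s, n)] for n in roads[s])


def dp_sweeps(roads, dest, sweeps=10):
    """Q-value dynamic programming: repeated Bellman backups over all pairs."""
    q = {(s, n): 0.0 for s, nxt in roads.items() for n in nxt}
    for _ in range(sweeps):
        for s, nxt in roads.items():
            for n, t in nxt.items():
                q[(s, n)] = t + best_time(q, roads, n, dest)
    return q


def choose(q, roads, s, eps, rng):
    """Epsilon-greedy choice of the next section (an assumed selection rule)."""
    options = list(roads[s])
    if rng.random() < eps:
        return rng.choice(options)
    return min(options, key=lambda n: q[(s, n)])


def sarsa_episode(q, roads, start, dest, observe, alpha=0.1, eps=0.1, rng=random):
    """One trip from start to dest, updating q in place with SARSA."""
    if start == dest:
        return
    s, a = start, choose(q, roads, start, eps, rng)
    while s != dest:
        t = observe(s, a)  # travel time actually experienced on s -> a
        if a == dest:
            a2, target = None, t
        else:
            a2 = choose(q, roads, a, eps, rng)
            target = t + q[(a, a2)]  # uses the action really taken next
        q[(s, a)] += alpha * (target - q[(s, a)])
        s, a = a, a2


def congested(s, n):
    """Assumed observation model: section B -> D is jammed, the rest is free-flow."""
    return 20.0 if (s, n) == ("B", "D") else ROADS[s][n]


q = dp_sweeps(ROADS, DEST)  # initial estimates from free-flow times
rng = random.Random(0)
for _ in range(500):
    sarsa_episode(q, ROADS, "A", DEST, congested, rng=rng)
```

In this toy setup the dynamic-programming pass prefers C -> B -> D (6 time units) from section C, while the SARSA updates raise the estimate for C -> B once the jam on B -> D is observed, so the greedy choice at C moves to the direct section C -> D.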
