Distributed Modeling in a MapReduce Framework for Data-Driven Traffic Flow Forecasting

The growing number of transportation data sources collected in recent years has made traffic flow forecasting in standalone mode increasingly demanding computationally for large-scale networks. Distributed modeling strategies can reduce this computational effort. In this paper, we present a MapReduce-based approach to processing distributed data and use it to design a MapReduce framework for a traffic forecasting system, including its system architecture and data-processing algorithms. The work applies to the many traffic forecasting systems whose models require a learning process (e.g., the neural network approach). We show that the learning process of the forecasting model can be accelerated under our framework. Model fusion, the key problem in distributed modeling, is treated explicitly in this paper to enhance the data-processing and storage capability of the forecasting system.
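To make the map/train and reduce/fuse idea concrete, here is a minimal single-process sketch in Python. It is illustrative only and is not the paper's algorithm: the one-lag linear predictor, the sample-weighted parameter averaging used as the fusion rule, and all names (`map_train`, `reduce_fuse`, `train_distributed`, `forecast`) are assumptions made for this example. A real deployment would run the map and reduce steps on a cluster rather than with Python's built-in `map` and `functools.reduce`.

```python
from functools import reduce
from typing import List, Tuple

# Hypothetical local model: (slope, intercept, n_samples) for
# flow[t+1] ~= slope * flow[t] + intercept
LocalModel = Tuple[float, float, int]


def map_train(flows: List[float]) -> LocalModel:
    """Map step (assumed): fit a one-lag linear predictor on one partition."""
    xs, ys = flows[:-1], flows[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Fall back to a constant predictor if the partition has no variation.
    slope = cov_xy / var_x if var_x > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return slope, intercept, n


def reduce_fuse(a: LocalModel, b: LocalModel) -> LocalModel:
    """Reduce step (assumed fusion rule): sample-weighted parameter average."""
    n = a[2] + b[2]
    slope = (a[0] * a[2] + b[0] * b[2]) / n
    intercept = (a[1] * a[2] + b[1] * b[2]) / n
    return slope, intercept, n


def train_distributed(partitions: List[List[float]]) -> LocalModel:
    """Train one local model per partition, then fuse them pairwise."""
    return reduce(reduce_fuse, map(map_train, partitions))


def forecast(model: LocalModel, current_flow: float) -> float:
    """Predict the next-interval flow from the current flow."""
    slope, intercept, _ = model
    return slope * current_flow + intercept


if __name__ == "__main__":
    # Two synthetic partitions, e.g. flow counts from two detector groups.
    partitions = [[100.0, 60.0, 40.0, 30.0, 25.0], [0.0, 10.0, 15.0, 17.5]]
    fused = train_distributed(partitions)
    print(fused, forecast(fused, 40.0))
```

Because the weighted average carries the sample count along with the parameters, the fusion step is associative, so partial results can be combined in any order, which is the property a reduce step needs. Parameter averaging is only one possible fusion rule; models such as neural networks generally need a more careful scheme.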
