Cloud City Traffic State Assessment System Using a Novel Architecture of Big Data

Big data has recently been applied widely across many fields. This work presents a cloud-based city traffic state assessment system built on a novel big data architecture. The proposed system provides real-time bus locations and real-time traffic conditions, in particular the traffic conditions near the user, by combining open data, GPS, GPRS, and cloud technologies. We first implement the proposed architecture on two highly scalable cloud technologies, Hadoop and Spark. Next, we use three clustering methods, DBSCAN, K-Means, and Fuzzy C-Means, to find areas of traffic congestion in Taichung City, and a moving average to find congested stretches of Taiwan Boulevard, the main road in Taichung City. Finally, experimental results show that the proposed system delivers its services effectively and efficiently through an advanced web technology. In addition, some experimental results indicate that the computing performance of Spark is better than that of Hadoop.
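To make the moving-average step concrete, the following is a minimal Python sketch of how smoothed speeds along a single road could be used to flag congestion. It is an illustration under stated assumptions, not the authors' implementation: the function names `moving_average` and `congested_intervals`, the window of 3 samples, the 20 km/h threshold, and the sample speeds are all hypothetical and do not come from the paper.

```python
from collections import deque
from typing import Iterable, List, Optional


def moving_average(values: Iterable[float], window: int) -> List[Optional[float]]:
    """Trailing simple moving average; None until the window is full."""
    if window <= 0:
        raise ValueError("window must be positive")
    buf = deque(maxlen=window)  # holds the most recent `window` samples
    total = 0.0
    out: List[Optional[float]] = []
    for v in values:
        if len(buf) == window:
            total -= buf[0]  # oldest sample is about to be evicted
        buf.append(v)
        total += v
        out.append(total / window if len(buf) == window else None)
    return out


def congested_intervals(
    speeds_kmh: Iterable[float],
    window: int = 3,              # assumed smoothing window (samples)
    threshold_kmh: float = 20.0,  # assumed congestion threshold
) -> List[int]:
    """Indices whose smoothed speed falls below the threshold."""
    smoothed = moving_average(speeds_kmh, window)
    return [i for i, s in enumerate(smoothed)
            if s is not None and s < threshold_kmh]


# Hypothetical bus speeds (km/h) sampled at consecutive points on one road.
speeds = [40.0, 38.0, 15.0, 12.0, 10.0, 14.0, 35.0, 42.0]
print(congested_intervals(speeds))
```

Smoothing before thresholding keeps a single slow GPS reading (for example, a bus pausing at a stop) from being reported as a jam; only a sustained drop in speed pulls the average below the threshold.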
