Distributed Data Platform for Automotive Industry: A Robust Solution for Tackling Big Challenges of Big Data in Transportation Science

Large amounts of data are now generated from numerous sources, a trend evident in many research fields where the number of data producers is constantly increasing. In transportation science and the automotive industry, for example, each vehicle on the road can be treated as a separate data producer that generates large amounts of data. In the literature, big data is commonly used as an umbrella term for research related to the following data challenges: volume, variety, velocity, veracity, and value. Furthermore, it is common to use a variety of programming tools and methods for the different data processing phases, i.e., data collection, data storage, and data analysis. In this paper, we present a distributed data platform that addresses these challenges by relying on specific design choices for each data processing phase. We argue that this platform supports robustness, scalability, fault tolerance, and reliability by showcasing two real-world use cases from the transportation/automotive domain: (i) collection, storage, and analysis of data generated by a fleet of electric cleaners, and (ii) collection, storage, and analysis of transaction data from EV charging stations, which is further used to develop the EV charging infrastructure.
