Integrating flight-related information into a (Big) data lake

Flight cancellations, departure delays, congested taxi times and airborne holding delays are increasingly frequent problems that negatively affect performance, fuel burn, emission rates and customer satisfaction at major airports worldwide. However, this is only a glimpse of what is to come. The dramatic growth in air traffic levels has become a problem of paramount importance, leading to increased interest in enhancing current Air Traffic Management (ATM) systems. The main objective is to cope with sustained air traffic growth under safe, economical, efficient and environmentally friendly working conditions. ADS-B (Automatic Dependent Surveillance-Broadcast) technology plays a major role in the new ATM systems, since it provides more accurate real-time positioning information than secondary radars while relying on cheaper infrastructure. However, the main drawback of ADS-B is that it generates large volumes of data, which, when merged with other flight-related information, raise significant scalability issues. In this work, we start from a previously developed data lake that supports the full ADS-B data life cycle in a scalable and cost-effective way, and propose a data architecture that integrates data from different providers and reconstructs flight trajectories, which can ultimately be used to improve the efficiency of flight operations. The data architecture is also evaluated on a two-week testbed, which provides initial figures on its effectiveness.
