论文信息 - Improved Testing of AI-Based Anomaly Detection Systems Using Synthetic Surveillance Data

Improved Testing of AI-Based Anomaly Detection Systems Using Synthetic Surveillance Data

Major transportation surveillance protocols have not been specified with cyber security in mind and therefore provide no encryption nor identification. These issues expose air and sea transport to false data injection attacks (FDIAs), in which an attacker modifies, blocks or emits fake surveillance messages to dupe controllers and surveillance systems. There has been growing interest in conducting research on machine learning-based anomaly detection systems that address these new threats. However, significant amounts of data are needed to achieve meaningful results with this type of model. Raw, genuine data can be obtained from existing databases but need to be preprocessed before being fed to a model. Acquiring anomalous data is another challenge: such data is much too scarce for both the Automatic Dependent Surveillance–Broadcast (ADS-B) and the Automatic Identification System (AIS). Crafting anomalous data by hand, which has been the sole method applied to date, is hardly suitable for broad detection model testing. This paper proposes an approach built upon existing libraries and ideas that offers ML researchers the necessary tools to facilitate the access and processing of genuine data as well as to automatically generate synthetic anomalous surveillance data to constitute broad, elaborated test datasets. We demonstrate the usability of the approach by discussing work in progress that includes the reproduction of related work, creation of relevant datasets and design of advanced anomaly detection models for both domains of application.