A Synthetic Dataset for 5G UAV Attacks Based on Observable Network Parameters

Synthetic datasets are beneficial for machine learning researchers because they make it possible to experiment with new strategies and algorithms in the training and testing phases. These datasets can easily include scenarios that would be costly to study with real data, and they can complement or, in some cases, replace real measurements, depending on the quality of the synthetic data. They can also address the unbalanced data problem, help avoid overfitting, and be used for training while testing is done with real data. In this paper, we present, to the best of our knowledge, the first synthetic dataset for Unmanned Aerial Vehicle (UAV) attacks in 5G and beyond networks, based on the following key observable network parameters that indicate power levels: the Received Signal Strength Indicator (RSSI) and the Signal-to-Interference-plus-Noise Ratio (SINR). The main objective of this dataset is to enable deep network development for UAV communication security, especially for algorithm development and the analysis of time-series data applied to UAV attack recognition. Our proposed dataset provides insights into network functionality when static or moving UAV attackers target authenticated UAVs in an urban environment. The dataset also considers the presence and absence of authenticated terrestrial users in the network, which may decrease the deep network’s ability to identify attacks. Furthermore, the data provides a deeper understanding of the metrics available in the 5G physical and MAC layers for machine learning and statistics research. The dataset will be available at https://archive-beta.ics.uci.edu/
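
As an illustration of the time-series use case described above, the following minimal sketch slices per-time-step RSSI and SINR measurements into fixed-length windows suitable as input to a sequence classifier. It is not the authors' pipeline: the function name `make_windows`, the window length, the step size, and all numeric values are assumptions chosen for illustration, and the placeholder series are generated randomly rather than taken from the dataset.

```python
import numpy as np

def make_windows(series, labels, window, step):
    """Slice a (T, F) time series into overlapping (window, F) samples.

    A window is labeled 1 if any time step inside it is marked as attack.
    (Hypothetical helper; the labeling rule is an assumption.)
    """
    series = np.asarray(series, dtype=float)
    labels = np.asarray(labels, dtype=int)
    starts = range(0, len(series) - window + 1, step)
    X = np.stack([series[s:s + window] for s in starts])
    y = np.array([int(labels[s:s + window].any()) for s in starts])
    return X, y

# Placeholder measurements only; these values are NOT from the dataset.
rng = np.random.default_rng(0)
T = 100
rssi = -70 + rng.normal(0, 2, T)   # assumed RSSI trace in dBm
sinr = 20 + rng.normal(0, 1, T)    # assumed SINR trace in dB
attack = np.zeros(T, dtype=int)
attack[60:80] = 1                  # assumed attack interval
sinr[60:80] -= 10                  # assumed SINR drop while under attack

# Stack the two observable parameters as features: shape (T, 2).
features = np.column_stack([rssi, sinr])
X, y = make_windows(features, attack, window=20, step=10)
print(X.shape, y)
```

Each row of `X` is one window of paired RSSI and SINR samples and `y` holds the corresponding attack label, which is the shape most time-series classification models expect.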
