Paper information - Practical GAN-based synthetic IP header trace generation using NetShare

We explore the feasibility of using Generative Adversarial Networks (GANs) to automatically learn generative models that produce synthetic packet and flow header traces for networking tasks (e.g., telemetry, anomaly detection, provisioning). We identify key fidelity, scalability, and privacy challenges and tradeoffs in existing GAN-based approaches. By synthesizing domain-specific insights with recent advances in machine learning and privacy, we identify design choices to tackle these challenges. Building on these insights, we develop an end-to-end framework, NetShare. We evaluate NetShare on six diverse packet header traces and find that: (1) across all distributional metrics and traces, it achieves 46% higher accuracy than baselines, and (2) it meets users' requirements for downstream tasks in terms of evaluation accuracy and the rank ordering of candidate approaches.
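The two kinds of evaluation mentioned above, distributional fidelity and rank ordering of candidate approaches, can be illustrated with a minimal sketch. The abstract does not name the specific metrics, so Jensen-Shannon divergence is used here purely as an example of a distributional metric. The function names, the toy port values, and the `sketch_a`/`sketch_b`/`sketch_c` scores are all hypothetical and are not taken from NetShare.

```python
import math
from collections import Counter


def js_divergence(real_values, synth_values):
    """Jensen-Shannon divergence (base 2, in [0, 1]) between the empirical
    distributions of one categorical header field in two traces."""
    p_counts, q_counts = Counter(real_values), Counter(synth_values)
    n_p, n_q = len(real_values), len(synth_values)
    jsd = 0.0
    for key in set(p_counts) | set(q_counts):
        p = p_counts[key] / n_p
        q = q_counts[key] / n_q
        m = 0.5 * (p + q)  # mixture distribution
        if p > 0:
            jsd += 0.5 * p * math.log2(p / m)
        if q > 0:
            jsd += 0.5 * q * math.log2(q / m)
    return jsd


def rank_order(scores):
    """Names of candidate approaches sorted from best to worst score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: destination ports in a real and a synthetic trace.
real_ports = [80, 80, 443, 443, 53, 22]
synth_ports = [80, 443, 443, 443, 53, 53]
print("JSD on destination port:", js_divergence(real_ports, synth_ports))

# Hypothetical downstream-task accuracies of three candidate approaches,
# measured once on the real trace and once on the synthetic trace.
acc_on_real = {"sketch_a": 0.91, "sketch_b": 0.85, "sketch_c": 0.78}
acc_on_synth = {"sketch_a": 0.88, "sketch_b": 0.80, "sketch_c": 0.79}
print("Rank order preserved:",
      rank_order(acc_on_real) == rank_order(acc_on_synth))
```

A lower divergence means the synthetic field distribution is closer to the real one. The rank-order check asks whether a user comparing candidate approaches on synthetic data would reach the same ordering as on real data.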
