Unified Host and Network Data Set

The lack of data sets derived from operational enterprise networks continues to be a critical deficiency in the cyber security research community. Unfortunately, releasing viable data sets to the larger community is challenging for a number of reasons, primarily the difficulty of balancing security and privacy concerns against the fidelity and utility of the data. This chapter discusses the importance of cyber security research data sets and introduces a large data set derived from the operational network environment at Los Alamos National Laboratory. The hope is that this data set and the associated discussion will both catalyze new research in cyber security and motivate other organizations to release similar data sets to the community.
