A methodology for ensuring fair allocation of CSOC effort for alert investigation

A Cyber Security Operations Center (CSOC) often sells services by entering into a service level agreement (SLA) with various customers (organizations) whose network traffic is monitored through sensors. The sensors produce data that are processed by automated systems (such as an intrusion detection system) that issue alerts. All alerts need further investigation by human analysts. The alerts are triaged into high-, medium-, and low-priority alerts, and the high-priority alerts are investigated first by cybersecurity analysts, a process known as priority queueing. In unexpected situations, such as (i) higher-than-expected high-priority alert generation from some sensors, (ii) too few analysts at the CSOC in a given time interval, and (iii) a new type of alert that increases the time needed to analyze alerts from some sensors, the priority queueing mechanism leads to two major issues: (1) sensors with normal levels of alert generation receive less analysis than those with excessive high-priority alerts, with the potential for complete starvation of alert analysis for sensors with only medium- or low-priority alerts, and (2) this ad hoc allocation of CSOC effort to sensors with excessive high-priority alerts results in SLA violations, and there is no enforcement mechanism to ensure that the service actually provided by a CSOC matches the SLA. This paper develops a new dynamic weighted alert queueing mechanism (DWQ) that relates the CSOC effort specified in the SLA to the effort actually allocated in practice, and ensures via a technical enforcement system that the total CSOC effort is divided proportionally among customers such that fairness is guaranteed in the long run. The results indicate that the DWQ mechanism outperforms the priority queueing method by not only analyzing high-priority alerts first but also ensuring fairness in the CSOC effort allocated to all customers and providing a starvation-free alert investigation process.
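The contrast between strict priority queueing and SLA-proportional allocation can be illustrated with a small sketch. It is not the DWQ mechanism itself, whose details the abstract does not give, but a generic weighted fair-share scheduler written under stated assumptions. The function name `allocate_effort`, the alert tuple layout, and the rule "serve the customer whose received effort is furthest below its SLA share" are all hypothetical choices made for illustration.

```python
import heapq
from collections import defaultdict


def allocate_effort(alerts, sla_share, capacity):
    """Illustrative weighted fair-share allocation (NOT the paper's DWQ).

    alerts    : list of (customer, priority, minutes); priority 0 is highest.
    sla_share : dict mapping customer -> fraction of CSOC effort (assumed > 0).
    capacity  : total analyst minutes available in the interval.

    Returns (order, served): the alerts in investigation order and the
    minutes of effort each customer received.
    """
    # One priority queue per customer, so high-priority alerts of a
    # customer are still investigated before its lower-priority ones.
    queues = defaultdict(list)
    for seq, (customer, priority, minutes) in enumerate(alerts):
        heapq.heappush(queues[customer], (priority, seq, minutes))

    served = {customer: 0.0 for customer in queues}
    order = []
    used = 0.0

    while True:
        # Customers that still have an alert fitting in the remaining capacity.
        candidates = [
            c for c in queues
            if queues[c] and used + queues[c][0][2] <= capacity
        ]
        if not candidates:
            break
        # Serve the customer with the least effort relative to its SLA share
        # (ties broken by customer name to keep the result deterministic).
        chosen = min(candidates, key=lambda c: (served[c] / sla_share[c], c))
        priority, _, minutes = heapq.heappop(queues[chosen])
        served[chosen] += minutes
        used += minutes
        order.append((chosen, priority, minutes))

    return order, served
```

With two customers holding equal SLA shares, one flooding the queue with high-priority alerts and the other submitting only low-priority alerts, strict priority queueing would spend the whole interval on the first customer. In this sketch each customer receives effort in proportion to its share, so the second customer is not starved.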
