Reframing Threat Detection: Inside esINSIDER

We describe the motivation for and design of esINSIDER, an automated tool that detects potential persistent and insider threats in a network. esINSIDER aggregates clues from log data over extended time periods and proposes a small number of cases for human experts to review. Each proposed case packages related information together so the analyst can see the bigger picture of what is happening, and the evidence in a case includes internal network activity resembling reconnaissance and data collection. The core ideas are to 1) detect fundamental campaign behaviors by following data movements over extended time periods, 2) link together behaviors associated with different meta-goals, and 3) use machine learning to learn which activities are expected and consistent for each individual network. We call this approach campaign analytics because it focuses on the threat actor's campaign goals and the intrinsic steps needed to achieve them. Linking different campaign behaviors (internal reconnaissance, collection, exfiltration) reduces false positives from business-as-usual activities and creates opportunities to detect threats before a large exfiltration occurs. Machine learning makes the approach practical to deploy by reducing the amount of tuning needed.
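As a rough illustration of the linking idea only, the sketch below groups per-behavior detections by host and keeps hosts that show several distinct campaign behaviors. It is not esINSIDER's implementation: the `(host, behavior, score)` input format, the `link_behaviors` function, the `min_linked` threshold, and the summed score are all assumptions made for this example.

```python
from collections import defaultdict

# The three campaign behaviors named in the abstract; the string labels are assumed.
CAMPAIGN_BEHAVIORS = ("reconnaissance", "collection", "exfiltration")


def link_behaviors(detections, min_linked=2):
    """Group detections by host and keep hosts with several distinct behaviors.

    detections: iterable of (host, behavior, score) tuples (hypothetical format).
    min_linked: minimum number of distinct campaign behaviors required for a case.
    Returns a list of case dicts, strongest first.
    """
    evidence_by_host = defaultdict(dict)
    for host, behavior, score in detections:
        if behavior not in CAMPAIGN_BEHAVIORS:
            continue  # ignore activity that is not a campaign behavior
        # Keep only the strongest evidence seen for each behavior on this host.
        previous = evidence_by_host[host].get(behavior, 0.0)
        evidence_by_host[host][behavior] = max(previous, score)

    cases = []
    for host, evidence in evidence_by_host.items():
        if len(evidence) >= min_linked:
            cases.append({
                "host": host,
                "behaviors": sorted(evidence),
                "score": sum(evidence.values()),
            })
    # Rank by number of linked behaviors first, then by total score.
    cases.sort(key=lambda c: (len(c["behaviors"]), c["score"]), reverse=True)
    return cases


if __name__ == "__main__":
    sample = [
        ("host-a", "reconnaissance", 0.9),
        ("host-a", "collection", 0.5),
        ("host-a", "exfiltration", 0.7),
        ("host-b", "exfiltration", 0.95),  # a single behavior is not linked to anything
    ]
    for case in link_behaviors(sample):
        print(case["host"], case["behaviors"])
```

In this toy setup a host with one strong but isolated signal (for example, a large upload that is business as usual) produces no case, while a host showing reconnaissance and collection can surface before any exfiltration is seen.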
