Uncovering insider threats from the digital footprints of individuals

We present a system that detects anomalous, and ultimately malevolent, behavior from the digital footprints that people leave within an institution. Tripwire approaches based on single features cannot adequately distinguish normal but unpredictable activity from truly counterproductive behavior. For example, a sequence of copying and sending small amounts of data can easily elude a pure single-feature tripwire. Here, we combine semantic knowledge with data mining methods. Our system uses a multi-layer architecture in which many aspects of a person's behavior are quantified and then fused by a large-scale Markovian Bayesian network for anomaly detection. Our evaluation is based on workstation activity data collected inside a corporation from 5,500 people who are assumed to be non-malicious. An outside team augmented this data so that some of the 5,500 individuals (the perpetrators) act maliciously in the augmented data. Our system presents the 5,500 people as a ranked list, with those most likely to be acting maliciously at the top. It places the perpetrators within the top 2% of the ranked list, whereas a purely statistical method places them within the top 25%. Our scalable infrastructure supports plug-and-play of different analytics and maintains the provenance of results.
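
To make the tripwire limitation concrete, the following minimal sketch contrasts a single-feature threshold with a ranking built from several behavioral features. It is an illustration under stated assumptions, not the system described above: the names, the activity data, the 100 MB threshold, and the z-score sum used for fusion are all hypothetical, and the z-score sum is only a simple stand-in for the Markovian Bayesian network.

```python
from statistics import mean, pstdev

# Hypothetical data: sizes (in MB) of each copy-and-send event per person.
events = {
    "alice": [2, 3, 1],
    "bob": [4, 2],
    "carol": [150],      # one large transfer
    "dave": [5] * 40,    # many small transfers, each harmless on its own
    "erin": [1, 2, 2, 3],
}

TRIPWIRE_MB = 100  # assumed single-feature threshold, chosen for illustration


def tripwire(sizes):
    """Single-feature tripwire: fire if any one transfer exceeds the threshold."""
    return any(size > TRIPWIRE_MB for size in sizes)


def zscores(values):
    """Standardize one feature across the population."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def rank_people(events):
    """Rank people by a fused score over several features (most anomalous first).

    Fusion here is a plain sum of positive z-scores, which is an assumption
    made for this sketch and not the fusion model used by the system.
    """
    people = list(events)
    features = [
        [max(events[p]) for p in people],  # largest single transfer
        [len(events[p]) for p in people],  # number of transfers
        [sum(events[p]) for p in people],  # total volume transferred
    ]
    standardized = [zscores(f) for f in features]
    scores = {
        p: sum(max(column[i], 0.0) for column in standardized)
        for i, p in enumerate(people)
    }
    return sorted(people, key=scores.get, reverse=True)


if __name__ == "__main__":
    for person, sizes in events.items():
        print(person, "tripwire fired:", tripwire(sizes))
    print("ranked:", rank_people(events))
```

In this toy data the tripwire fires only for the single large transfer, while the person who sends many small transfers passes unnoticed; the fused ranking places both at the top of the list because it also accounts for transfer count and total volume.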