Detecting Insider Threats Using RADISH: A System for Real-Time Anomaly Detection in Heterogeneous Data Streams

We present a scalable system for high-throughput real-time analysis of heterogeneous data streams. Our architecture enables incremental development of models for predictive analytics and anomaly detection as data arrives into the system. In contrast with batch data-processing systems, such as Hadoop, that can have high latency, our architecture allows for ingest and analysis of data on the fly, thereby detecting and responding to anomalous behavior in near real time. This timeliness is important for applications such as insider threat, financial fraud, and network intrusions. We demonstrate an application of this system to the problem of detecting insider threats, namely, the misuse of an organization's resources by users of the system and present results of our experiments on a publicly available insider threat dataset.