MultiAspectSpotting: Spotting Anomalous Behavior within Count Data Using Tensor

Methods for finding anomalous behaviors are attracting much attention, especially for very large datasets that have several attributes, each with tens of thousands of categorical values. For example, security engineers try to find anomalous behaviors, i.e., remarkable attacks that differ greatly from the day’s trend of attacks, on the basis of intrusion detection system (IDS) logs containing source IPs, destination IPs, port numbers, and additional information. However, a large number of abnormal records are caused by noise. Such records can be repeated even more abnormally than those caused by anomalous behaviors, and the two kinds are hard to distinguish from each other. To tackle these difficulties, we propose a two-step anomaly detection method. First, we detect abnormal records as individual anomalies by using statistical anomaly detection, which can be improved by Poisson Tensor Factorization. Next, we gather the individual anomalies into groups of records with similar attribute values, which can be implemented by CANDECOMP/PARAFAC (CP) decomposition. We conduct experiments on datasets with injected synthetic anomalies and show that our method can spot anomalous behaviors effectively. Moreover, our method can spot interesting patterns in real-world datasets such as IDS logs and web-access logs.
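
The two-step procedure described above can be illustrated with a minimal sketch. It is not the paper's implementation, and every name, threshold, and data value in it is a hypothetical choice made for illustration. As assumptions, step one scores each cell of a three-way count tensor (source, destination, port) with a Poisson upper-tail probability against a rank-1 baseline (the product of per-attribute marginals, used here as a simple stand-in for a full Poisson Tensor Factorization), and step two extracts a single rank-1 CP component from the binary tensor of flagged cells by alternating power-iteration updates.

```python
import math
import numpy as np

def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), via one minus the lower tail."""
    if k <= 0:
        return 1.0
    lower = sum(math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
                for i in range(k))
    return max(0.0, 1.0 - lower)

def rank1_expectation(counts):
    """Rank-1 baseline: outer product of mode marginals, scaled to the total."""
    total = counts.sum()
    expected = np.full(counts.shape, float(total))
    for mode in range(counts.ndim):
        other_axes = tuple(a for a in range(counts.ndim) if a != mode)
        shape = [1] * counts.ndim
        shape[mode] = -1
        expected = expected * (counts.sum(axis=other_axes) / total).reshape(shape)
    return expected

def individual_anomalies(counts, alpha=1e-3):
    """Step 1: flag cells whose count is improbably high under the baseline."""
    expected = rank1_expectation(counts)
    flags = np.zeros(counts.shape)
    for idx in zip(*np.nonzero(counts)):
        if poisson_tail(int(counts[idx]), expected[idx]) < alpha:
            flags[idx] = 1.0
    return flags

def rank1_cp(tensor, iters=50):
    """Step 2: one CP component of a 3-way tensor by alternating updates."""
    a, b, c = (np.ones(n) / math.sqrt(n) for n in tensor.shape)
    for _ in range(iters):
        a = np.einsum('ijk,j,k->i', tensor, b, c)
        a /= np.linalg.norm(a)
        b = np.einsum('ijk,i,k->j', tensor, a, c)
        b /= np.linalg.norm(b)
        c = np.einsum('ijk,i,j->k', tensor, a, b)
        c /= np.linalg.norm(c)
    return a, b, c

def group_members(factors, ratio=0.5):
    """Attribute values whose loading is at least `ratio` of the mode's maximum."""
    return [np.nonzero(f > ratio * f.max())[0].tolist() for f in factors]

# Toy data (hypothetical): uniform background, one coordinated block of
# records, and one isolated noisy cell with a far larger count.
counts = np.ones((6, 6, 6), dtype=int)
counts[0:2, 0:2, 0] = 30      # two sources hitting two destinations on one port
counts[5, 5, 5] = 200         # isolated noise: very abnormal, but a single cell

flags = individual_anomalies(counts)
group = group_members(rank1_cp(flags))
print(int(flags.sum()), group)
```

In this toy setting the isolated noisy cell is flagged in step one, since its count is highly improbable, but it is left out of the group recovered in step two, which contains only the attribute values shared by the coordinated block. A real dataset would need a sparse tensor representation and more than one component.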