Generalized isolation forest for anomaly detection

This letter introduces a generalization of Isolation Forest (IF) based on the existing Extended IF (EIF). EIF has shown advantages over IF, being for instance more robust to some artefacts. However, some information can be lost when computing the EIF trees, since the sampled threshold might lead to empty branches. This letter introduces a generalized isolation forest algorithm, called Generalized IF (GIF), to overcome this issue. GIF is faster than EIF with similar performance, as shown in several simulation results on reference databases used for anomaly detection. © 2021 Elsevier Ltd. All rights reserved.
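The empty-branch problem mentioned above can be illustrated with a small simulation. This is a minimal sketch, not the authors' implementation: the EIF-style split assumes a Gaussian normal vector and an intercept drawn uniformly in the bounding box of the data, and the second split (threshold drawn within the range of the projected data) is only an assumption about how such empty branches might be avoided, since the abstract does not describe GIF's procedure. All function names are hypothetical.

```python
import numpy as np

def eif_style_split(X, rng):
    # Assumed EIF-style split: random normal vector, intercept drawn
    # uniformly inside the axis-aligned bounding box of the data.
    n = rng.normal(size=X.shape[1])
    p = rng.uniform(X.min(axis=0), X.max(axis=0))
    return (X - p) @ n <= 0.0  # True = left branch

def projected_range_split(X, rng):
    # Assumed alternative: project the data on a random unit vector and
    # draw the threshold between the smallest and largest projection,
    # so each branch receives at least one point.
    n = rng.normal(size=X.shape[1])
    n /= np.linalg.norm(n)
    proj = X @ n
    t = rng.uniform(proj.min(), proj.max())
    return proj <= t

def empty_branch_rate(split_fn, X, trials, seed):
    # Fraction of random splits that send every point to the same branch.
    rng = np.random.default_rng(seed)
    empty = 0
    for _ in range(trials):
        left = split_fn(X, rng)
        if left.all() or not left.any():
            empty += 1
    return empty / trials

# Synthetic data lying close to a diagonal: its bounding box is mostly empty.
data_rng = np.random.default_rng(0)
u = data_rng.uniform(0.0, 1.0, size=200)
X = np.column_stack([u, u + 0.01 * data_rng.normal(size=200)])

eif_rate = empty_branch_rate(eif_style_split, X, trials=2000, seed=1)
gif_rate = empty_branch_rate(projected_range_split, X, trials=2000, seed=1)
print(f"empty-branch rate, bounding-box intercept: {eif_rate:.3f}")
print(f"empty-branch rate, projected-range threshold: {gif_rate:.3f}")
```

On data that fills only a small part of its bounding box, a hyperplane through a bounding-box point can miss the data entirely, which wastes a tree level without isolating anything. Drawing the threshold within the projected range rules this out by construction.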
