PIF: Anomaly detection via preference embedding

We address the problem of detecting anomalies with respect to structured patterns. To this end, we propose a novel anomaly detection method called PIF, which combines the advantages of adaptive isolation methods with the flexibility of preference embedding. Specifically, we embed the data in a high-dimensional preference space, where an efficient tree-based method, PI-Forest, is employed to compute an anomaly score. Experiments on synthetic and real datasets demonstrate that PIF compares favorably with state-of-the-art anomaly detection techniques, and confirm that PI-Forest is better at measuring arbitrary distances and isolating points in the preference space.
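As a rough illustration of the two stages described above, the following minimal sketch embeds 2D points in a preference space and then scores them there. Everything in it is an assumption made for illustration: the abstract does not specify the model family, the preference function, or the internals of PI-Forest. The sketch borrows a T-Linkage-style construction (random line hypotheses, preferences of the form exp(-r/tau) truncated at 5*tau, Tanimoto distance) and, in place of PI-Forest, uses a plain mean k-nearest-neighbour distance as the anomaly score. All function names and parameters (`preference_embedding`, `knn_anomaly_scores`, `n_models`, `tau`, `k`) are hypothetical.

```python
import math
import random


def line_through(p, q):
    """Return (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    a, b = q[1] - p[1], p[0] - q[0]
    norm = math.hypot(a, b)
    return a / norm, b / norm, -(a * p[0] + b * p[1]) / norm


def preference_embedding(points, n_models=200, tau=0.05, seed=0):
    """Map each 2D point to a vector of preferences for random line models.

    Assumes `points` is a list containing at least two distinct points.
    """
    rng = random.Random(seed)
    models = []
    while len(models) < n_models:
        p, q = rng.sample(points, 2)  # minimal sample set for a line
        if p != q:  # skip degenerate pairs caused by duplicate points
            models.append(line_through(p, q))
    embedding = []
    for x, y in points:
        row = []
        for a, b, c in models:
            r = abs(a * x + b * y + c)  # point-to-line residual
            # Assumed preference function: soft vote, cut off at 5 * tau.
            row.append(math.exp(-r / tau) if r < 5 * tau else 0.0)
        embedding.append(row)
    return embedding


def tanimoto_distance(u, v):
    """Tanimoto distance between two preference vectors."""
    dot = sum(s * t for s, t in zip(u, v))
    denom = sum(s * s for s in u) + sum(t * t for t in v) - dot
    # Undefined for two all-zero vectors; treated here as maximally distant.
    return 1.0 - dot / denom if denom > 0 else 1.0


def knn_anomaly_scores(embedding, k=5):
    """Stand-in score (NOT PI-Forest): mean distance to the k nearest points."""
    scores = []
    for i, u in enumerate(embedding):
        dists = sorted(
            tanimoto_distance(u, v) for j, v in enumerate(embedding) if j != i
        )
        scores.append(sum(dists[:k]) / k)
    return scores
```

Under these assumptions, points lying on a shared structure vote for many of the same models and therefore sit close together in the preference space, while points that fit no structure share few preferences with anyone and receive a high score. A call such as `knn_anomaly_scores(preference_embedding(points))` returns one score per input point, with higher values indicating more anomalous points. The pairwise scoring is quadratic in the number of points, which is part of what an efficient tree-based method would be expected to avoid.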

[38]  Nenad Stojanovic,et al.  Big-data-driven anomaly detection in industry (4.0): An approach and a case study , 2016, 2016 IEEE International Conference on Big Data (Big Data).