A One-Class Decision Tree Based on Kernel Density Estimation

One-Class Classification (OCC) is a branch of machine learning in which a model is trained using samples from a single class only. The present work aims to develop a one-class model that addresses both performance and readability concerns. To this end, we propose a hybrid OCC method that embeds density estimation within a tree-based learning algorithm. Following a greedy, recursive approach, our proposal uses kernel density estimation to split a data subset on the basis of one or several intervals of interest. Our method compares favorably with common methods from the literature on a range of benchmark datasets.
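
As a rough illustration of this kind of scheme, the sketch below recursively fits a one-dimensional kernel density estimate on a feature of the in-class sample, keeps the intervals where the estimated density is high, and recurses on the points falling inside those intervals; new points landing outside every learned interval are flagged as outliers. The feature-selection rule, the density threshold, the stopping criteria, and all function names are illustrative assumptions, not the authors' exact algorithm.

```python
# Minimal sketch of a KDE-driven one-class tree (illustrative assumptions throughout).
import numpy as np
from sklearn.neighbors import KernelDensity

def density_intervals(values, bandwidth=0.5, threshold_ratio=0.1, grid_size=200):
    """Return (low, high) intervals where the 1-D KDE of `values` exceeds a
    fraction of its maximum (the threshold rule is an assumption)."""
    grid = np.linspace(values.min(), values.max(), grid_size).reshape(-1, 1)
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(values.reshape(-1, 1))
    density = np.exp(kde.score_samples(grid))
    mask = density >= threshold_ratio * density.max()
    # Group consecutive above-threshold grid points into intervals of interest.
    intervals, start = [], None
    for i, keep in enumerate(mask):
        if keep and start is None:
            start = grid[i, 0]
        elif not keep and start is not None:
            intervals.append((start, grid[i - 1, 0]))
            start = None
    if start is not None:
        intervals.append((start, grid[-1, 0]))
    return intervals

def build_tree(X, depth=0, max_depth=3, min_samples=10):
    """Greedily and recursively split the one-class sample along KDE intervals."""
    if depth >= max_depth or len(X) < min_samples:
        return {"leaf": True, "bounds": (X.min(axis=0), X.max(axis=0))}
    feature = depth % X.shape[1]  # round-robin feature choice (simplifying assumption)
    children = []
    for low, high in density_intervals(X[:, feature]):
        subset = X[(X[:, feature] >= low) & (X[:, feature] <= high)]
        if len(subset) > 0:
            children.append({"interval": (low, high),
                             "child": build_tree(subset, depth + 1, max_depth, min_samples)})
    return {"leaf": False, "feature": feature, "children": children}

def predict_one(tree, x):
    """Return 1 if x falls inside the learned dense regions, else -1 (outlier)."""
    if tree["leaf"]:
        lo, hi = tree["bounds"]
        return 1 if np.all(x >= lo) and np.all(x <= hi) else -1
    for branch in tree["children"]:
        low, high = branch["interval"]
        if low <= x[tree["feature"]] <= high:
            return predict_one(branch["child"], x)
    return -1

# Usage: train on normal samples only, then score new points.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
tree = build_tree(X_train)
print(predict_one(tree, np.array([0.1, -0.2])))  # likely  1 (inlier)
print(predict_one(tree, np.array([6.0, 6.0])))   # likely -1 (outlier)
```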
