One-class classifiers based on entropic spanning graphs

One-class classifiers offer valuable tools for assessing the presence of outliers in data. In this paper, we propose a design methodology for one-class classifiers based on entropic spanning graphs. Our approach also handles nonnumeric data by means of an embedding procedure. The spanning graph is learned on the embedded input data, and the resulting partition of vertices defines the classifier. The final partition is derived by exploiting a criterion based on mutual information minimization. Here, we compute the mutual information by using a convenient formulation provided in terms of the α-Jensen difference. Once training is completed, a graph-based fuzzy model is constructed in order to associate a confidence level with the classifier decision. The fuzzification process is based only on topological information of the vertices of the entropic spanning graph. As such, the proposed one-class classifier is also suitable for data characterized by complex geometric structures. We provide experiments on well-known benchmarks containing both feature vectors and labeled graphs. In addition, we apply the method to the protein solubility recognition problem by considering several representations for the input samples. Experimental results demonstrate the effectiveness and versatility of the proposed method with respect to other state-of-the-art approaches.
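To make the role of the entropic spanning graph more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Euclidean minimum spanning tree as the entropic spanning graph, an edge exponent `gamma = 1`, and the standard MST-based Rényi α-entropy estimator with its additive constant omitted (that constant cancels in the α-Jensen difference when both samples share the same dimension and `gamma`). The function names `mst_length`, `renyi_entropy_mst`, and `alpha_jensen_difference` are hypothetical and chosen only for illustration.

```python
import math
import random


def mst_length(points, gamma=1.0):
    """Sum of (Euclidean edge length ** gamma) over a minimum spanning tree.

    Uses a simple O(n^2) Prim's algorithm on the complete graph.
    """
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge connecting each vertex to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # Pick the vertex outside the tree with the cheapest connecting edge.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u] ** gamma
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


def renyi_entropy_mst(points, gamma=1.0):
    """MST-based Renyi alpha-entropy estimate, up to an additive constant.

    alpha = (d - gamma) / d, where d is the data dimension (assumes d > gamma
    and at least two distinct points).
    """
    n = len(points)
    d = len(points[0])
    alpha = (d - gamma) / d
    return (d / gamma) * math.log(mst_length(points, gamma) / n ** alpha)


def alpha_jensen_difference(x, y, gamma=1.0):
    """Entropy of the pooled sample minus the weighted entropies of x and y."""
    p = len(x) / (len(x) + len(y))
    pooled = list(x) + list(y)
    return renyi_entropy_mst(pooled, gamma) - (
        p * renyi_entropy_mst(x, gamma) + (1 - p) * renyi_entropy_mst(y, gamma)
    )


def uniform_square(rng, n, offset=0.0):
    """Draw n points uniformly from the unit square shifted by `offset`."""
    return [(offset + rng.random(), offset + rng.random()) for _ in range(n)]


rng = random.Random(0)
a = uniform_square(rng, 60)
b_same = uniform_square(rng, 60)             # same distribution as a
b_far = uniform_square(rng, 60, offset=10)   # well separated from a

diff_same = alpha_jensen_difference(a, b_same)
diff_far = alpha_jensen_difference(a, b_far)
```

In this sketch, `diff_far` should come out clearly larger than `diff_same`: pooling two well-separated samples raises the estimated entropy far more than pooling two samples drawn from the same distribution. This is the kind of divergence signal on which a partitioning criterion can be built.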
