Data-Fusion Techniques for Open-Set Recognition Problems

Most pattern classification techniques address closed-set problems, in which a classifier is trained with samples of all classes that may appear during the testing phase. In many situations, however, samples of unknown classes, i.e., classes that had no examples during the training stage, need to be properly handled during testing. This setup is referred to in the literature as open-set recognition. Open-set problems are harder because some classes might be ill-sampled, not sampled at all, or even undefined. Unlike the existing literature, here we aim at solving open-set recognition problems by combining different classifiers and features while, at the same time, taking care of unknown classes. Researchers have greatly benefited from combining different methods to achieve more robust and reliable classifiers in challenging recognition conditions, but those solutions have often focused on closed-set setups. In this paper, we propose the integration of a newly designed open-set graph-based optimum-path forest (OSOPF) classifier with genetic programming (GP) and majority-voting fusion techniques. While OSOPF learns decision boundaries that are more resilient to unknown classes and outliers, GP combines different problem features to discover appropriate similarity functions, enabling more robust classification through early fusion. Finally, the majority-voting approach combines evidence from different classifier outcomes and features through late fusion. Experiments show that the proposed data-fusion approaches yield effective results for open-set recognition problems, significantly outperforming existing counterparts in the literature and paving the way for further investigations in this field.
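
As a rough illustration of late fusion by majority voting in an open-set setting, the following minimal sketch fuses the labels predicted by several classifiers for one test sample. It is not the method proposed in the paper: the `UNKNOWN` label, the `majority_vote` function name, and the rule of falling back to "unknown" when no label reaches a strict majority are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical label meaning "none of the classes seen during training".
UNKNOWN = -1


def majority_vote(predictions, unknown=UNKNOWN):
    """Fuse per-classifier labels for a single sample.

    Returns the label chosen by a strict majority of classifiers.
    If no label reaches a strict majority (an assumed tie-breaking
    rule), the sample is treated as belonging to an unknown class.
    """
    if not predictions:
        return unknown
    label, count = Counter(predictions).most_common(1)[0]
    # Strict majority: more than half of the votes agree on one label.
    if 2 * count > len(predictions):
        return label
    return unknown


if __name__ == "__main__":
    # Three hypothetical classifiers, each possibly using a different feature.
    print(majority_vote([1, 1, 2]))               # two of three agree on class 1
    print(majority_vote([1, 2, 3]))               # no agreement: unknown
    print(majority_vote([UNKNOWN, UNKNOWN, 2]))   # majority rejects the sample
```

Because each classifier may itself output the unknown label, a sample can be rejected either by explicit agreement among the classifiers or by the lack of any majority.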
