Subspace-Based Dynamic Selection: A Proof of Concept Using Protein Microarray Data

Traditional dynamic selection methods fail to perform effectively as data dimensionality increases. In addition, those methods provide no insight into which features matter, because the regions of competence used for classification are always defined over the same set of features. In this paper, we propose a two-stage framework based on subspace clustering with a Gaussian-based estimator, followed by a k-nearest-subspace search mechanism, to overcome these limitations of dynamic selection. The use of subspaces allows regions of competence to differ both in the number of instances and in dimensionality. Our hypothesis is that the framework achieves results comparable to state-of-the-art dynamic selection while producing a model that helps explain the importance of feature sets for the patterns found in the data. We test our approach on high-dimensional protein microarray data of insect bite hypersensitivity in horses. Results show that our approach is comparable in accuracy to traditional dynamic selection methods, and that it additionally facilitates interpretation of feature importance for each class in the dataset.
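
The sketch below illustrates the two-stage idea in broad strokes, under assumptions not spelled out in the abstract: scikit-learn's GaussianMixture stands in for the Gaussian-based clustering step, each cluster's subspace is approximated by its highest-variance features, and the nearest-subspace search is replaced by the mixture's own cluster assignment. It is not the paper's implementation, only a minimal illustration of clustering into feature subspaces and then dispatching a query to a local expert.

```python
# Minimal sketch of a two-stage subspace-based dynamic selection pipeline.
# Assumptions (not from the paper): GaussianMixture as the Gaussian-based
# estimator, top-variance features as each cluster's subspace, and the
# mixture's cluster assignment as a stand-in for k-nearest-subspace search.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.mixture import GaussianMixture
from sklearn.tree import DecisionTreeClassifier

# Toy high-dimensional data standing in for the protein microarray set.
X, y = make_classification(n_samples=200, n_features=100, n_informative=10,
                           random_state=0)

# Stage 1: cluster the training data; for each cluster keep the features
# with the highest variance as its subspace and fit a local expert on it.
gmm = GaussianMixture(n_components=5, random_state=0).fit(X)
labels = gmm.predict(X)
subspaces, experts = {}, {}
for c in np.unique(labels):
    Xc, yc = X[labels == c], y[labels == c]
    feats = np.argsort(Xc.var(axis=0))[-10:]   # cluster-specific feature subset
    subspaces[c] = feats
    experts[c] = DecisionTreeClassifier(random_state=0).fit(Xc[:, feats], yc)

# Stage 2: route a query to its nearest cluster and classify it with that
# cluster's expert, restricted to the cluster's own subspace.
def predict(x):
    c = int(gmm.predict(x.reshape(1, -1))[0])
    return experts[c].predict(x.reshape(1, -1)[:, subspaces[c]])[0]

print(predict(X[0]))
```

Because each cluster carries its own feature subset, inspecting `subspaces[c]` gives a direct, per-region view of which features drive classification, which is the interpretability benefit the abstract claims over fixed-feature regions of competence.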

[1]  Luiz Eduardo Soares de Oliveira,et al.  Dynamic selection of classifiers - A comprehensive review , 2014, Pattern Recognit..

[2]  Lihi Zelnik-Manor,et al.  Approximate Nearest Subspace Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Xiaoyi Jiang,et al.  A dynamic classifier ensemble selection approach for noise data , 2010, Inf. Sci..

[4]  Hans-Peter Kriegel,et al.  Clustering high-dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering , 2009, TKDD.

[5]  Fernando José Von Zuben,et al.  Microarray Feature Selection and Dynamic Selection of Classifiers for Early Detection of Insect Bite Hypersensitivity in Horses , 2019, 2019 IEEE Congress on Evolutionary Computation (CEC).

[6]  Fabio Roli,et al.  Dynamic classifier selection based on multiple classifier behaviour , 2001, Pattern Recognit..

[7]  Robert Sabourin,et al.  Dynamic selection approaches for multiple classifier systems , 2011, Neural Computing and Applications.

[8]  Ira Assent,et al.  Evaluating Clustering in Subspace Projections of High Dimensional Data , 2009, Proc. VLDB Endow..

[9]  Donald F. Conrad,et al.  A Protein Allergen Microarray Detects Specific IgE to Pollen Surface, Cytoplasmic, and Commercial Allergen Extracts , 2010, PloS one.

[10]  Fabio Roli,et al.  Methods for dynamic classifier selection , 1999, Proceedings 10th International Conference on Image Analysis and Processing.

[11]  Anne M. P. Canuto,et al.  Using Accuracy and Diversity to Select Classifiers to Build Ensembles , 2006, The 2006 IEEE International Joint Conference on Neural Network Proceedings.

[12]  G. Figueredo,et al.  Novel in vitro diagnosis of equine allergies using a protein array and mathematical modelling approach: a proof of concept using insect bite hypersensitivity. , 2015, Veterinary immunology and immunopathology.

[13]  Wenjia Wang,et al.  Dynamic ensemble selection methods for heterogeneous data mining , 2016, 2016 12th World Congress on Intelligent Control and Automation (WCICA).

[14]  Huan Liu,et al.  Subspace clustering for high dimensional data: a review , 2004, SKDD.

[15]  Daniel A. Keim,et al.  Subspace Nearest Neighbor Search - Problem Statement, Approaches, and Discussion - Position Paper , 2015, SISAP.

[16]  Marek Kurzynski,et al.  A probabilistic model of classifier competence for dynamic ensemble selection , 2011, Pattern Recognit..

[17]  Kevin W. Bowyer,et al.  Combination of multiple classifiers using local accuracy estimates , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[18]  Marek Kurzynski,et al.  On a New Measure of Classifier Competence Applied to the Design of Multiclassifier Systems , 2009, ICIAP.

[19]  Alexandros Nanopoulos,et al.  Hubs in Space: Popular Nearest Neighbors in High-Dimensional Data , 2010, J. Mach. Learn. Res..

[20]  George D. C. Cavalcanti,et al.  Dynamic ensemble selection VS K-NN: Why and when dynamic selection obtains higher classification performance? , 2017, 2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA).

[21]  Bartlomiej Antosik,et al.  New Measures of Classifier Competence - Heuristics and Application to the Design of Multiple Classifier Systems , 2011, Computer Recognition Systems 4.

[22]  Marek Kurzynski,et al.  A measure of competence based on random classification for dynamic ensemble selection , 2012, Inf. Fusion.

[23]  Jin Tian,et al.  Subspace Clustering Based on Self-organizing Map , 2019 .

[24]  Michel Verleysen,et al.  The Curse of Dimensionality in Data Mining and Time Series Prediction , 2005, IWANN.

[25]  Robert Sabourin,et al.  From dynamic classifier selection to dynamic ensemble selection , 2008, Pattern Recognit..

[26]  Amar Mitiche,et al.  Classifier combination for hand-printed digit recognition , 1993, Proceedings of 2nd International Conference on Document Analysis and Recognition (ICDAR '93).

[27]  Sharmila. M. Shinde,et al.  A Survey on Ensemble Methods for High Dimensional Data Classification in Biomedicine Field , 2015 .

[28]  Fei Wang,et al.  Deep learning for healthcare: review, opportunities and challenges , 2018, Briefings Bioinform..

[29]  George D. C. Cavalcanti,et al.  Dynamic classifier selection: Recent advances and perspectives , 2018, Inf. Fusion.

[30]  Ludmila I. Kuncheva,et al.  Switching between selection and fusion in combining classifiers: an experiment , 2002, IEEE Trans. Syst. Man Cybern. Part B.

[31]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[32]  George D. C. Cavalcanti,et al.  META-DES: A dynamic ensemble selection framework using meta-learning , 2015, Pattern Recognit..

[33]  Luiz Eduardo Soares de Oliveira,et al.  Contribution of data complexity features on dynamic classifier selection , 2016, 2016 International Joint Conference on Neural Networks (IJCNN).

[34]  Paul C. Smits,et al.  Multiple classifier systems for supervised remote sensing image classification based on dynamic classifier selection , 2002, IEEE Trans. Geosci. Remote. Sens..

[35]  Anne M. P. Canuto,et al.  Empirical comparison of Dynamic Classifier Selection methods based on diversity and accuracy for building ensembles , 2008, 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence).