Microarray Feature Selection and Dynamic Selection of Classifiers for Early Detection of Insect Bite Hypersensitivity in Horses

Microarrays can be used to better characterise allergies because they allow interactions between antibodies and allergens in mammals to be monitored. Once the joint dynamics of these elements in both healthy and diseased animals are understood, a model can be defined to predict the likelihood of an individual having allergic reactions. We investigate the potential use of Dynamic Selection (DS) methods to classify protein microarray data, with a case study of equine insect bite hypersensitivity (IBH). To the best of our knowledge, DS has not yet been applied to this type of data. Because most microarray datasets contain few samples, we hypothesise that DS models will produce satisfactory results, given their ability to outperform traditional ensemble techniques on similar data. We focus on three research questions: 1) What is the potential of DS for microarray data classification, and how does it compare with the results of classical classification methods? 2) How do DS methods perform on the IBH dataset? 3) Does feature selection improve DS performance on these data? A wrapper method using backward elimination, embedded with a regularized extreme learning machine, is adopted to identify the most relevant features influencing the onset of the disease. Results from traditional classifiers are compared with those of 21 different DS methods before and after feature selection. Our results indicate that DS methods do not outperform single and static classifiers on this high-dimensional dataset, and that their performance does not improve after feature selection.
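
To make the feature selection step concrete, the following is a minimal sketch of a backward-elimination wrapper around a regularized extreme learning machine. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the hidden-layer size, activation function, regularization strength, validation scheme, or stopping rule, so the values and the helper names (`elm_fit`, `elm_predict`, `holdout_accuracy`, `backward_elimination`) are hypothetical, and the data are synthetic.

```python
import numpy as np

def elm_fit(X, y, n_hidden=20, reg=1.0, seed=0):
    """Fit a regularized ELM: fixed random hidden layer, ridge-solved output weights.
    n_hidden, reg and the tanh activation are assumptions made for this sketch."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    t = 2.0 * y - 1.0                            # map labels {0, 1} to targets {-1, +1}
    # Ridge solution for the output weights: (H^T H + reg * I) beta = H^T t
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ t)
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return (np.tanh(X @ W + b) @ beta > 0).astype(int)

def holdout_accuracy(X_tr, y_tr, X_va, y_va, features):
    """Train on the chosen feature columns and score on a held-out split."""
    model = elm_fit(X_tr[:, features], y_tr)
    return float(np.mean(elm_predict(model, X_va[:, features]) == y_va))

def backward_elimination(X_tr, y_tr, X_va, y_va, min_features=1):
    """Repeatedly drop the feature whose removal gives the best held-out accuracy;
    stop as soon as every possible removal would lower the current accuracy."""
    selected = list(range(X_tr.shape[1]))
    best = holdout_accuracy(X_tr, y_tr, X_va, y_va, selected)
    while len(selected) > min_features:
        scores = [
            holdout_accuracy(X_tr, y_tr, X_va, y_va,
                             [f for f in selected if f != drop])
            for drop in selected
        ]
        i = int(np.argmax(scores))
        if scores[i] < best:
            break
        best = scores[i]
        selected.pop(i)
    return selected, best

# Synthetic stand-in data: 60 samples, 6 features, only the first two informative.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 6))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
X_tr, y_tr, X_va, y_va = X[:40], y[:40], X[40:], y[40:]

baseline = holdout_accuracy(X_tr, y_tr, X_va, y_va, list(range(X.shape[1])))
selected, best = backward_elimination(X_tr, y_tr, X_va, y_va)
print("kept features:", selected)
print("accuracy with all features:", baseline, "after elimination:", best)
```

A single hold-out split keeps the sketch short; with as few samples as typical microarray datasets provide, a cross-validated score inside the wrapper would likely be a more stable selection criterion.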
