Extreme Learning Machines for Multiclass Classification: Refining Predictions with Gaussian Mixture Models

This paper presents an extension of the well-known Extreme Learning Machines (ELMs). The main goal is to provide probabilities as outputs for multiclass classification problems, information that is more useful in practice than traditional crisp classification outputs. In summary, Gaussian Mixture Models are used as a post-processing step for ELMs. In that context, the proposed methodology retains the advantages of ELMs (low computational time and state-of-the-art performance) together with the ability of Gaussian Mixture Models to model probabilities. The methodology is tested on 3 toy examples and 3 real datasets. As a result, the overall performance of ELMs is slightly improved, and the probability outputs prove to be accurate and useful in practice.
