ELM-based gene expression classification with misclassification cost

Cost-sensitive learning is a crucial problem in machine learning research. Traditional classification assumes that misclassifying a sample of any category incurs the same cost, so the goal of the learning algorithm is to minimize the expected error rate. In cost-sensitive learning, the costs of misclassifying samples of different categories differ, and the goal is instead to minimize the total misclassification cost. Cost-sensitive learning meets the practical demands of real-life classification problems such as medical diagnosis and financial forecasting. Owing to its fast learning speed and strong performance, the extreme learning machine (ELM) has become one of the most effective classification algorithms, and the voting-based extreme learning machine (V-ELM) makes classification results more accurate and stable. However, V-ELM and other ELM variants all rest on the assumption that every misclassification carries the same cost, so they cannot handle cost-sensitive problems well. To overcome this drawback, an algorithm called cost-sensitive ELM (CS-ELM) is proposed, which introduces the misclassification cost of each sample into V-ELM. Experimental results on gene expression data show that CS-ELM is effective in reducing misclassification cost.
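To make the idea concrete, below is a minimal Python sketch of a cost-sensitive decision rule layered on a V-ELM-style ensemble: train several independent ELMs, treat the ensemble vote proportions as rough class-probability estimates, and predict the class that minimizes the expected misclassification cost under a user-supplied cost matrix. The names (`train_elm`, `cs_elm_predict`, `cost`) and the specific decision rule are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def train_elm(X, T, n_hidden, rng):
    """Train one ELM: random hidden layer, least-squares output weights."""
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)            # random hidden-layer activations
    beta = np.linalg.pinv(H) @ T      # Moore-Penrose pseudoinverse solution
    return W, b, beta

def elm_scores(X, model):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

def cs_elm_predict(X, models, cost):
    """Cost-sensitive vote: pick the class minimizing expected cost.

    cost[k, j] is the cost of predicting class j when the true class is k.
    Vote proportions across the ensemble stand in for class probabilities
    (an illustrative rule, not necessarily the paper's exact one).
    """
    n_classes = cost.shape[0]
    votes = np.zeros((X.shape[0], n_classes))
    for m in models:
        preds = elm_scores(X, m).argmax(axis=1)
        votes[np.arange(X.shape[0]), preds] += 1
    probs = votes / len(models)       # estimated P(true class = k | x)
    return (probs @ cost).argmin(axis=1)  # expected cost of predicting j

# Usage on toy data: missing class 1 (e.g. disease) costs 5x more.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = (X[:, 0] + 0.3 * rng.standard_normal(200) > 0).astype(int)
T = np.eye(2)[y]                      # one-hot targets
models = [train_elm(X, T, n_hidden=40, rng=rng) for _ in range(7)]
cost = np.array([[0.0, 1.0],          # true 0: predicting 1 costs 1
                 [5.0, 0.0]])         # true 1: predicting 0 costs 5
y_hat = cs_elm_predict(X, models, cost)
```

Relative to plain majority voting, the asymmetric cost matrix biases ties and near-ties toward the expensive-to-miss class, which is the behavior a cost-sensitive learner is meant to exhibit.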
