A New Belief-based Classification Fusion for Incomplete Data

Reducing the negative impact that estimating missing values in the training set has on classifier performance is one of the most challenging tasks in incomplete data classification. This paper proposes a new belief-based classification fusion method (BCF) for incomplete data. The core idea is to make full use of the available attributes of incomplete objects in the training set to improve the performance of the basic classifier, without resorting to deletion or estimation. Specifically, for a data set with $n$-dimensional attributes, the attributes are grouped into $p$ ($p \le n$) subsets according to prior knowledge or random combination. Then, $p$ basic classifiers (such as SVM) are trained on the complete objects of the corresponding $p$ training subsets, and an estimation strategy is used to fill in the incomplete objects in the test set. Finally, the Dempster-Shafer (DS) rule is used to fuse the $p$ sub-classification results if they do not conflict, and a new global fusion method is proposed to fuse the remaining conflicting sub-classification results. This global method can commit an object that is difficult to classify accurately into a singleton (special) class to a meta-class instead, which reduces the error rate and characterizes well the uncertainty caused by missing values. Our simulation results on real data sets illustrate the potential of the proposed method and show that BCF can substantially improve classification accuracy.
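To make the fusion step concrete, the following is a minimal sketch of Dempster's rule applied to two sub-classification results. It is an illustration under stated assumptions, not the authors' implementation: the function name `dempster_combine`, the class labels `A` and `B`, and the mass values are hypothetical. Each sub-classifier output is represented as a mass function over subsets of the class labels, where the set `{A, B}` plays the role of a meta-class. The paper's new global fusion method for conflicting results is not reproduced here; the sketch only reports the degree of conflict that such a method would have to handle.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of class labels to a mass,
    and the masses of each function sum to 1.
    Returns the normalized combined masses and the conflict mass.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence: the product mass goes to the intersection.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += wa * wb
    if conflict >= 1.0 - 1e-12:
        raise ValueError("Sources are in total conflict; the rule is undefined.")
    fused = {k: v / (1.0 - conflict) for k, v in combined.items()}
    return fused, conflict


# Hypothetical outputs of two basic classifiers trained on different
# attribute subsets (values chosen for illustration only).
A, B = frozenset({"A"}), frozenset({"B"})
AB = A | B  # meta-class: the object may belong to A or B

m1 = {A: 0.6, B: 0.3, AB: 0.1}
m2 = {A: 0.5, B: 0.2, AB: 0.3}

fused, conflict = dempster_combine(m1, m2)
decision = max(fused, key=fused.get)
print(sorted(decision), round(conflict, 2))
```

In this sketch the conflict mass is the total weight assigned to incompatible pairs of hypotheses. A fusion scheme in the spirit of BCF could compare it against a threshold to decide whether DS combination is appropriate or whether the conflicting results need separate treatment; that threshold and decision logic are assumptions here, not details taken from the paper.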
