A neural network classifier based on Dempster-Shafer theory

A new adaptive pattern classifier based on the Dempster-Shafer theory of evidence is presented. The method treats reference patterns (prototypes) as items of evidence regarding the class membership of each input pattern under consideration. This evidence is represented by basic belief assignments (BBAs) and pooled using Dempster's rule of combination. The procedure can be implemented in a multilayer neural network with a specific architecture consisting of one input layer, two hidden layers and one output layer. The weight vector, the receptive field and the class membership of each prototype are determined by minimizing the mean squared difference between the classifier outputs and the target values. After training, the classifier computes for each input vector a BBA that describes the uncertainty pertaining to the class of the current pattern, given the available evidence. This information may be used to implement various decision rules allowing for ambiguous pattern rejection and novelty detection. The outputs of several classifiers may also be combined in a sensor fusion context, yielding decision procedures that are very robust to sensor failures or changes in the system environment. Experiments with simulated and real data demonstrate the excellent performance of this classification scheme compared with existing statistical and neural network techniques.
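The pooling step described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `prototype_bba` and `dempster_combine`, the parameters `alpha` and `gamma`, and the exponentially decaying support function are assumptions chosen for illustration, since the abstract does not specify how a prototype's BBA is computed. Only the combination step itself is the standard Dempster's rule. The frame is assumed to contain at least two classes.

```python
import math
from itertools import product


def prototype_bba(distance, label, frame, alpha=0.9, gamma=1.0):
    """Hypothetical BBA induced by one prototype.

    Assumption (not taken from the abstract): the support for the
    prototype's class decays with squared distance, and the remaining
    mass goes to the whole frame, i.e. to ignorance.
    """
    support = alpha * math.exp(-gamma * distance ** 2)
    return {frozenset([label]): support, frozenset(frame): 1.0 - support}


def dempster_combine(m1, m2):
    """Combine two BBAs (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass that would fall on the empty set is counted as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("Sources are in total conflict; the rule is undefined.")
    # Normalize so that the combined masses sum to one.
    return {s: v / (1.0 - conflict) for s, v in combined.items()}


frame = {"a", "b"}
# Two hand-picked BBAs: one prototype supports class "a", the other class "b".
m1 = {frozenset(["a"]): 0.6, frozenset(frame): 0.4}
m2 = {frozenset(["b"]): 0.5, frozenset(frame): 0.5}
pooled = dempster_combine(m1, m2)

# The same pooling with distance-based BBAs from the hypothetical helper.
near = prototype_bba(0.5, "a", frame)
far = prototype_bba(2.0, "b", frame)
pooled_from_distances = dempster_combine(near, far)
```

In this sketch, the mass left on the whole frame after pooling quantifies how little the prototypes say about the input. A decision rule with rejection or novelty detection could, for example, compare that mass with a threshold, although the specific rules used in the paper are not given in the abstract.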
