A Normalized Probabilistic Expectation-Maximization Neural Network for Minimizing Bayesian Misclassification Cost Risk

Abstract: In this paper, we propose a normalized semi-supervised probabilistic expectation-maximization neural network (PEMNN) that minimizes Bayesian misclassification cost risk. Using simulated and real-world datasets, we compare the proposed PEMNN with a supervised cost-sensitive probabilistic neural network (PNN), discriminant analysis (DA), a mathematical integer programming (MIP) model, and support vector machines (SVM) under different misclassification cost asymmetries and class biases. The results of our experiments indicate that the PEMNN performs better than the competing techniques when class data distributions are normal or uniform. However, when the class data distribution is exponential, the performance of the PEMNN deteriorates, giving a slight advantage to the competing MIP, DA, PNN, and SVM techniques. For real-world data with non-parametric distributions and mixed decision-making attributes (continuous and categorical), the PEMNN outperforms the PNN.
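To make "minimizing Bayesian misclassification cost risk" concrete, the following minimal sketch shows the generic minimum-expected-cost decision rule from Bayesian decision theory. It is not the PEMNN itself: the function names, the cost matrix, and the posterior values are hypothetical and chosen only for illustration, and the posteriors are assumed to be supplied by some probabilistic model.

```python
def expected_costs(posteriors, cost):
    """Expected cost of each possible decision.

    posteriors[j] is P(class j | x); cost[i][j] is the cost of deciding
    class i when the true class is j (illustrative convention).
    """
    return [sum(c * p for c, p in zip(row, posteriors)) for row in cost]


def min_risk_class(posteriors, cost):
    """Return the decision with the lowest expected cost, plus all risks."""
    risks = expected_costs(posteriors, cost)
    best = min(range(len(risks)), key=risks.__getitem__)
    return best, risks


# Hypothetical asymmetric costs: wrongly deciding class 0 when the truth
# is class 1 costs five times as much as the opposite mistake.
cost = [[0.0, 5.0],
        [1.0, 0.0]]

# Hypothetical posteriors: class 0 is the more probable class.
decision, risks = min_risk_class([0.7, 0.3], cost)
print(decision, risks)
```

With these assumed numbers, the rule selects class 1 even though class 0 is more probable, because the asymmetric costs make deciding class 0 the riskier choice. This is the kind of trade-off that varying the misclassification cost asymmetry and class bias is meant to probe.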
