Investigating a Predictive Certainty Measure for Ensemble-Based HIV Classification Systems

This paper investigates whether the predictive certainty measure of an ensemble-based classifier correlates with its prediction accuracy. The predictive certainty measure is the percentage of the ensemble's outputs accounted for by the most dominant outcome among all possible outcomes. Three neural network ensemble classifiers were created using Bagging, Boosting and Bayesian methods. All three ensembles were used to classify a patient's HIV status from demographic variables obtained from an antenatal seroprevalence survey. All three ensembles perform equally well at HIV classification, but the ensemble obtained using the Bayesian training method is best suited to giving a relevant predictive certainty measure. The predictive certainty measures obtained for the Bagging and Boosting ensembles are not suitable as confidence measures because prediction accuracy is low for cases that have high predictive certainty. The Bayesian ensemble is therefore more suitable for making decisions.
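As an illustration of the measure described above, the following is a minimal sketch, not the authors' implementation. It assumes each ensemble member casts one discrete vote per case (here 1 for HIV positive, 0 for HIV negative); the function names `predictive_certainty` and `accuracy_by_certainty` and the example votes are hypothetical. The second function groups cases by certainty level so that accuracy at high certainty can be compared with accuracy at low certainty.

```python
from collections import Counter, defaultdict


def predictive_certainty(votes):
    """Return (dominant_outcome, certainty_percent) for one case.

    votes: list of class labels, one per ensemble member (assumed format).
    """
    if not votes:
        raise ValueError("votes must be non-empty")
    outcome, count = Counter(votes).most_common(1)[0]
    return outcome, 100.0 * count / len(votes)


def accuracy_by_certainty(all_votes, true_labels):
    """Map each observed certainty level to the accuracy of cases at that level."""
    hits = defaultdict(list)
    for votes, truth in zip(all_votes, true_labels):
        label, certainty = predictive_certainty(votes)
        hits[certainty].append(label == truth)
    return {c: sum(h) / len(h) for c, h in sorted(hits.items())}


if __name__ == "__main__":
    # Hypothetical votes from a 10-member ensemble for a single patient.
    votes = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
    print(predictive_certainty(votes))
```

A certainty measure is useful as a confidence measure only if the accuracy returned by `accuracy_by_certainty` tends to rise with the certainty level; the abstract reports that this was not the case for the Bagging and Boosting ensembles.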
