Hybrid ensembles and coincident-failure diversity

This paper presents an approach to constructing hybrid ensembles whose members come from two categories, trained neural networks and decision trees, in the hope of increasing diversity and reducing coincident failures. Diversity among ensemble members is generally recognised as a key factor in improving the overall performance of an ensemble. Evaluating the two heterogeneous member types with a number of diversity measures, we found a statistically low level of coincident-failure diversity among homogeneous members but a relatively high level of diversity between heterogeneous members. This provides a theoretical basis for constructing hybrid ensembles from members developed by different methodologies. The hybrid ensembles were built for a real-world problem: predicting training injury in military recruits. Their performance is evaluated in terms of diversity, reliability and prediction accuracy, and compared with that of homogeneous ensembles of neural networks or decision trees. The results indicate that the hybrid ensembles have a considerably higher level of diversity and are thus able to produce better performance.
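The abstract does not specify which diversity measures were used, but one standard formulation of coincident-failure diversity (CFD) over N classifiers can be sketched as follows. The function name, input layout, and the exact normalisation are assumptions here, not taken from the paper; the measure is 1 when failures never coincide (at most one classifier fails on any input) and 0 when all classifiers always fail together.

```python
from collections import Counter

def coincident_failure_diversity(failure_matrix):
    """One common formulation of coincident-failure diversity (CFD).

    failure_matrix: list of per-example lists of 0/1 flags, where 1
    means that classifier failed on that example (layout assumed here).
    Returns a value in [0, 1]: 1 when at most one classifier fails on
    any input, 0 when all classifiers always fail together.
    """
    m = len(failure_matrix)      # number of test examples
    n = len(failure_matrix[0])   # number of classifiers N
    # p[k] = proportion of examples on which exactly k classifiers fail
    counts = Counter(sum(row) for row in failure_matrix)
    p = [counts.get(k, 0) / m for k in range(n + 1)]
    if p[0] == 1.0:              # no failures at all: CFD conventionally 0
        return 0.0
    # Weight each failure count k by (N - k) / (N - 1), renormalised
    # over the examples on which at least one classifier fails.
    return sum((n - k) / (n - 1) * p[k] for k in range(1, n + 1)) / (1 - p[0])
```

For example, three classifiers whose failures are spread one per example give a CFD of 1.0, while classifiers that always fail simultaneously give 0.0, which matches the paper's intuition that hybrid members should fail on different inputs.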
