Stacking for Ensembles of Local Experts in Metabonomic Applications

Recently, ensembles of local experts have been applied successfully to the automatic detection of drug-induced organ toxicities from spectroscopic data. Composing a suitable ensemble requires an expert selection procedure that identifies the most relevant classifiers to integrate. However, this ensemble optimization has been observed to overfit the training data. To tackle this problem, we propose integrating a stacked classifier that operates on the outputs of the local experts and is optimized via cross-validation. To obtain probabilistic outputs from the Support Vector Machines used as local experts, we apply a sigmoidal fitting approach. An experimental evaluation on a challenging data set from safety pharmacology demonstrates the improved generalizability of the proposed approach.
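
The sigmoidal fitting step can be illustrated with a minimal sketch. It assumes the common Platt-style form P(y=1 | f) = 1 / (1 + exp(A·f + B)), where f is the decision value of one local expert. The abstract does not state the fitting procedure, so plain gradient descent on the negative log-likelihood is swapped in here for simplicity; the function names, the toy scores and labels, the learning rate, and the step count are all illustrative assumptions, not values from the paper.

```python
import math

def sigmoid_prob(f, a, b):
    """Map a decision value f to P(y=1 | f) = 1 / (1 + exp(a*f + b))."""
    z = a * f + b
    # Numerically stable evaluation for large |z|.
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))

def fit_sigmoid(scores, labels, lr=0.1, steps=5000):
    """Fit a and b by gradient descent on the mean negative log-likelihood.

    Assumption: plain gradient descent replaces the Newton-type solvers
    usually used for this fit; labels are 0 or 1.
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for f, y in zip(scores, labels):
            p = sigmoid_prob(f, a, b)
            # With p = 1 / (1 + exp(z)), d(NLL)/dz = y - p.
            grad_a += (y - p) * f
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

# Hypothetical decision values of one local expert on held-out samples.
# The two classes overlap slightly, so the fit stays finite.
scores = [-2.0, -1.5, -1.0, -0.5, 0.3, -0.2, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

a, b = fit_sigmoid(scores, labels)
probs = [sigmoid_prob(f, a, b) for f in scores]
print("A =", a, "B =", b)
print("calibrated outputs:", [round(p, 3) for p in probs])
```

In a stacking setup of the kind described above, the calibrated outputs of all local experts for a sample would form the input vector of the stacked classifier, with the decision values taken from cross-validation folds so that the sigmoid and the stacked classifier are not fitted on outputs the experts produced for their own training samples.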
