SVM ensemble classification of NMR spectra based on different configurations of data processing techniques

The early detection of drug-induced organ toxicities is one of the major goals in safety pharmacology. Automating this detection by classifying metabolic changes observed in 1H nuclear magnetic resonance (NMR) spectra improves the process. In this paper we propose an ensemble classification system based on support vector machines (SVMs) trained on diverse "views" of the data. These views are created by varying the preprocessing techniques, and the final classification is obtained by voting over an optimized selection of the experts. Results of an experimental evaluation on a challenging data set from industrial safety pharmacology show the effectiveness of the proposed approach with respect to the detection of drug-induced toxicity.
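The voting step described above can be illustrated with a minimal sketch. It assumes each expert (in the paper, an SVM trained on one preprocessing view) has already produced class labels for a validation set; the view names, the toy labels, and the exhaustive subset search are hypothetical stand-ins chosen for illustration, not the optimization procedure used in the paper.

```python
from collections import Counter
from itertools import combinations


def majority_vote(predictions):
    """Combine per-expert label lists (all the same length) by majority vote.

    Ties are broken in favour of the smallest label so the result is deterministic.
    """
    voted = []
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        # sorted() puts the smallest label first; max() keeps the first maximum
        voted.append(max(sorted(counts), key=lambda label: counts[label]))
    return voted


def accuracy(predicted, truth):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def select_experts(expert_predictions, truth):
    """Search all expert subsets and return the one with the best voted accuracy.

    Assumption: exhaustive search is only feasible for a handful of experts;
    it stands in here for whatever optimized selection is actually used.
    """
    names = sorted(expert_predictions)
    best_subset, best_score = None, -1.0
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            voted = majority_vote([expert_predictions[n] for n in subset])
            score = accuracy(voted, truth)
            if score > best_score:  # keep the first, smallest subset on ties
                best_subset, best_score = subset, score
    return best_subset, best_score


# Toy validation labels (1 = toxic, 0 = non-toxic); purely illustrative.
truth = [1, 0, 1, 0, 1, 0]

# Hypothetical experts, one per preprocessing "view" of the spectra.
expert_predictions = {
    "snv_binned": [1, 0, 1, 0, 0, 0],
    "raw_binned": [1, 1, 1, 0, 1, 0],
    "derivative": [0, 0, 1, 0, 1, 1],
    "scaled": [1, 0, 0, 0, 1, 0],
}

subset, score = select_experts(expert_predictions, truth)
print("selected experts:", subset, "validation accuracy:", score)
```

In this toy setting no single expert is perfect, yet a voted subset can classify every validation sample correctly, which is the motivation for combining diverse views rather than picking one preprocessing configuration.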
