Support vector machines for the discrimination of analytical chemical data: application to the determination of tablet production by pyrolysis-gas chromatography-mass spectrometry

This paper describes the application of support vector machines (SVMs) to analytical chemical data, exemplified by the determination of tablet production method using pyrolysis-gas chromatography-mass spectrometry. An approach is presented that combines SVMs with other chemometric tools, namely principal component analysis and discriminant analysis. Because SVMs generalize well, they are attractive when the training set is of limited size. With appropriate kernels, SVMs yield classifiers of varying complexity that can draw non-linear class boundaries suited to composite distributions. Principal component analysis and discriminant analysis based on the Mahalanobis distance are used in a stepwise procedure to extract and select meaningful features from the pyrolysis spectrum, which are then fed to various SVM classifiers. Results show that the two tablet production methods can be discriminated, with SVMs performing better than discriminant analysis on the dataset investigated.
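The workflow outlined above (principal component scores, feature selection by Mahalanobis distance, then SVM classifiers with different kernels) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the data are synthetic stand-ins for pyrolysis spectra, the library is scikit-learn, the selection step ranks components one at a time by a univariate Mahalanobis distance instead of the stepwise procedure used in the paper, and all function names and parameter values (`make_synthetic_spectra`, `mahalanobis_per_component`, `run_sketch`, `n_components=10`, `n_selected=3`, `C=1.0`) are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def make_synthetic_spectra(n_per_class=30, n_channels=200, seed=0):
    """Synthetic two-class 'spectra' (assumption: stand-in for real Py-GC-MS data)."""
    rng = np.random.default_rng(seed)
    base = rng.normal(size=n_channels)
    shift = np.zeros(n_channels)
    shift[50:60] = 1.5  # the classes differ only in a few channels
    X0 = base + rng.normal(size=(n_per_class, n_channels))
    X1 = base + shift + rng.normal(size=(n_per_class, n_channels))
    X = np.vstack([X0, X1])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y


def mahalanobis_per_component(scores, y):
    """Univariate Mahalanobis distance between the two class means, per column.

    For a single variable this reduces to |mean1 - mean0| / pooled std.
    """
    s0, s1 = scores[y == 0], scores[y == 1]
    n0, n1 = len(s0), len(s1)
    pooled_var = ((n0 - 1) * s0.var(axis=0, ddof=1)
                  + (n1 - 1) * s1.var(axis=0, ddof=1)) / (n0 + n1 - 2)
    return np.abs(s1.mean(axis=0) - s0.mean(axis=0)) / np.sqrt(pooled_var)


def run_sketch(n_components=10, n_selected=3, seed=0):
    X, y = make_synthetic_spectra(seed=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)

    # Fit PCA on the training set only, then project both sets.
    pca = PCA(n_components=n_components).fit(X_tr)
    T_tr, T_te = pca.transform(X_tr), pca.transform(X_te)

    # Keep the components that best separate the classes in the training set.
    d = mahalanobis_per_component(T_tr, y_tr)
    keep = np.argsort(d)[::-1][:n_selected]

    # Compare a linear and a non-linear (RBF) kernel on the selected scores.
    accuracies = {}
    for kernel in ("linear", "rbf"):
        clf = SVC(kernel=kernel, C=1.0, gamma="scale").fit(T_tr[:, keep], y_tr)
        accuracies[kernel] = clf.score(T_te[:, keep], y_te)
    return keep, accuracies


if __name__ == "__main__":
    keep, accuracies = run_sketch()
    print("selected components:", keep)
    print("test accuracy by kernel:", accuracies)
```

Fitting the PCA and the selection step on the training samples alone keeps the held-out accuracy from being inflated by information from the test set, which matters most when the dataset is small.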
