Classification of LIBS protein spectra using support vector machines and adaptive local hyperplanes

In recent years, the spectroscopy community has increasingly used computer-assisted techniques for the automatic quantitative and qualitative evaluation of specimens from spectroscopy data. This paper proposes an automated method for classifying four proteins from laser-induced breakdown spectroscopy (LIBS) data: Bovine Serum Albumin, the most abundant protein in blood plasma, and Osteopontin, Leptin, and Insulin-like Growth Factor II, which have been identified as potential biomarkers for ovarian cancer. Automatic classification of these complex proteins can lead to the identification of chemical components that are vital in the detection of certain diseases, such as ovarian cancer. In LIBS, the sample is ablated with ultrashort, high-energy laser pulses that break the molecular bonds of the compound. The high-intensity electromagnetic field induces multiphoton absorption and ionization, which forms a short-lived plasma in the vicinity of the sample. As it cools, this plasma re-emits light, which is collected by spectrometers. The high-dimensional spectroscopy data are first preprocessed with principal component analysis (PCA), a linear dimensionality reduction technique, and then classified using support vector machines (SVM) and adaptive local hyperplane (ALH). The influence of the number of extracted features and of the classification algorithms' parameters on classification accuracy is also investigated. Our experiments on real-life data suggest that both classification methods distinguish among the four types of proteins efficiently, and that their detection performance is fairly robust across a range of numbers of extracted features and algorithm parameter values.
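The PCA-then-SVM part of the pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, the spectra are synthetic stand-ins produced by a hypothetical `make_synthetic_spectra` helper, and the RBF kernel, the values of `C` and `gamma`, and the candidate numbers of principal components are arbitrary choices rather than settings from the paper. ALH is not sketched here.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def make_synthetic_spectra(n_per_class=30, n_channels=500, n_classes=4, seed=0):
    """Hypothetical stand-in for LIBS spectra (not real data).

    Each class gets three Gaussian-shaped emission peaks at class-specific
    channel positions, plus independent Gaussian noise on every channel.
    """
    rng = np.random.default_rng(seed)
    channels = np.arange(n_channels)
    spectra, labels = [], []
    for label in range(n_classes):
        centers = rng.choice(n_channels, size=3, replace=False)
        template = sum(np.exp(-0.5 * ((channels - c) / 4.0) ** 2) for c in centers)
        noise = rng.normal(scale=0.2, size=(n_per_class, n_channels))
        spectra.append(template + noise)
        labels.extend([label] * n_per_class)
    return np.vstack(spectra), np.array(labels)


def evaluate(X, y, n_components, C=1.0, gamma="scale"):
    """Mean 5-fold cross-validated accuracy of PCA followed by an RBF SVM.

    PCA is fitted inside the pipeline, so each fold learns its projection
    from the training portion only.
    """
    model = make_pipeline(
        PCA(n_components=n_components),
        SVC(kernel="rbf", C=C, gamma=gamma),
    )
    return cross_val_score(model, X, y, cv=5).mean()


X, y = make_synthetic_spectra()

# Vary the number of extracted features, mirroring the kind of sensitivity
# study the abstract mentions (the candidate values are assumptions).
scores = {k: evaluate(X, y, k) for k in (2, 5, 10, 20)}
for k, acc in scores.items():
    print(f"{k:2d} principal components: mean accuracy {acc:.3f}")
```

Keeping PCA inside the pipeline means the projection is re-estimated on each training fold, which avoids leaking information from held-out spectra into the extracted features. The same loop could be extended over `C` and `gamma` to probe sensitivity to the classifier parameters.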
