Machine Learning Approach to Raman Spectrum Analysis of MIA PaCa-2 Pancreatic Cancer Tumor Repopulating Cells for Classification and Feature Analysis

A machine learning approach is applied to Raman spectra of cells from the MIA PaCa-2 human pancreatic cancer cell line to distinguish tumor repopulating cells (TRCs) from parental control cells and to aid in identifying molecular signatures. Fifty-one Raman spectra from the two cell types are analyzed to determine which combination of data type, number of dimensions, and classification technique best differentiates them. An accuracy of 0.98 is obtained with support vector machine (SVM) and k-nearest neighbor (kNN) classifiers combined with various dimension reduction and feature selection tools. We also identify candidate biomolecules that may give rise to the spectral peaks underlying the best results.
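As a rough illustration of the kind of workflow described above (feature selection followed by a kNN classifier), here is a minimal sketch in plain Python. It is not the authors' implementation. The spectra are synthetic stand-ins, and the Welch-style feature score, the leave-one-out evaluation, the 26/25 class split, and all names (`make_synthetic_spectra`, `select_features`, `knn_predict`, `loo_accuracy`) are assumptions made for illustration only.

```python
import math
import random


def make_synthetic_spectra(n_per_class=(26, 25), n_bins=60, seed=0):
    """Hypothetical stand-in data: 51 noisy 'spectra' in two classes.

    Class 1 gets extra intensity at a few arbitrary bins; real Raman
    spectra and real class differences are NOT modeled here.
    """
    rng = random.Random(seed)
    peak_bins = (10, 25, 40)  # assumed positions of discriminating peaks
    X, y = [], []
    for label, n in enumerate(n_per_class):
        for _ in range(n):
            spectrum = [rng.gauss(1.0, 0.3) for _ in range(n_bins)]
            if label == 1:
                for b in peak_bins:
                    spectrum[b] += 1.5
            X.append(spectrum)
            y.append(label)
    return X, y


def select_features(X, y, k):
    """Return indices of the k bins with the largest Welch-style t score."""
    scores = []
    for j in range(len(X[0])):
        a = [x[j] for x, lab in zip(X, y) if lab == 0]
        b = [x[j] for x, lab in zip(X, y) if lab == 1]
        ma, mb = sum(a) / len(a), sum(b) / len(b)
        va = sum((v - ma) ** 2 for v in a) / (len(a) - 1)
        vb = sum((v - mb) ** 2 for v in b) / (len(b) - 1)
        scores.append(abs(ma - mb) / math.sqrt(va / len(a) + vb / len(b) + 1e-12))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    neighbors = sorted((math.dist(t, x), lab) for t, lab in zip(train_X, train_y))
    votes = [lab for _, lab in neighbors[:k]]
    return max(set(votes), key=votes.count)


def loo_accuracy(X, y, n_features=3, k=3):
    """Leave-one-out accuracy; features are re-selected inside each fold
    so the held-out spectrum never influences feature selection."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        keep = select_features(train_X, train_y, n_features)
        reduced_train = [[row[j] for j in keep] for row in train_X]
        reduced_test = [X[i][j] for j in keep]
        correct += knn_predict(reduced_train, train_y, reduced_test, k) == y[i]
    return correct / len(X)


X, y = make_synthetic_spectra()
accuracy = loo_accuracy(X, y)
print(f"leave-one-out accuracy on synthetic data: {accuracy:.2f}")
```

Re-selecting features inside each fold matters with so few spectra: choosing them once on the full dataset would leak information from the held-out sample and inflate the reported accuracy. The number printed here reflects only the synthetic data and says nothing about the 0.98 reported above.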
