Dimensional reduction based on peak fitting of Raman micro spectroscopy data improves detection of prostate cancer in tissue specimens

Abstract. Significance: Prostate cancer is the most common cancer among men. An accurate diagnosis of its severity at detection plays a major role in improving patient survival. Recently, machine learning models using biomarkers identified from Raman micro-spectroscopy discriminated intraductal carcinoma of the prostate (IDC-P) from cancer tissue with a ≥85% detection accuracy and differentiated high-grade prostatic intraepithelial neoplasia (HGPIN) from IDC-P with a ≥97.8% accuracy. Aim: To improve the classification performance of machine learning models identifying different types of prostate cancer tissue using a new dimensional reduction technique. Approach: A radial basis function (RBF) kernel support vector machine (SVM) model was trained on Raman spectra of prostate tissue from a 272-patient cohort (Centre hospitalier de l’Université de Montréal, CHUM) and tested on two independent cohorts of 76 patients [University Health Network (UHN)] and 135 patients (Centre hospitalier universitaire de Québec-Université Laval, CHUQc-UL). Two types of engineered features were used: individual intensity features, i.e., Raman signal intensity measured at particular wavelengths, and novel Raman spectra fitted peak features consisting of peak heights and widths. Results: Combining engineered features improved classification performance for the three aforementioned classification tasks. The improvements for IDC-P/cancer classification for the UHN and CHUQc-UL testing sets in accuracy, sensitivity, specificity, and area under the curve (AUC) are (numbers in parentheses are associated with the CHUQc-UL testing set): +4% (+8%), +7% (+9%), +2% (+6%), +9 (+9) with respect to the current best models. Discrimination between HGPIN and IDC-P was also improved in both testing cohorts: +2.2% (+1.7%), +4.5% (+3.6%), +0% (+0%), +2.3 (+0). While no global improvements were obtained for the normal versus cancer classification task [+0% (−2%), +0% (−3%), +2% (−2%), +4 (+3)], the AUC was improved in both testing sets. Conclusions: Combining individual intensity features and novel Raman fitted peak features improved the classification performance on two independent and multicenter testing sets in comparison to using only individual intensity features.
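The four metrics reported above (accuracy, sensitivity, specificity, and AUC) can be illustrated with a minimal sketch. The abstract does not state how the metrics were computed, so this is only one standard way to obtain them from binary labels and classifier scores. The function names, the 0.5 decision threshold, the choice of label 1 as the positive class, and the toy labels and scores are all assumptions made for illustration and are not data from the study.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], scores: Sequence[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity at a fixed score threshold.

    Label 1 is treated as the positive class (an assumption for this sketch).
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: hypothetical labels and scores, not results from the paper.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

if __name__ == "__main__":
    print(binary_metrics(toy_labels, toy_scores))
    print("AUC:", roc_auc(toy_labels, toy_scores))
```

Unlike the other three metrics, the AUC does not depend on the decision threshold, which is why it can improve on a testing set even when accuracy, sensitivity, and specificity at a fixed threshold do not.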
