Kernel principal component analysis residual diagnosis (KPCARD): An automated method for cosmic ray artifact removal in Raman spectra.

A new, fully automated, rapid method, kernel principal component analysis residual diagnosis (KPCARD), is proposed for removing cosmic ray artifacts (CRAs) from Raman spectra, in particular from large Raman imaging datasets. KPCARD identifies CRAs through a statistical analysis of the residuals obtained at each wavenumber in the spectra. The method exploits the stochastic nature of CRAs: because they occur at random positions in individual spectra, the most significant components in a principal component analysis (PCA) of a large number of Raman spectra should not contain any CRAs. The process first applies kernel PCA (kPCA) to all the Raman mapping data and then estimates the inter- and intra-spectrum noise to generate two threshold values. CRAs are identified by using these threshold values to evaluate the residuals of each spectrum. CRAs are corrected by spectral replacement: the nearest neighbor (NN) spectrum, that is, the spectrum most spectroscopically similar to the CRA-contaminated spectrum, and the principal components (PCs) obtained by kPCA are both used to generate a robust best-fit curve to the CRA-contaminated spectrum. This best-fit spectrum then replaces the CRA-contaminated spectrum in the dataset. The efficacy of KPCARD was demonstrated using simulated data and real Raman spectra collected from solid-state materials. The results showed that KPCARD is fast (<1 min per 8400 spectra), accurate, precise, and suitable for the automated correction of very large (>1 million) Raman datasets.
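
The residual-diagnosis idea can be illustrated with a minimal Python sketch. It is not the published KPCARD implementation, and every name and parameter in it is an assumption made for illustration: `pca_loadings_via_gram` obtains linear PCA loadings from the spectra-by-spectra Gram matrix (one reading of "kernel PCA"), the noise levels are estimated with the median absolute deviation, the threshold multiplier `k=8` and the number of components are arbitrary, and the correction step refits the PCs by least squares on unflagged channels while omitting the nearest-neighbor spectrum described above.

```python
import numpy as np

def pca_loadings_via_gram(X, n_components):
    """Mean-center X (spectra x channels); get PCA loadings from the Gram matrix."""
    mean = X.mean(axis=0)
    Xc = X - mean
    eigvals, eigvecs = np.linalg.eigh(Xc @ Xc.T)   # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]
    # Map sample-space eigenvectors to orthonormal channel-space loadings.
    loadings = Xc.T @ eigvecs[:, top] / np.sqrt(eigvals[top])
    return mean, loadings

def robust_sigma(r, axis):
    """Noise scale from the median absolute deviation (assumed estimator)."""
    med = np.median(r, axis=axis, keepdims=True)
    return 1.4826 * np.median(np.abs(r - med), axis=axis, keepdims=True)

def flag_cras(X, n_components=2, k=8.0):
    """Flag points whose positive residual exceeds both noise thresholds."""
    mean, P = pca_loadings_via_gram(X, n_components)
    Xc = X - mean
    resid = Xc - (Xc @ P) @ P.T
    sigma_inter = robust_sigma(resid, axis=0)   # per wavenumber, across spectra
    sigma_intra = robust_sigma(resid, axis=1)   # per spectrum, across wavenumbers
    mask = (resid > k * sigma_inter) & (resid > k * sigma_intra)
    return mask, mean, P

def correct_cras(X, mask, mean, P):
    """Replace each flagged spectrum by a PC fit computed on its unflagged channels."""
    X_fixed = X.copy()
    for i in np.flatnonzero(mask.any(axis=1)):
        ok = ~mask[i]
        scores, *_ = np.linalg.lstsq(P[ok], X[i, ok] - mean[ok], rcond=None)
        X_fixed[i] = mean + P @ scores
    return X_fixed

# Simulated map: 200 spectra, 500 channels, two Gaussian bands, unit-variance noise.
rng = np.random.default_rng(0)
channels = np.arange(500)
bands = np.stack([100 * np.exp(-0.5 * ((channels - c) / 10) ** 2) for c in (150, 350)])
truth = rng.uniform(0.2, 1.0, size=(200, 2)) @ bands
X = truth + rng.normal(0.0, 1.0, size=truth.shape)
spikes = [(5, 100), (42, 260), (120, 400)]      # (spectrum index, channel index)
for i, j in spikes:
    X[i, j] += 50.0                             # single-channel synthetic CRA

mask, mean, P = flag_cras(X)
X_fixed = correct_cras(X, mask, mean, P)
found = sorted((int(i), int(j)) for i, j in zip(*np.nonzero(mask)))
print(found)
```

Because each synthetic spike appears in only one spectrum, it contributes little to the leading components and therefore remains in the residuals, where the two thresholds can isolate it. Real CRAs often span several adjacent channels, so a practical version would likely need to widen each flagged region before refitting.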
