Removing Cosmic Ray Features from Raman Map Data by a Refined Nearest Neighbor Comparison Method as a Precursor for Chemometric Analysis

An algorithm to remove cosmic ray (CR) features from Raman spectra collected in mapping experiments using a charge-coupled device (CCD) is presented. Each spectrum is compared to spectra collected from adjacent points in space using correlation values. The most similar neighbor (MSN) spectrum is selected, offset, and used for identification of CRs. The offset values are defined in terms of the noise level for data with a low signal-to-noise ratio and in terms of the peak height for data with a high signal-to-noise ratio. Scaled intensity values of the MSN spectra are used for replacement of contaminated pixels, allowing for full recovery of underlying spectral features. The algorithm is applicable to any Raman map where the particle sizes within the analyzed mixture are larger than the sampling size, and to any other data where the sampling is more frequent than the variation, e.g., time series or temperature profiles. Its application to several maps of pharmaceutical samples is discussed here. With an appropriate offset value for the MSN spectra, no misdetections occur, and all CRs more intense than the offset are removed, including those that would have hampered subsequent chemometric analysis by methods such as principal component analysis (PCA).
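
The steps summarized above (neighbor comparison by correlation, selection of the MSN, offsetting, and replacement with scaled MSN intensities) can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `remove_cosmic_rays`, the use of the four edge-sharing neighbors, the median-ratio scaling, and the single user-supplied scalar `offset` are all assumptions made for illustration; the abstract does not give the exact rules for defining the offset from the noise level or peak height, so that choice is left to the caller.

```python
import numpy as np


def remove_cosmic_rays(cube, offset):
    """Illustrative nearest-neighbor CR filter (assumed details, see lead-in).

    cube   : array of shape (rows, cols, n_channels), one spectrum per map point
    offset : scalar added to the scaled MSN spectrum to form the CR threshold
    Returns the cleaned cube and a boolean mask of replaced pixels.
    """
    rows, cols, _ = cube.shape
    cleaned = cube.astype(float)                # astype returns a copy
    flagged = np.zeros(cube.shape, dtype=bool)
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # assumption: 4-neighborhood

    for r in range(rows):
        for c in range(cols):
            spectrum = cube[r, c].astype(float)

            # Select the most similar neighbor (MSN) by correlation coefficient.
            best_corr, msn = -np.inf, None
            for dr, dc in steps:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    neighbor = cube[rr, cc].astype(float)
                    corr = np.corrcoef(spectrum, neighbor)[0, 1]
                    if np.isfinite(corr) and corr > best_corr:
                        best_corr, msn = corr, neighbor
            if msn is None:
                continue                        # no usable neighbor

            # Assumption: scale the MSN by the ratio of medians, which is
            # barely affected by a few spiked pixels.
            msn_median = np.median(msn)
            scale = np.median(spectrum) / msn_median if msn_median != 0 else 1.0
            reference = scale * msn

            # Pixels rising above the offset MSN are treated as CRs and
            # replaced by the scaled MSN intensity.
            mask = spectrum > reference + offset
            cleaned[r, c, mask] = reference[mask]
            flagged[r, c] = mask

    return cleaned, flagged


# Synthetic 3 x 3 map: one band on a flat baseline, intensity varying by point.
x = np.arange(50)
base = 10.0 + 100.0 * np.exp(-0.5 * ((x - 25) / 3.0) ** 2)
truth = np.array([[base * (1 + 0.01 * (3 * r + c)) for c in range(3)]
                  for r in range(3)])
cube = truth.copy()
cube[1, 1, 5] += 500.0                          # artificial cosmic ray spike

cleaned, flagged = remove_cosmic_rays(cube, offset=5.0)
```

In this synthetic case the only contaminated pixel is the one at map point (1, 1), channel 5. On real data the offset would have to be chosen relative to the noise level or peak height, as the abstract describes, and a CR occurring in the same channel of both a spectrum and its MSN would go undetected by this simplified version.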