Determination of an acceptable level of spectral data compression by discrete wavelet transforms.

The ever-increasing volume of data generated by analytical instruments calls for good compression methods that keep computation time acceptable. The lower the volume and noise content of the data, the easier it is to investigate and interpret modeling results. The Discrete Wavelet Transform (DWT) is an effective tool for data compression and noise suppression. Compression can be performed at different levels; at each level, the signal part of the data is reduced to half its previous size. This work presents an approach for determining an acceptable level of compression, where the aim is minimal loss of information and no significant change in the structure of the data, meaning, for example, no loss in predictive ability and no change in the effective rank of the data set. The method estimates the Singular Values (SVs) of the original data matrix and of the data at each level of compression, and then applies the Median Absolute Deviation (MAD) to the correlation between the original SVs and the compressed-data SVs as a simple statistical test for determining the optimum level of compression. We illustrate the method using FT-Raman data from aqueous solutions of three sugars (glucose, trehalose and sucrose) and NMR data from mixtures of three alcohols. A sudden change in the prediction error sum of squares plots from Partial Least Squares (PLS) modeling confirms the results from the MAD statistics.
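The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the wavelet, the number of SVs compared, or the exact form of the MAD test, so the Haar wavelet, the `n_sv` and `threshold` parameters, the 1.4826 normal-consistency factor, and all function names below are hypothetical choices. Because the correlation coefficient is scale-invariant, the sqrt(2) gain of the Haar approximation at each level does not affect the comparison.

```python
import numpy as np


def haar_approx(X):
    """One Haar DWT level along each row, keeping only the approximation part.

    Assumption: Haar wavelet; an odd-length row is padded by repeating its
    last point. The output has half as many columns as the (padded) input.
    """
    X = np.asarray(X, dtype=float)
    if X.shape[1] % 2:
        X = np.hstack([X, X[:, -1:]])
    return (X[:, 0::2] + X[:, 1::2]) / np.sqrt(2.0)


def sv_correlations(X, max_level, n_sv=5):
    """Correlate the leading SVs of X with those of each compressed version.

    Returns an array whose entry i belongs to compression level i + 1.
    """
    A = np.asarray(X, dtype=float)
    s0 = np.linalg.svd(A, compute_uv=False)[:n_sv]
    if s0.size < n_sv:
        raise ValueError("data matrix has fewer than n_sv singular values")
    corr = np.empty(max_level)
    for i in range(max_level):
        A = haar_approx(A)
        s = np.linalg.svd(A, compute_uv=False)[:n_sv]
        if s.size < n_sv:
            raise ValueError("too few points left at level %d" % (i + 1))
        corr[i] = np.corrcoef(s0, s)[0, 1]
    return corr


def mad_flags(values, threshold=3.0):
    """Flag entries lying more than `threshold` scaled MADs from the median."""
    values = np.asarray(values, dtype=float)
    med = np.median(values)
    scale = 1.4826 * np.median(np.abs(values - med))
    if scale == 0.0:
        # All deviations are (nearly) identical: nothing stands out.
        return np.zeros(values.shape, dtype=bool)
    return np.abs(values - med) / scale > threshold


def acceptable_level(corr, threshold=3.0):
    """Highest level before the first flagged correlation (hypothetical rule)."""
    flagged = np.flatnonzero(mad_flags(corr, threshold))
    return len(corr) if flagged.size == 0 else int(flagged[0])


if __name__ == "__main__":
    # Synthetic stand-in for spectra: mixtures of three Gaussian bands + noise.
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 1024)
    bands = np.stack([np.exp(-((x - c) / 0.02) ** 2) for c in (0.3, 0.5, 0.7)])
    spectra = rng.uniform(0.0, 1.0, size=(20, 3)) @ bands
    spectra += 0.01 * rng.normal(size=spectra.shape)

    corr = sv_correlations(spectra, max_level=7, n_sv=5)
    print("SV correlation per level:", np.round(corr, 4))
    print("acceptable level:", acceptable_level(corr))
```

Comparing a fixed number of leading SVs keeps the correlations comparable across levels, since heavily compressed data has fewer SVs available. A wavelet library could replace `haar_approx` if a smoother wavelet is preferred.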
