Wavelet Bases for IR Library Compression, Searching and Reconstruction

This chapter discusses wavelet bases for IR library compression, searching, and reconstruction. Most commercial databases to date use the Fast Fourier Transform (FFT) for spectra compression. IR spectra, however, show many absorption bands of local character, which makes wavelets well suited to their decomposition. The fast wavelet transform is also computationally cheaper than the FFT: its cost grows linearly with signal length, O(N), compared with O(N log N) for the FFT. Successful library construction must address three factors: an efficient compression algorithm and compression ratio, a fast search method, and good quality of the reconstructed spectra. Meeting all of these demands at once requires a compromise, and there are different possible approaches to the problem. Wavelets are well localized in both the time and the frequency (scale) domains, so they are basis functions ideally suited to describing nonstationary instrumental signals such as IR or NMR spectra. In addition, signals such as IR spectra have a sparse representation in the wavelet domain: many of the wavelet coefficients have small amplitude (absolute value) and can be discarded without losing the essential information carried by the signal. Eliminating these small coefficients is equivalent to compressing the spectrum.
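The idea of compressing a spectrum by discarding small wavelet coefficients can be illustrated with a minimal sketch. It is not the chapter's own algorithm: it assumes the simplest orthonormal wavelet (Haar), a synthetic "spectrum" made of two Gaussian bands, and an arbitrary budget of 32 retained coefficients out of 256. The names `haar_forward`, `haar_inverse`, and `compress` are hypothetical helpers introduced only for this illustration.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(signal):
    """Full orthonormal Haar decomposition (length must be a power of two)."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    details = []  # detail bands, finest scale first
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    # Layout: [coarsest approximation, coarsest detail, ..., finest detail]
    coeffs = approx
    for band in reversed(details):
        coeffs = coeffs + band
    return coeffs

def haar_inverse(coeffs):
    """Invert haar_forward, rebuilding the signal scale by scale."""
    approx = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        band = coeffs[pos:pos + len(approx)]
        pos += len(approx)
        finer = []
        for a, d in zip(approx, band):
            finer.extend([(a + d) / SQRT2, (a - d) / SQRT2])
        approx = finer
    return approx

def compress(coeffs, keep):
    """Keep the `keep` largest-magnitude coefficients and zero the rest."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]

def relative_error(original, restored):
    """Relative L2 reconstruction error."""
    num = sum((x - y) ** 2 for x, y in zip(original, restored))
    den = sum(x * x for x in original)
    return math.sqrt(num / den)

# Synthetic stand-in for an IR spectrum: two local absorption bands (assumed data).
n = 256
spectrum = [
    math.exp(-((i - 70) / 4.0) ** 2) + 0.6 * math.exp(-((i - 180) / 6.0) ** 2)
    for i in range(n)
]

coeffs = haar_forward(spectrum)
compressed = compress(coeffs, keep=32)   # store 32 of 256 coefficients
restored = haar_inverse(compressed)

print("kept coefficients:", sum(1 for c in compressed if c != 0.0))
print("relative L2 error: %.4f" % relative_error(spectrum, restored))
```

Because the transform is orthonormal, the squared reconstruction error equals the energy of the discarded coefficients, which is why keeping the largest-magnitude coefficients is the natural choice here. A smoother wavelet than Haar would typically give a sparser representation of band-shaped signals; Haar is used only to keep the sketch short.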
