Multiscale Denoising of Biological Data: A Comparative Analysis

Measured microarray genomic and metabolic data are a rich source of information about the biological systems they represent. For example, time-series biological data can be used to construct dynamic models of genetic regulatory networks, which in turn support the design of intervention strategies to cure or manage major diseases. Similarly, copy number data can be used to determine the locations and extent of aberrations in chromosome sequences. Unfortunately, measured biological data are usually contaminated with errors that mask their important features, so these noisy measurements must be filtered before they are useful in practice. Wavelet-based multiscale filtering has been shown to be a powerful denoising tool. In this work, several batch and online multiscale filtering techniques are used to denoise biological data contaminated with white or colored noise. Their performance is demonstrated and compared with that of some conventional low-pass filters in two case studies: the first uses simulated dynamic metabolic data, and the second uses real copy number data. Simulation results show that multiscale filtering can achieve significant improvement over conventional filtering techniques.
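
To make the idea of batch wavelet-based multiscale filtering concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes an orthonormal Haar wavelet, soft thresholding with the universal threshold, a noise level estimated from the finest-scale coefficients, and a synthetic step-like signal loosely resembling a copy number profile; the function names, the signal, the noise level, and the moving-average baseline are all illustrative choices.

```python
import numpy as np


def haar_forward(x, levels):
    """Orthonormal Haar transform: returns (approximation, details fine -> coarse)."""
    approx = np.asarray(x, dtype=float)
    if approx.size % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # detail (high-pass) part
        approx = (even + odd) / np.sqrt(2.0)         # approximation (low-pass) part
    return approx, details


def haar_inverse(approx, details):
    """Invert haar_forward, rebuilding the signal from coarse to fine scale."""
    for d in reversed(details):
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + d) / np.sqrt(2.0)
        out[1::2] = (approx - d) / np.sqrt(2.0)
        approx = out
    return approx


def multiscale_denoise(x, levels=4):
    """Soft-threshold the detail coefficients at every scale, then reconstruct."""
    approx, details = haar_forward(x, levels)
    # Assumption: noise level estimated from the median absolute
    # finest-scale coefficient, scaled for Gaussian noise.
    sigma = np.median(np.abs(details[0])) / 0.6745
    threshold = sigma * np.sqrt(2.0 * np.log(len(x)))  # universal threshold
    shrunk = [np.sign(d) * np.maximum(np.abs(d) - threshold, 0.0) for d in details]
    return haar_inverse(approx, shrunk)


def moving_average(x, window=5):
    """Conventional low-pass baseline: centered moving-average filter."""
    return np.convolve(x, np.ones(window) / window, mode="same")


# Hypothetical test signal: a step-like profile with additive white noise.
rng = np.random.default_rng(0)
clean = np.concatenate([np.zeros(100), np.ones(70), np.zeros(86)])
noisy = clean + 0.2 * rng.standard_normal(clean.size)

denoised = multiscale_denoise(noisy, levels=4)
smoothed = moving_average(noisy, window=5)

for name, estimate in [("noisy", noisy), ("wavelet", denoised), ("moving avg", smoothed)]:
    print(f"{name:>10s} MSE: {np.mean((estimate - clean) ** 2):.4f}")
```

The printed mean squared errors give a rough feel for how the two filters behave on one synthetic signal. How they compare depends on the wavelet, the decomposition depth, the threshold rule, and the window length, so this sketch should not be read as reproducing the comparison reported here.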
