Region-based statistical analysis of 2D PAGE images

A new comprehensive procedure for the statistical analysis of two-dimensional polyacrylamide gel electrophoresis (2D PAGE) images is proposed, covering protein region quantification, normalization, and statistical testing. Protein regions are defined by a master watershed map obtained from the mean gel. Because the approach works with these protein regions rather than with individual spots, it does not require spot matching and so bypasses the current bottleneck in the analysis of 2D PAGE images. Background correction is performed within each protein region by local segmentation. Two-dimensional locally weighted smoothing (LOESS) is proposed to remove systematic bias that remains after quantification of the protein regions. Proteins are separated into mutually independent sets based on detected correlations, and a multivariate analysis is applied to each set to detect the group effect. A multiple hypothesis testing strategy that combines this multivariate approach with the usual Benjamini-Hochberg false discovery rate (FDR) procedure is formulated and applied to the differential analysis of 2D PAGE images. Each step of the analytical protocol is illustrated on a real dataset. The effectiveness of the proposed methodology is demonstrated on simulated gels, in comparison with the commercial software packages PDQuest and Dymension. We also introduce a new procedure for simulating gel images.
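As a rough illustration of the final testing step, the standard Benjamini-Hochberg step-up rule can be sketched as follows. This is only the textbook procedure applied to a plain list of p-values; it assumes one p-value per independent protein set has already been produced by the multivariate analysis, and it does not reproduce the combined strategy formulated in the paper. The function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Textbook Benjamini-Hochberg step-up rule (illustrative sketch only):
    sort the m p-values, find the largest rank k such that
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank (1-based) whose p-value falls under its threshold.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank

    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, one per independent protein set.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042,
             0.060, 0.074, 0.205, 0.212, 0.216]
decisions = benjamini_hochberg(example_p, alpha=0.05)
```

Because the rule is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value meets a later threshold, which is why the loop keeps scanning through all ranks instead of stopping at the first failure.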
