Feature Detection Techniques for Preprocessing Proteomic Data

Numerous gel-based and nongel-based technologies are used to detect protein changes potentially associated with disease. The raw data, however, are rife with technical and structural complexities, which makes statistical analysis difficult. Low-level analysis issues (including normalization, background correction, gel and/or spectral alignment, feature detection, and image registration) are substantial problems that must be addressed, because any high-level data analysis is contingent on appropriate and statistically sound low-level procedures. Feature detection approaches are particularly attractive because they speed up subsequent calculations: the summary data corresponding to image features substantially reduce overall data size and simplify its structure while retaining key information. In this paper, we focus on recent advances in feature detection as a tool for preprocessing proteomic data. This work highlights existing and newly developed feature detection algorithms for proteomic datasets, particularly those relating to time-of-flight mass spectrometry and two-dimensional gel electrophoresis. Note, however, that the data structures these methods take as input (i.e., spectra and images containing spots) are produced by all of the gel-based and nongel-based technologies discussed in this manuscript, so the methods apply to those technologies as well.
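To make the data-reduction argument concrete, the following is a minimal sketch of feature detection on a single simulated spectrum. It is not one of the algorithms reviewed in the paper. The spectrum, the moving-average smoother, the signal-to-noise threshold, and all function and parameter names (`simulate_spectrum`, `detect_peaks`, `window`, `snr`, `min_distance`) are illustrative assumptions. The point is only that several thousand intensity values can be summarized by a handful of peak locations and heights.

```python
import numpy as np


def simulate_spectrum(n_points=5000, peak_centers=(800, 2100, 3700), seed=0):
    """Hypothetical spectrum: Gaussian peaks plus unit-variance white noise."""
    rng = np.random.default_rng(seed)
    x = np.arange(n_points)
    signal = np.zeros(n_points)
    for c in peak_centers:
        signal += 50.0 * np.exp(-0.5 * ((x - c) / 15.0) ** 2)
    return signal + rng.normal(0.0, 1.0, n_points)


def detect_peaks(y, window=21, snr=10.0, min_distance=50):
    """Return sorted indices of local maxima that stand out from the noise.

    Assumed recipe (illustrative only):
      1. smooth with a moving average,
      2. estimate noise from the residual via the median absolute deviation,
      3. keep local maxima above snr * noise,
      4. merge maxima closer than min_distance, keeping the tallest.
    """
    kernel = np.ones(window) / window
    smooth = np.convolve(y, kernel, mode="same")

    resid = y - smooth
    noise = 1.4826 * np.median(np.abs(resid - np.median(resid)))

    # Interior points that are higher than the left neighbour and not lower
    # than the right neighbour.
    is_max = (smooth[1:-1] > smooth[:-2]) & (smooth[1:-1] >= smooth[2:])
    idx = np.flatnonzero(is_max) + 1
    candidates = idx[smooth[idx] > snr * noise]

    # Greedy merge: visit candidates from tallest to shortest.
    kept = []
    for i in candidates[np.argsort(smooth[candidates])[::-1]]:
        if all(abs(int(i) - k) >= min_distance for k in kept):
            kept.append(int(i))
    return np.array(sorted(kept), dtype=int)


spectrum = simulate_spectrum()
peaks = detect_peaks(spectrum)
# The feature table holds one (location, height) row per detected peak.
features = np.column_stack([peaks, spectrum[peaks]])
print(f"{spectrum.size} raw intensities reduced to {len(features)} features")
```

Downstream statistics would then operate on the small `features` table rather than on the full spectrum. Real spectra would also need baseline correction, normalization, and alignment before this step, which the sketch omits.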
