The pitfalls of proteomics experiments without the correct use of bioinformatics tools

The elucidation of the entire genomic sequence of various organisms, from viruses to complex metazoans and most recently man, is undoubtedly the greatest triumph of molecular biology since the discovery of the DNA double helix. Over the past two decades, the focus of molecular biology has gradually moved from genomes to proteomes, with the aim of discovering the functions of the genes themselves. The postgenomic era has stimulated the development of new techniques (e.g. 2‐DE and MS) and bioinformatics tools to identify the functions, reactions, interactions and location of gene products in the tissues and/or cells of living organisms. Both 2‐DE and MS have been very successfully employed to identify proteins involved in biological phenomena (e.g. immunity, cancer and host–parasite interactions). Recently, however, several papers have emphasised the pitfalls of 2‐DE experiments, especially with regard to experimental design, poor statistical treatment and the high rate of ‘false positive’ results in protein identification. In the light of these perceived problems, we review the advantages and misuses of bioinformatics tools, from the production of 2‐DE gels to the identification of candidate protein spots, and suggest some useful avenues to improve the quality of 2‐DE experiments. In addition, we present key steps which, in our view, need to be taken into consideration during such analyses. Lastly, we present novel biological entities named ‘interactomes’ and the bioinformatics tools developed to analyse the large protein–protein interaction networks they form, along with several new perspectives on the field.

[1]  Jonathan C. Roberts,et al.  The Craft of Information Visualization , 2008 .

[2]  Gabriele Ausiello,et al.  MINT: the Molecular INTeraction database , 2006, Nucleic Acids Res..

[3]  R. Appel,et al.  Guidelines for the next 10 years of proteomics , 2009, Proteomics.

[4]  A. S. Juncker,et al.  A wiring of the human nucleolus. , 2006, Molecular cell.

[5]  G. Mazzucchelli,et al.  Proteomics in Myzus persicae: effect of aphid host plant switch. , 2006, Insect biochemistry and molecular biology.

[6]  Susumu Goto,et al.  Effects of post-electrophoretic analysis on variance in gel-based proteomics , 2006, Expert review of proteomics.

[7]  Alain Guénoche,et al.  PRODISTIN Web Site: a tool for the functional classification of proteins from interaction networks , 2006, Bioinform..

[8]  B. Berger,et al.  Herpesviral Protein Networks and Their Interaction with the Human Proteome , 2006, Science.

[9]  iVici: Interrelational Visualization and Correlation Interface , 2005, Genome Biology.

[10]  M. Vignali,et al.  A protein interaction network of the malaria parasite Plasmodium falciparum , 2005, Nature.

[11]  Daniel A. Schaeffer,et al.  Error‐tolerant EST database searches by tandem mass spectrometry and multiTag software , 2005, Proteomics.

[12]  S. L. Wong,et al.  Towards a proteome-scale map of the human protein–protein interaction network , 2005, Nature.

[13]  David O. Nelson,et al.  Statistical challenges in the analysis of two-dimensional difference gel electrophoresis experiments using DeCyderTM , 2005, Bioinform..

[14]  H. Lehrach,et al.  A Human Protein-Protein Interaction Network: A Resource for Annotating the Proteome , 2005, Cell.

[15]  Marc Vidal,et al.  Predictive models of molecular machines involved in Caenorhabditis elegans early embryogenesis , 2005, Nature.

[16]  Kathryn S Lilley,et al.  Maximising sensitivity for detecting changes in protein expression: Experimental design using minimal CyDyes , 2005, Proteomics.

[17]  C. Glover,et al.  Gene expression profiling for hematopoietic cell culture , 2006 .

[18]  Brigitte Picard,et al.  Data analysis methods for detection of differential protein expression in two-dimensional gel electrophoresis. , 2005, Analytical biochemistry.

[19]  M. Wilkins,et al.  Optimal replication and the importance of experimental design for gel-based quantitative proteomics. , 2005, Journal of proteome research.

[20]  大房 健 基礎講座 電気泳動(Electrophoresis) , 2005 .

[21]  C. G. Black,et al.  Invited reviewParasite genomes , 2005 .

[22]  J. Barrett,et al.  Analysing proteomic data. , 2005, International journal for parasitology.

[23]  D. Biron,et al.  Towards a new conceptual approach to "parasitoproteomics". , 2005, Trends in parasitology.

[24]  D. Biron,et al.  The proteomics: a new prospect for studying parasitic manipulation , 2005, Behavioural Processes.

[25]  R. Chanet,et al.  Protein interaction mapping: a Drosophila case study. , 2005, Genome research.

[26]  Francesca Antonucci,et al.  Numerical approaches for quantitative analysis of two‐dimensional maps: A review of commercial software and home‐made systems , 2005, Proteomics.

[27]  Ian M. Donaldson,et al.  The Biomolecular Interaction Network Database and related tools 2005 update , 2004, Nucleic Acids Res..

[28]  Leon Goldovsky,et al.  BioLayout(Java): versatile network visualisation of structural and functional relationships. , 2005, Applied bioinformatics.

[29]  N. Karp,et al.  Application of partial least squares discriminant analysis to two‐dimensional difference gel studies in expression proteomics , 2005, Proteomics.

[30]  G. Spicer Molecular evolution among someDrosophila species groups as indicated by two-dimensional electrophoresis , 2005, Journal of Molecular Evolution.

[31]  BMC Bioinformatics , 2005 .

[32]  Frank Dudbridge,et al.  The Use of Edge-Betweenness Clustering to Investigate Biological Function in Protein Interaction Networks , 2005, BMC Bioinformatics.

[33]  M. Rudemo,et al.  Statistical exploration of variation in quantitative two‐dimensional gel electrophoresis data , 2004, Proteomics.

[34]  U. Theopold,et al.  Proteomics of the Drosophila immune response. , 2004, Trends in biotechnology.

[35]  Claude Pasquier,et al.  THEA: ontology-driven analysis of microarray data , 2004, Bioinform..

[36]  Barbara A. Wetmore,et al.  Toxicoproteomics: proteomics applied to toxicology and pathology. , 2004, Toxicologic pathology.

[37]  Alain Guénoche,et al.  Clustering proteins from interaction networks for the prediction of cellular functions , 2004, BMC Bioinformatics.

[38]  Martin Ostrowski,et al.  Cross‐species identification of proteins from proteome profiles of the marine oligotrophic ultramicrobacterium, Sphingopyxis alaskensis , 2004, Proteomics.

[39]  François Chevalier,et al.  Proteomic capacity of recent fluorescent dyes for protein staining. , 2004, Phytochemistry.

[40]  Ruedi Aebersold,et al.  The Need for Guidelines in Publication of Peptide and Protein Identification Data , 2004, Molecular & Cellular Proteomics.

[41]  David P. Kreil,et al.  Determining a significant change in protein expression with DeCyder™ during a pair‐wise comparison using two‐dimensional difference gel electrophoresis , 2004, Proteomics.

[42]  Thomas H Hutchinson,et al.  Ecotoxicogenomics: the challenge of integrating genomics into aquatic and terrestrial ecotoxicology. , 2004, Aquatic toxicology.

[43]  S. Przyborski,et al.  Proteomic identification of biomarkers expressed by human pluripotent stem cells. , 2004, Biochemical and biophysical research communications.

[44]  Kathryn S. Lilley,et al.  DNA microarray normalization methods can remove bias from differential protein expression analysis of 2D difference gel electrophoresis results , 2004, Bioinform..

[45]  A. Shevchenko,et al.  The Power and the Limitations of Cross-Species Protein Identification by Mass Spectrometry-driven Sequence Similarity Searches*S , 2004, Molecular & Cellular Proteomics.

[46]  R. Aebersold,et al.  Analysis, statistical validation and dissemination of large-scale proteomics datasets generated by tandem MS. , 2004, Drug discovery today.

[47]  L. Arckens,et al.  Fluorescent two-dimensional difference gel electrophoresis unveils the potential of gel-based proteomics. , 2004, Current opinion in biotechnology.

[48]  S. L. Wong,et al.  A Map of the Interactome Network of the Metazoan C. elegans , 2004, Science.

[49]  D. Bronson,et al.  Analyses of mouse and Drosophila proteins by two-dimensional gel electrophoresis , 1979, Molecular and General Genetics MGG.

[50]  Alain Guénoche,et al.  The Use of Protein—protein Interaction Networks for Genome Wide Protein Function Comparisons and Predictions , 2004 .

[51]  Martin Vingron,et al.  IntAct: an open source molecular interaction database , 2004, Nucleic Acids Res..

[52]  David Martin,et al.  Functional classification of proteins for the prediction of cellular function from a protein-protein interaction network , 2003, Genome Biology.

[53]  James R. Knight,et al.  A Protein Interaction Map of Drosophila melanogaster , 2003, Science.

[54]  Peter Karuso,et al.  A fluorescent natural product for ultra sensitive detection of proteins in one‐dimensional and two‐dimensional gel electrophoresis , 2003, Proteomics.

[55]  P. Shannon,et al.  Cytoscape: a software environment for integrated models of biomolecular interaction networks. , 2003, Genome research.

[56]  M. Samanta,et al.  Predicting protein functions from redundancies in large-scale protein interaction networks , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[57]  Tero Aittokallio,et al.  Comparison of PDQuest and Progenesis software packages in the analysis of two‐dimensional electrophoresis gels , 2003, Proteomics.

[58]  Hanno Steen,et al.  Development of human protein reference database as an initial platform for approaching systems biology in humans. , 2003, Genome research.

[59]  Mark P Molloy,et al.  Overcoming technical variation and biological variation in quantitative proteomics , 2003, Proteomics.

[60]  L. Mirny,et al.  Protein complexes and functional modules in molecular networks , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[61]  John T. Stults,et al.  Protein identification: The origins of peptide mass fingerprinting , 2003, Journal of the American Society for Mass Spectrometry.

[62]  Emilio Marengo,et al.  New approach based on fuzzy logic and principal component analysis for the classification of two-dimensional maps in health and disease. Application to lymphomas. , 2003, Journal of chromatography. A.

[63]  Jim Graham,et al.  Statistical models of shape for the analysis of protein spots in two‐dimensional electrophoresis gel images , 2003, Proteomics.

[64]  Jim Graham,et al.  Using statistical image models for objective evaluation of spot detection in two‐dimensional gels , 2003, Proteomics.

[65]  Xianquan Zhan,et al.  Differences in the spatial and quantitative reproducibility between two second‐dimensional gel electrophoresis systems , 2003, Electrophoresis.

[66]  Sarka Beranova-Giorgianni,et al.  Proteome analysis by two-dimensional gel electrophoresis and mass spectrometry: strengths and limitations , 2003 .

[67]  J. Logsdon,et al.  Much ado about bacteria-to-vertebrate lateral gene transfer. , 2003, Trends in genetics : TIG.

[68]  Mark S. Boguski,et al.  Biomedical informatics for proteomics , 2003, Nature.

[69]  T. Earnest,et al.  From words to literature in structural proteomics , 2003, Nature.

[70]  M. Tyers,et al.  From genomics to proteomics , 2003, Nature.

[71]  R. Aebersold,et al.  Mass spectrometry-based proteomics , 2003, Nature.

[72]  Alexander Rives,et al.  Modular organization of cellular networks , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[73]  A. Shevchenko,et al.  Expanding the organismal scope of proteomics: Cross‐species protein identification by mass spectrometry and its implications , 2003, Proteomics.

[74]  S. Hanash Disease proteomics : Proteomics , 2003 .

[75]  Richard J. Simpson,et al.  Proteins and proteomics : a laboratory manual , 2003 .

[76]  Valentina Gianotti,et al.  A new integrated statistical approach to the diagnostic use of two‐dimensional maps , 2003, Electrophoresis.

[77]  M. Tyers,et al.  Osprey: a network visualization system , 2003, Genome Biology.

[78]  Gary D. Bader,et al.  An automated method for finding molecular complexes in large protein interaction networks , 2003, BMC Bioinformatics.

[79]  Simon J Hubbard,et al.  Comparative bioinformatic analysis of complete proteomes and protein parameters for cross‐species identification in proteomics , 2002, Proteomics.

[80]  Alexey I Nesvizhskii,et al.  Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. , 2002, Analytical chemistry.

[81]  Ulrike Mathesius,et al.  Evaluation of proteome reference maps for cross‐species identification of proteins by peptide mass fingerprinting , 2002, Proteomics.

[82]  J. Nishihara,et al.  Quantitative evaluation of proteins in one‐ and two‐dimensional polyacrylamide gels using a fluorescent stain , 2002, Electrophoresis.

[83]  Babu Raman,et al.  Quantitative comparison and evaluation of two commercially available, two‐dimensional electrophoresis image analysis software packages, Z3 and Melanie , 2002, Electrophoresis.

[84]  Martin Vingron,et al.  Variance stabilization applied to microarray data calibration and to the quantification of differential expression , 2002, ISMB.

[85]  Ben Shneiderman,et al.  Interactively Exploring Hierarchical Clustering Results , 2002, Computer.

[86]  Wayne F. Patton,et al.  An improved formulation of SYPRO Ruby protein gel stain: Comparison with the original formulation and with a ruthenium II tris (bathophenanthroline disulfonate) formulation , 2002, Proteomics.

[87]  Sandy Kennedy,et al.  Toxicoproteomics -- a new preclinical tool. , 2002, Drug discovery today.

[88]  T. Rabilloud Two‐dimensional gel electrophoresis in proteomics: Old, old fashioned, but it still climbs up the mountains , 2002, Proteomics.

[89]  M E J Newman,et al.  Community structure in social and biological networks , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[90]  M. Vihinen Bioinformatics in proteomics. , 2001, Biomolecular engineering.

[91]  W. Blackstock,et al.  Matching peptide mass spectra to EST and genomic DNA databases. , 2001, Trends in biotechnology.

[92]  William Stafford Noble,et al.  Analysis of strain and regional variation in gene expression in mouse brain , 2001, Genome Biology.

[93]  Alexander Pertsemlidis,et al.  Having a BLAST with bioinformatics (and avoiding BLASTphemy) , 2001, Genome Biology.

[94]  M J Dunn,et al.  Zooming‐in on the proteome: Very narrow‐range immobilised pH gradients reveal more protein species and isoforms , 2001, Electrophoresis.

[95]  P Dupree,et al.  Quantitative and reproducible two‐dimensional gel analysis using Phoretix 2D Full , 2001, Electrophoresis.

[96]  P. Ashton,et al.  Linking proteome and genome: how to identify parasite proteins. , 2001, Trends in parasitology.

[97]  R. Ozawa,et al.  A comprehensive two-hybrid analysis to explore the yeast protein interactome , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[98]  Matthew Davison,et al.  Validation and development of fluorescence two‐dimensional differential gel electrophoresis proteomics technology , 2001, Proteomics.

[99]  J Barrett,et al.  Thermal hysteresis proteins. , 2001, The international journal of biochemistry & cell biology.

[100]  J. Wojcik,et al.  The protein–protein interaction map of Helicobacter pylori , 2001, Nature.

[101]  김삼묘,et al.  “Bioinformatics” 특집을 내면서 , 2000 .

[102]  M. Molloy,et al.  Two-dimensional electrophoresis of membrane proteins using immobilized pH gradients. , 2000, Analytical biochemistry.

[103]  Wayne F. Patton,et al.  A thousand points of light: The application of fluorescence detection technologies to two‐dimensional gel electrophoresis and proteomics , 2000, Electrophoresis.

[104]  James R. Knight,et al.  A comprehensive analysis of protein–protein interactions in Saccharomyces cerevisiae , 2000, Nature.

[105]  P. Legrain,et al.  A genomic approach of the hepatitis C virus generates a protein interaction map. , 2000, Gene.

[106]  M. Saraste,et al.  FEBS Lett , 2000 .

[107]  I. Abrantes,et al.  Protein variability among Portuguese and other populations of Globodera rostochiensis revealed by two-dimensional gel electrophoresis with computed image analysis , 2000 .

[108]  J. Gauthier,et al.  Meloidogyne chitwoodi and M. fallax protein variation assessed by two-dimensional electrophoregram computed analysis , 1999 .

[109]  B Herbert,et al.  Advances in protein solubilisation for two‐dimensional electrophoresis , 1999, Electrophoresis.

[110]  R D Appel,et al.  Protein identification and analysis tools in the ExPASy server. , 1999, Methods in molecular biology.

[111]  A Bairoch,et al.  Multiple parameter cross‐species protein identification using MultiIdent ‐ a world‐wide web accessible tool , 1998, Electrophoresis.

[112]  R. Quatrano Genomics , 1998, Plant Cell.

[113]  Denis F. Hochstrasser,et al.  Proteome in Perspective , 1998, Clinical chemistry and laboratory medicine.

[114]  Christian Scheler,et al.  Peptide mass fingerprint sequence coverage from differently stained proteins on two‐dimensional electrophoresis patterns by matrix assisted laser desorption/ionization‐mass spectrometry (MALDI‐MS) , 1998, Electrophoresis.

[115]  Mckusick Va Genomics: structural and functional studies of genomes. , 1997 .

[116]  M. Wilkins,et al.  Cross-species protein identification using amino acid composition, peptide mass fingerprinting, isoelectric point and molecular mass: a theoretical evaluation. , 1997, Journal of theoretical biology.

[117]  P. Roepstorff,et al.  Mass spectrometry in protein studies from genome to function. , 1997, Current opinion in biotechnology.

[118]  Marc R. Wilkins,et al.  Protein Identification in Proteome Projects , 1997 .

[119]  P. Lemkin Comparing two‐dimensional electrophoretic gel images across the Internet , 1997, Electrophoresis.

[120]  D. Hochstrasser,et al.  Progress with proteome projects: why all proteins expressed by a genome should be identified and how to do it. , 1996, Biotechnology & genetic engineering reviews.

[121]  M. Wilm,et al.  Error-tolerant identification of peptides in sequence databases by peptide sequence tags. , 1994, Analytical chemistry.

[122]  N. Bailey,et al.  Biometrie: Modelisation de Phenomenes Biologiques. , 1994 .

[123]  L. Zeng,et al.  A combined classical genetic and high resolution two-dimensional electrophoretic approach to the assessment of the number of genes affecting hybrid male sterility in Drosophila simulans and Drosophila sechellia. , 1993, Genetics.

[124]  P. Højrup,et al.  Rapid identification of proteins by peptide-mass fingerprinting , 1993, Current Biology.

[125]  AC Tose Cell , 1993, Cell.

[126]  M. Coulthart,et al.  A comprehensive study of genic variation in natural populations of Drosophila melanogaster. VI. Patterns and processes of genic divergence between D. melanogaster and its sibling species, Drosophila simulans. , 1992, Genetics.

[127]  Ronald C. Beavis,et al.  Matrix-assisted laser desorption/ionization mass spectrometry of biopolymers. , 1991, Analytical chemistry.

[128]  J R Scherrer,et al.  The MELANIE project: From a biopsy to automatic protein map interpretation by computer , 1991, Electrophoresis.

[129]  C. G. Edmonds,et al.  New developments in biochemical mass spectrometry: electrospray ionization. , 1990, Analytical chemistry.

[130]  R. Zinovieva,et al.  Squid major lens polypeptides are homologous to glutathione S-transferases subunits , 1988, Nature.

[131]  Y. Umezawa,et al.  Enhancement of uphill transport by a double carrier membrane system. , 1988, Analytical chemistry.

[132]  M. Karas,et al.  Laser desorption ionization of proteins with molecular masses exceeding 10,000 daltons. , 1988, Analytical chemistry.

[133]  D. Hochstrasser,et al.  Automatic classification of two‐dimensional gel electrophoresis pictures by heuristic clustering analysis: A step toward machine learning , 1988, Electrophoresis.

[134]  S. O’Brien,et al.  A molecular phylogeny of the hominoid primates as indicated by two-dimensional protein electrophoresis. , 1987, Proceedings of the National Academy of Sciences of the United States of America.

[135]  T. Lang,et al.  Specific protein modifications are altered in a temperature-sensitive Drosophila developmental mutant. , 1984, Proceedings of the National Academy of Sciences of the United States of America.

[136]  A. Leigh Brown,et al.  Estimation of genetic variability in natural populations of Drosophila simulans by two-dimensional and starch gel electrophoresis. , 1982, Genetics.

[137]  N G Anderson,et al.  The TYCHO system for computer analysis of two-dimensional gel electrophoresis patterns. , 1981, Clinical chemistry.

[138]  Y. Sakoyama,et al.  Two-dimensional gel patterns of protein species during development of Drosophila embryos. , 1981, Developmental biology.

[139]  B. Bainbridge,et al.  Genetics , 1981, Experientia.

[140]  P. O’Farrell High resolution two-dimensional electrophoresis of proteins. , 1975, The Journal of biological chemistry.

[141]  P. Edman,et al.  A protein sequenator. , 1967, European journal of biochemistry.

[142]  S. Shapiro,et al.  An Analysis of Variance Test for Normality (Complete Samples) , 1965 .

[143]  高橋 秀俊 双対性をめぐって-6-双対性とは何か , 1965 .

[144]  R. Robinson Phytochemistry , 1962, Nature.

[145]  Bernard S. Wostmann,et al.  Panel of referees , 2007 .

[146]  K. Pearson Biometrika , 1902, The American Naturalist.