Integrative bioinformatics for functional genome annotation: trawling for G protein-coupled receptors.

G protein-coupled receptors (GPCRs) are amongst the best studied and most functionally diverse types of cell-surface protein. The importance of GPCRs as mediators of cell function and organismal development underlies their involvement in key physiological roles and their prominence as targets for pharmacological therapeutics. In this review, we highlight the requirement for integrated protocols which underline the different perspectives offered by different sequence analysis methods. BLAST and FastA offer broad brush strokes. Motif-based search methods add the fine detail. Structural modelling offers another perspective, which allows us to elucidate the physicochemical properties that underlie ligand binding. Together, these different views provide a more informative and more detailed picture of GPCR structure and function. Many GPCRs remain orphan receptors with no identified ligand, yet as computer-driven functional genomics starts to elaborate their functions, a new understanding of their roles in cell and developmental biology will follow.

[1]  M. Nei,et al.  Evolution of olfactory receptor genes in the human genome , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[2]  K. Palczewski,et al.  Crystal Structure of Rhodopsin: A G‐Protein‐Coupled Receptor , 2002, Chembiochem : a European journal of chemical biology.

[3]  E. Pebay-Peyroula,et al.  X-ray structure of bacteriorhodopsin at 2.5 angstroms from microcrystals grown in lipidic cubic phases. , 1997, Science.

[4]  J. Bockaert,et al.  Molecular tinkering of G protein‐coupled receptors: an evolutionary success , 1999, The EMBO journal.

[5]  H. Matter,et al.  Structural classification of protein kinases using 3D molecular interaction field analysis of their ligand binding sites: target family landscapes. , 2002, Journal of medicinal chemistry.

[6]  Peer Bork,et al.  SMART 4.0: towards genomic data integration , 2004, Nucleic Acids Res..

[7]  Francine B. Perler,et al.  InBase: the Intein Database , 2002, Nucleic Acids Res..

[8]  G. Klebe,et al.  Molecular similarity indices in a comparative analysis (CoMSIA) of drug molecules to correlate and predict their biological activity. , 1994, Journal of medicinal chemistry.

[9]  David Haussler,et al.  Classifying G-protein coupled receptors with support vector machines , 2002, Bioinform..

[10]  D R Flower,et al.  Modelling G-protein-coupled receptors for drug design. , 1999, Biochimica et biophysica acta.

[11]  Amos Bairoch,et al.  Recent improvements to the PROSITE database , 2004, Nucleic Acids Res..

[12]  A. Methner,et al.  Phylogenetic analysis of 277 human G-protein-coupled receptors as a tool for the prediction of orphan receptor ligands , 2002, Genome Biology.

[13]  D. Lipman,et al.  Rapid and sensitive protein similarity searches. , 1985, Science.

[14]  Terri K. Attwood,et al.  BLAST PRINTS - alternative perspectives on sequence similarity , 1999, Bioinform..

[15]  Terri K. Attwood,et al.  PRINTS-S: the database formerly known as PRINTS , 2000, Nucleic Acids Res..

[16]  Qing Zhou,et al.  AsMamDB: an alternative splice database of mammals , 2001, Nucleic Acids Res..

[17]  R. Lefkowitz The superfamily of heptahelical receptors , 2000, Nature Cell Biology.

[18]  David E. Gloriam,et al.  Seven evolutionarily conserved human rhodopsin G protein‐coupled receptors lacking close relatives , 2003, FEBS letters.

[19]  Kay Hofmann,et al.  Protein classification and functional assignment , 1998 .

[20]  T K Attwood,et al.  Fingerprinting G-protein-coupled receptors. , 1994, Protein engineering.

[21]  T. Attwood,et al.  PRINTS--a protein motif fingerprint database. , 1994, Protein engineering.

[22]  Alex Bateman,et al.  The InterPro database, an integrated documentation resource for protein families, domains and functional sites , 2001, Nucleic Acids Res..

[23]  Owen White,et al.  The TIGRFAMs database of protein families , 2003, Nucleic Acids Res..

[24]  D C Teller,et al.  Advances in determination of a high-resolution three-dimensional structure of rhodopsin, a model of G-protein-coupled receptors (GPCRs). , 2001, Biochemistry.

[25]  A. Michie,et al.  CINEMA--a novel colour INteractive editor for multiple alignments. , 1998, Gene.

[26]  T K Attwood,et al.  A compendium of specific motifs for diagnosing GPCR subtypes. , 2001, Trends in pharmacological sciences.

[27]  R. Henderson,et al.  Model for the structure of bacteriorhodopsin based on high-resolution electron cryo-microscopy. , 1990, Journal of molecular biology.

[28]  Shmuel Pietrokovski,et al.  Increased coverage of protein families with the Blocks Database servers , 2000, Nucleic Acids Res..

[29]  Alex Bateman,et al.  The InterPro Database, 2003 brings increased coverage and new features , 2003, Nucleic Acids Res..

[30]  A. Bairoch PROSITE: a dictionary of sites and patterns in proteins. , 1991, Nucleic acids research.

[31]  A Bairoch,et al.  Protein annotation: detective work for function prediction. , 1998, Trends in genetics : TIG.

[32]  P. Goodford A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. , 1985, Journal of medicinal chemistry.

[33]  Didier Rognan,et al.  Protein‐based virtual screening of chemical databases. II. Are homology models of g‐protein coupled receptors suitable targets? , 2002, Proteins.

[34]  J. Gutkind,et al.  G-protein-coupled receptors and signaling networks: emerging paradigms. , 2001, Trends in pharmacological sciences.

[35]  Terri K. Attwood,et al.  PRINTS and its automatic supplement, prePRINTS , 2003, Nucleic Acids Res..

[36]  Terri K. Attwood,et al.  FingerPRINTScan: intelligent searching of the PRINTS motif database , 1999, Bioinform..

[37]  R. Wade,et al.  Classification of protein sequences by homology modeling and quantitative analysis of electrostatic similarity , 1999, Proteins.

[38]  Walter R. Gilks,et al.  Modeling the percolation of annotation errors in a database of protein sequences , 2002, Bioinform..

[39]  H. Schiöth,et al.  The G-protein-coupled receptors in the human genome form five main families. Phylogenetic analysis, paralogon groups, and fingerprints. , 2003, Molecular pharmacology.

[40]  R. Lefkowitz,et al.  Heptahelical Receptor Signaling: Beyond the G Protein Paradigm , 1999, The Journal of cell biology.

[41]  R. Cramer,et al.  Recent advances in comparative molecular field analysis (CoMFA). , 1989, Progress in clinical and biological research.

[42]  Thomas L. Madden,et al.  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. , 1997, Nucleic acids research.

[43]  Michal Linial,et al.  How incorrect annotations evolve--the case of short ORFs. , 2003, Trends in biotechnology.

[44]  Emmanuel Barillot,et al.  DBcat: a catalog of 500 biological databases , 2000, Nucleic Acids Res..

[45]  D. Bergsma,et al.  Orphan G protein-coupled receptors: a neglected opportunity for pioneer drug discovery. , 1997, Trends in pharmacological sciences.

[46]  T. Lundstedt,et al.  Classification of G‐protein coupled receptors by alignment‐independent extraction of principal chemical properties of primary amino acid sequences , 2002, Protein science : a publication of the Protein Society.