Predicting functional upstream open reading frames in Saccharomyces cerevisiae

Background: Some upstream open reading frames (uORFs) regulate gene expression (i.e., they are functional) and can play key roles in keeping organisms healthy. However, how uORFs are involved in gene regulation is not yet fully understood. A complete view of their role is expected to require a large number of experimentally verified functional uORFs. Unfortunately, wet-lab experiments to verify that uORFs are functional are expensive.

Results: In this paper, a new computational approach to predicting functional uORFs in the yeast Saccharomyces cerevisiae is presented. Our approach is based on inductive logic programming and makes use of a novel combination of knowledge about biological conservation, Gene Ontology annotations and genes' responses to different conditions. Our method results in a set of simple and informative hypotheses with an estimated sensitivity of 76%. The hypotheses predict 301 further genes to have 398 novel functional uORFs. Three of these 301 genes (RPC11, TPK1, and FOL1) have been hypothesised by a related study, following wet-lab experiments, to have functional uORFs. A comparison with another related study suggests that eleven of the predicted functional uORFs, from genes LDB17, HEM3, CIN8, BCK2, PMC1, FAS1, APP1, ACC1, CKA2, SUR1, and ATH1, are strong candidates for wet-lab experimental studies.

Conclusions: Learning-based prediction of functional uORFs can be done with high sensitivity. The predictions made in this study can serve as a list of candidates for subsequent wet-lab verification and might help to elucidate the regulatory roles of uORFs.
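For readers unfamiliar with the metric, sensitivity is the fraction of truly functional uORFs that a set of hypotheses labels as functional. The sketch below uses the standard definition only; the function name and the counts are hypothetical and are not taken from this study, whose own estimation procedure is not described in the abstract.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of known positives that were predicted positive.

    Standard definition: TP / (TP + FN).
    """
    total_positives = true_positives + false_negatives
    if total_positives == 0:
        raise ValueError("sensitivity is undefined without any positive examples")
    return true_positives / total_positives


# Hypothetical counts for illustration only (not data from the study):
# 30 held-out functional uORFs recovered, 10 missed.
print(f"{sensitivity(30, 10):.2f}")
```

Sensitivity alone says nothing about false positives, which is consistent with the abstract presenting the predictions as candidates for wet-lab verification rather than as confirmed results.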
