FINDER: an automated software package to annotate eukaryotic genes from RNA-Seq data and associated protein sequences

Background: Gene annotation in eukaryotes is a non-trivial task that requires meticulous analysis of accumulated transcript data. Challenges include transcriptionally active regions of the genome that contain overlapping genes, genes that produce numerous transcripts, transposable elements and numerous diverse sequence repeats. Currently available gene annotation software applications depend on pre-constructed full-length gene sequence assemblies which are not guaranteed to be error-free. The origins of these sequences are often uncertain, making it difficult to identify and rectify errors in them. This hinders the creation of an accurate and holistic representation of the transcriptomic landscape across multiple tissue types and experimental conditions. Therefore, to gauge the extent of diversity in gene structures, a comprehensive analysis of genome-wide expression data is imperative.

Results: We present FINDER, a fully automated computational tool that optimizes the entire process of annotating genes and transcript structures. Unlike current state-of-the-art pipelines, FINDER automates the RNA-Seq pre-processing step by working directly with raw sequence reads and optimizes gene prediction from BRAKER2 by supplementing these reads with associated proteins. The FINDER pipeline (1) reports transcripts and recognizes genes that are expressed under specific conditions, (2) generates all possible alternatively spliced transcripts from expressed RNA-Seq data, (3) analyzes read coverage patterns to modify existing transcript models and create new ones, and (4) scores genes as high- or low-confidence based on the available evidence across multiple datasets. We demonstrate the ability of FINDER to automatically annotate a diverse pool of genomes from eight species.

Conclusions: FINDER takes a completely automated approach to annotate genes directly from raw expression data. It is capable of processing eukaryotic genomes of all sizes and requires no manual supervision, which makes it ideal for bench researchers with limited experience in handling computational tools.
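To make step (4) of the pipeline concrete, the following is a minimal sketch of how a gene might be labelled high- or low-confidence from evidence pooled across datasets. The abstract does not state FINDER's actual scoring criteria, so everything here is an assumption made for illustration: the `GeneEvidence` record, the `score_confidence` function, the `min_samples` threshold, and the rule that protein support alone is sufficient.

```python
from dataclasses import dataclass, field


@dataclass
class GeneEvidence:
    """Hypothetical per-gene evidence record (not FINDER's data model)."""
    gene_id: str
    # Names of RNA-Seq samples in which the gene model is supported.
    supporting_samples: set[str] = field(default_factory=set)
    # Whether an associated protein sequence aligns to the gene model.
    has_protein_support: bool = False


def score_confidence(gene: GeneEvidence, min_samples: int = 2) -> str:
    """Label a gene 'high' or 'low' confidence.

    Assumed rule: a gene is high-confidence if it has protein support
    or is supported by at least `min_samples` distinct RNA-Seq samples.
    """
    if gene.has_protein_support:
        return "high"
    if len(gene.supporting_samples) >= min_samples:
        return "high"
    return "low"


# Toy input: three invented genes with different amounts of evidence.
genes = [
    GeneEvidence("gene_A", {"leaf", "root", "seed"}),
    GeneEvidence("gene_B", {"leaf"}),
    GeneEvidence("gene_C", set(), has_protein_support=True),
]

labels = {g.gene_id: score_confidence(g) for g in genes}
print(labels)
```

Under these assumed rules, a gene seen in only one sample and lacking protein support is labelled low-confidence, while support from several samples or from an associated protein raises it to high-confidence.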


[91]  Sagnik Banerjee,et al.  Improving protein protein interaction prediction by choosing appropriate physiochemical properties of amino acids , 2015, 2015 International Conference and Workshop on Computing and Communication (IEMCON).

[92]  Dmitri D. Pervouchine,et al.  The human transcriptome across tissues and individuals , 2015, Science.

[93]  M. Lorieux,et al.  Whole Genome Sequencing of Elite Rice Cultivars as a Comprehensive Information Resource for Marker Assisted Selection , 2015, PloS one.

[94]  Lei Wang,et al.  ALDB: A Domestic-Animal Long Noncoding RNA Database , 2015, PloS one.

[95]  S. Salzberg,et al.  StringTie enables improved reconstruction of a transcriptome from RNA-seq reads , 2015, Nature Biotechnology.

[96]  N. Grishin,et al.  Insights into the Evolution of Longevity from the Bowhead Whale Genome , 2015, Cell reports.

[97]  Mita Nasipuri,et al.  JUPred_SVM: Prediction of Phosphorylation Sites Using a Consensus of SVM Classifiers , 2015, SocProS.

[98]  Subhadip Basu,et al.  Big Data Analytics and Its Prospects in Computational Proteomics , 2015 .

[99]  Chun-Ming Liu A single-nucleotide exon found in , 2015 .

[100]  Subhadip Basu,et al.  JUPred_MLP: Prediction of Phosphorylation Sites Using a Consensus of MLP Classifiers , 2015, FICTA.

[101]  Robert J. Weatheritt,et al.  A Highly Conserved Program of Neuronal Microexons Is Misregulated in Autistic Brains , 2014, Cell.

[102]  M. Yandell,et al.  Genome Annotation and Curation Using MAKER and MAKER‐P , 2014, Current protocols in bioinformatics.

[103]  Eve Syrkin Wurtele,et al.  Coming of age: orphan genes in plants. , 2014, Trends in plant science.

[104]  Alexandre Lomsadze,et al.  Identification of protein coding regions in RNA transcripts , 2014, BCB.

[105]  Pierre Baldi,et al.  SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity , 2014, Bioinform..

[106]  A. Quinlan BEDTools: The Swiss‐Army Tool for Genome Feature Analysis , 2014, Current protocols in bioinformatics.

[107]  M. Borodovsky,et al.  Integration of mapped RNA-Seq reads into automatic training of eukaryotic gene finding algorithm , 2014, Nucleic acids research.

[108]  Idris A. Eckley,et al.  changepoint: An R Package for Changepoint Analysis , 2014 .

[109]  Jun Wang,et al.  The 3,000 rice genomes project: new opportunities and challenges for future rice research , 2014, GigaScience.

[110]  V. S. Rao,et al.  Protein-Protein Interaction Detection: Methods and Analysis , 2014, International journal of proteomics.

[111]  Melissa J. Landrum,et al.  RefSeq: an update on mammalian reference sequences , 2013, Nucleic Acids Res..

[112]  Xun Xu,et al.  SOAPdenovo-Trans: de novo transcriptome assembly with short RNA-Seq reads , 2013, Bioinform..

[113]  Carolyn J. Lawrence-Dill,et al.  MAKER-P: A Tool Kit for the Rapid Creation, Management, and Quality Control of Plant Genome Annotations1[W][OPEN] , 2013, Plant Physiology.

[114]  J. Logan,et al.  The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system , 2013, Proceedings of the National Academy of Sciences.

[115]  J. Harrow,et al.  Systematic evaluation of spliced alignment programs for RNA-seq data , 2013, Nature Methods.

[116]  J. Harrow,et al.  Assessment of transcript reconstruction methods for RNA-seq , 2013, Nature Methods.

[117]  T. Gingeras,et al.  RAMPAGE: Promoter Activity Profiling by Paired‐End Sequencing of 5′‐Complete cDNAs , 2013, Current protocols in molecular biology.

[118]  R. Wilson,et al.  The Next-Generation Sequencing Revolution and Its Impact on Genomics , 2013, Cell.

[119]  L. Hood,et al.  The Human Genome Project: big science transforms biology and medicine , 2013, Genome Medicine.

[120]  Shaojun Xie,et al.  Genome-wide annotation of genes and noncoding RNAs of foxtail millet in response to simulated drought stress by deep sequencing , 2013, Plant Molecular Biology.

[121]  Michael Q. Zhang,et al.  OLego: fast and sensitive mapping of spliced mRNA-Seq reads using small seeds , 2013, Nucleic acids research.

[122]  Thomas R. Gingeras,et al.  STAR: ultrafast universal RNA-seq aligner , 2013, Bioinform..

[123]  Sergey I. Nikolenko,et al.  SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing , 2012, J. Comput. Biol..

[124]  David R. Kelley,et al.  Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks , 2012, Nature Protocols.

[125]  Mark Yandell,et al.  MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects , 2011, BMC Bioinformatics.

[126]  N. Friedman,et al.  Trinity: reconstructing a full-length transcriptome without a genome from RNA-Seq data , 2011, Nature Biotechnology.

[127]  N. Friedman,et al.  Trinity : reconstructing a full-length transcriptome without a genome from RNA-Seq data , 2016 .

[128]  Hideaki Sugawara,et al.  The Sequence Read Archive , 2010, Nucleic Acids Res..

[129]  Leo Goodstadt,et al.  Ruffus: a lightweight Python library for computational pipelines , 2010, Bioinform..

[130]  C. Bustamante,et al.  Genomic Diversity and Introgression in O. sativa Reveal the Impact of Domestication and Breeding on the Rice Genome , 2010, PloS one.

[131]  Cole Trapnell,et al.  Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. , 2010, Nature biotechnology.

[132]  Xianghuo He,et al.  Multiple microRNAs modulate p21Cip1/Waf1 expression by directly targeting its 3′ untranslated region , 2010, Oncogene.

[133]  Birgit Eisenhaber,et al.  Prediction of posttranslational modification of proteins from their amino acid sequence. , 2010, Methods in molecular biology.

[134]  Masashi Sugiyama,et al.  Change-Point Detection in Time-Series Data by Direct Density-Ratio Estimation , 2009, SDM.

[135]  Dawn H. Nagel,et al.  The B73 Maize Genome: Complexity, Diversity, and Dynamics , 2009, Science.

[136]  Cheng Soon Ong,et al.  mGene: accurate SVM-based gene finding with an application to nematode genomes. , 2009, Genome research.

[137]  Lan Jin,et al.  Biological basis for restriction of microRNA targets to the 3' untranslated region in mammalian mRNAs. , 2009, Nature structural & molecular biology.

[138]  Karen Eilbeck,et al.  Quantitative measures for the management and comparison of annotated genomes , 2009, BMC Bioinformatics.

[139]  M. Borodovsky,et al.  Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training. , 2008, Genome research.

[140]  F. Slack,et al.  A SNP in a let-7 microRNA complementary site in the KRAS 3' untranslated region increases non-small cell lung cancer risk. , 2008, Cancer research.

[141]  E. Birney,et al.  Velvet: algorithms for de novo short read assembly using de Bruijn graphs. , 2008, Genome research.

[142]  O. Mühlemann,et al.  Posttranscriptional Gene Regulation by Spatial Rearrangement of the 3′ Untranslated Region , 2008, PLoS biology.

[143]  David Haussler,et al.  Using native and syntenically mapped cDNA alignments to improve de novo gene finding , 2008, Bioinform..

[144]  Sofia M. C. Robb,et al.  MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. , 2007, Genome research.

[145]  Melanie A. Huntley,et al.  Evolution of genes and genomes on the Drosophila phylogeny , 2007, Nature.

[146]  Yoshinobu Kawahara,et al.  Change-Point Detection in Time-Series Data Based on Subspace Identification , 2007, Seventh IEEE International Conference on Data Mining (ICDM 2007).

[147]  A. J. Conner,et al.  Intron-rich gene structure in the intracellular plant parasite Plasmodiophora brassicae. , 2007, Protist.

[148]  R. Lund,et al.  Changepoint Detection in Periodic and Autocorrelated Time Series , 2007 .

[149]  Hongjoong Kim,et al.  A novel approach to detection of intrusions in computer networks via adaptive sequential and batch-sequential change-point detection methods , 2006, IEEE Transactions on Signal Processing.

[150]  V. Solovyev,et al.  Automatic annotation of eukaryotic genes, pseudogenes and promoters , 2006, Genome Biology.

[151]  Yu Xue,et al.  MeMo: a web tool for prediction of protein methylation modifications , 2006, Nucleic Acids Res..

[152]  R. Wise,et al.  Upstream open reading frames of the barley Mla13 powdery mildew resistance gene function co-operatively to down-regulate translation. , 2006, Molecular plant pathology.

[153]  Kenji Yamanishi,et al.  A unifying framework for detecting outliers and change points from time series , 2006, IEEE Transactions on Knowledge and Data Engineering.

[154]  R. Myers,et al.  Comprehensive analysis of transcriptional promoter structure and function in 1% of the human genome. , 2005, Genome research.

[155]  Janet M Thornton,et al.  Protein function prediction using local 3D templates. , 2005, Journal of molecular biology.

[156]  Samuel S. Gross,et al.  Begin at the beginning: predicting genes with 5' UTRs. , 2005, Genome research.

[157]  D. Duvick The Contribution of Breeding to Yield Advances in maize (Zea mays L.) , 2005 .

[158]  Ewan Birney,et al.  Automated generation of heuristics for biological sequence comparison , 2005, BMC Bioinformatics.

[159]  Ian Korf,et al.  Gene finding in novel genomes , 2004, BMC Bioinformatics.

[160]  Yanda Li,et al.  The impact of very short alternative splicing on protein structures and functions in the human genome. , 2004, Trends in genetics : TIG.

[161]  Meena Kishore Sakharkar,et al.  Distributions of exons and introns in the human genome , 2004, Silico Biol..

[162]  H. Agrama,et al.  Mapping QTLs in breeding for drought tolerance in maize (Zea mays L.) , 2004, Euphytica.

[163]  J. Kawai,et al.  Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[164]  Stephen M. Mount,et al.  Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. , 2003, Nucleic acids research.

[165]  Qunfeng Dong,et al.  GeneSeqer add PlantGDB: gene structure prediction in plant genomes , 2003, Nucleic Acids Res..

[166]  F. Wei,et al.  Powdery Mildew-Induced Mla mRNAs Are Alternatively Spliced and Contain Multiple Upstream Open Reading Frames1 , 2003, Plant Physiology.

[167]  G. Rubin,et al.  Computational analysis of core promoters in the Drosophila genome , 2002, Genome Biology.

[168]  Xudong Huang,et al.  An Iron-responsive Element Type II in the 5′-Untranslated Region of the Alzheimer's Amyloid Precursor Protein Transcript* , 2002, The Journal of Biological Chemistry.

[169]  H. Meijer,et al.  Control of eukaryotic protein synthesis by upstream open reading frames in the 5'-untranslated region of an mRNA. , 2002, The Biochemical journal.

[170]  Ikuo Inoue,et al.  A common polymorphism in the 5'-untranslated region of the VEGF gene is associated with diabetic retinopathy in type 2 diabetes. , 2002, Diabetes.

[171]  B. Madras,et al.  Polymorphisms in the 3′-untranslated region of human and monkey dopamine transporter genes affect reporter gene expression , 2002, Molecular Psychiatry.

[172]  J. V. Moran,et al.  Initial sequencing and analysis of the human genome. , 2001, Nature.

[173]  International Human Genome Sequencing Consortium Initial sequencing and analysis of the human genome , 2001, Nature.

[174]  Béatrice Conne,et al.  The 3′ untranslated region of messenger RNA: A molecular ‘hotspot’ for pathology? , 2000, Nature Medicine.

[175]  Thomas M. McIntyre,et al.  Post-transcriptional Control of Cyclooxygenase-2 Gene Expression , 2000, The Journal of Biological Chemistry.

[176]  Liam J. McGuffin,et al.  The PSIPRED protein structure prediction server , 2000, Bioinform..

[177]  Andrew Smith Genome sequence of the nematode C-elegans: A platform for investigating biology , 1998 .

[178]  J. Berg Genome sequence of the nematode C. elegans: a platform for investigating biology. , 1998, Science.

[179]  Klaus Hermann,et al.  GeneGenerator - a flexible algorithm for gene prediction and its application to maize sequences , 1998, Bioinform..

[180]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[181]  R. Jackson,et al.  Do the poly(A) tail and 3′ untranslated region control mRNA translation? , 1990, Cell.

[182]  R. Yogev,et al.  Primary Peritonitis Associated with Hemophilus influenzae Bacteremia in a Normal Child , 1983, Clinical pediatrics.

[183]  R. Longhurst Agricultural production and food consumption: some neglected linkages. , 1983, Food and nutrition.

[184]  P. Krakowka,et al.  Determination of para-amino-salicylic acid in the saliva as check-up of treatment with this drug. , 1966, Polish medical journal.