Seqping: gene prediction pipeline for plant genomes using self-training gene models and transcriptomic data

[1]  Katharina J. Hoff,et al.  BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS , 2016, Bioinform..

[2]  Evgeny M. Zdobnov,et al.  BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs , 2015, Bioinform..

[3]  James K. Hane,et al.  CodingQuarry: highly accurate hidden Markov model gene prediction in fungal genomes using RNA-seq transcripts , 2015, BMC Genomics.

[4]  Shweta Mehrotra,et al.  Repetitive Sequences in Plant Nuclear DNA: Types, Distribution, Evolution and Function , 2014, Genom. Proteom. Bioinform..

[5]  Rick L. Stevens,et al.  High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource , 2014, Proceedings of the National Academy of Sciences.

[6]  Bernhard Y. Renard,et al.  GIIRA - RNA-Seq driven gene finding incorporating ambiguous reads , 2014, Bioinform..

[7]  Gregory Butler,et al.  SnowyOwl: accurate prediction of fungal genes by using RNA-Seq and homology information to select among ab initio models , 2014, BMC Bioinformatics.

[8]  Carolyn J. Lawrence-Dill,et al.  MAKER-P: A Tool Kit for the Rapid Creation, Management, and Quality Control of Plant Genome Annotations , 2013, Plant Physiology.

[9]  J. Harrow,et al.  Assessment of transcript reconstruction methods for RNA-seq , 2013, Nature Methods.

[10]  Neelam Goel,et al.  A comparative analysis of soft computing techniques for gene prediction. , 2013, Analytical biochemistry.

[11]  Gordon Gremme,et al.  GenomeTools: A Comprehensive Software Library for Efficient Processing of Structured Genome Annotations , 2013, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[12]  S. Eddy,et al.  Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions , 2013, Nucleic acids research.

[13]  Neelam Goel,et al.  A Review of Soft Computing Techniques for Gene Prediction , 2013.

[14]  T. Tatarinova,et al.  Evaluation of Codon Biology in Citrus and Poncirus trifoliata Based on Genomic Features and Frame Corrected Expressed Sequence Tags , 2013, DNA research : an international journal for rapid publication of reports on genes and genomes.

[15]  The UniProt Consortium,et al.  Update on activities at the Universal Protein Resource (UniProt) in 2013 , 2012, Nucleic Acids Res..

[16]  D. Schwartz,et al.  Improvement of the Oryza sativa Nipponbare reference genome using next generation sequence and optical map data , 2013, Rice.

[17]  Paul J. Kennedy,et al.  Evaluating High-Throughput Ab Initio Gene Finders to Discover Proteins Encoded in Eukaryotic Pathogen Genomes Missed by Laboratory Techniques , 2012, PloS one.

[18]  Zhengwei Zhu,et al.  CD-HIT: accelerated for clustering the next-generation sequencing data , 2012, Bioinform..

[19]  Volker Brendel,et al.  ParsEval: parallel comparison and analysis of gene structure annotations , 2012, BMC Bioinformatics.

[20]  Tanya Z. Berardini,et al.  The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools , 2011, Nucleic Acids Res..

[21]  Tatiana A. Tatusova,et al.  NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy , 2011, Nucleic Acids Res..

[22]  Mark Yandell,et al.  MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects , 2011, BMC Bioinformatics.

[23]  Kui Lin,et al.  RNA-Seq improves annotation of protein-coding genes in the cucumber genome , 2011, BMC Genomics.

[24]  José M. Sempere,et al.  The Gypsy Database (GyDB) of mobile genetic elements: release 2.0 , 2010, Nucleic Acids Res..

[25]  Sean R. Eddy,et al.  Hidden Markov model speed heuristic and iterative HMM search procedure , 2010, BMC Bioinformatics.

[26]  Roy D. Sleator,et al.  An overview of the current status of eukaryote gene prediction strategies. , 2010, Gene.

[27]  Ning Ma,et al.  BLAST+: architecture and applications , 2009, BMC Bioinformatics.

[28]  Cheng Soon Ong,et al.  mGene: accurate SVM-based gene finding with an application to nematode genomes. , 2009, Genome research.

[29]  Nickolai N Alexandrov,et al.  Genome-wide discovery of cis-elements in promoter sequences using gene expression. , 2009, Omics : a journal of integrative biology.

[30]  M. Borodovsky,et al.  Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training. , 2008, Genome research.

[31]  J. Bouck,et al.  Insights into corn genes derived from large-scale cDNA sequencing , 2008, Plant Molecular Biology.

[32]  Alexander Souvorov,et al.  Splign: algorithms for computing spliced alignments with identification of paralogs , 2008, Biology Direct.

[33]  David Haussler,et al.  Using native and syntenically mapped cDNA alignments to improve de novo gene finding , 2008, Bioinform..

[34]  Sofia M. C. Robb,et al.  MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. , 2007, Genome research.

[35]  J. Galagan,et al.  Conrad: gene prediction using conditional random fields. , 2007, Genome research.

[36]  Tatiana Tatusova,et al.  NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins , 2004, Nucleic Acids Res..

[37]  V. Solovyev,et al.  Automatic annotation of eukaryotic genes, pseudogenes and promoters , 2006, Genome Biology.

[38]  Jonathan E. Allen,et al.  JIGSAW, GeneZilla, and GlimmerHMM: puzzling out the features of human genes in the ENCODE regions , 2006, Genome Biology.

[39]  Burkhard Morgenstern,et al.  Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources , 2006, BMC Bioinformatics.

[40]  M. Borodovsky,et al.  Gene identification in novel eukaryotic genomes by self-training algorithm , 2005, Nucleic acids research.

[41]  Steven Salzberg,et al.  JIGSAW: integration of multiple sources of evidence for gene prediction , 2005, Bioinform..

[42]  J. Jurka,et al.  Repbase Update, a database of eukaryotic repetitive elements , 2005, Cytogenetic and Genome Research.

[43]  N. Alexandrov,et al.  Features of Arabidopsis Genes and Genome Discovered using Full-length cDNAs , 2005, Plant Molecular Biology.

[44]  Steven Salzberg,et al.  TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders , 2004, Bioinform..

[45]  Ian Korf,et al.  Gene finding in novel genomes , 2004, BMC Bioinformatics.

[46]  C. Robin Buell,et al.  The TIGR Plant Repeat Databases: a collective resource for the identification of repetitive sequences in plants , 2004, Nucleic Acids Res..

[47]  Mario Stanke,et al.  Gene prediction with a hidden Markov model and a new intron submodel , 2003, ECCB.

[48]  Nickolai Alexandrov,et al.  Skew in CG content near the transcription start site in Arabidopsis thaliana , 2003, ISMB.

[49]  W. D’Haeze.  Faster DNA sequencing , 2002, Genome Biology.

[50]  B. Haas,et al.  Full-length messenger RNA sequences greatly improve genome annotation , 2002, Genome Biology.

[51]  I. Longden,et al.  EMBOSS: the European Molecular Biology Open Software Suite. , 2000, Trends in genetics : TIG.

[52]  V. Solovyev,et al.  Ab initio gene finding in Drosophila genomic DNA. , 2000, Genome research.

[53]  Edward C. Uberbacher,et al.  GRAIL: a multi-agent neural network system for gene identification , 1996, Proc. IEEE.

[54]  R. Guigó,et al.  Evaluation of gene structure prediction programs. , 1996, Genomics.

[55]  E. Snyder,et al.  Identification of protein coding regions in genomic DNA. , 1995, Journal of molecular biology.

[56]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.