MakeHub: Fully Automated Generation of UCSC Genome Browser Assembly Hubs

Novel genomes are today often annotated by small consortia or individuals who lack a background in bioinformatics. This audience requires tools that are easy to use, a need that has been addressed by several genome annotation tools and pipelines. Visualizing the resulting annotation is a crucial step of quality control. The UCSC Genome Browser is a powerful and popular genome visualization tool. Assembly Hubs allow browsing locally hosted genomes via already available UCSC Genome Browser servers. The steps for creating custom Assembly Hubs are well documented, and the required tools are publicly available. However, creating a novel Assembly Hub requires a large number of steps. In some cases, the format of input files needs to be adapted, which is a difficult task for scientists without a programming background. Here, we describe the novel command line tool MakeHub, which generates Assembly Hubs for the UCSC Genome Browser in a fully automated fashion. The pipeline also allows extending previously created Hubs with additional tracks. MakeHub is freely available for download from https://github.com/Gaius-Augustus/MakeHub. Contact: katharina.hoff@uni-greifswald.de
