G-OnRamp: Generating genome browsers to facilitate undergraduate-driven collaborative genome annotation

Scientists are sequencing new genomes at an increasing rate with the goal of associating genome contents with phenotypic traits. After a new genome is sequenced and assembled, structural gene annotation is often the first step in analysis. Despite advances in computational gene prediction algorithms, most eukaryotic genomes still benefit from manual gene annotation. Undergraduates can become skilled annotators, and in the process learn both about genes/genomes and about how to utilize large datasets. Data visualizations provided by a genome browser are essential for manual gene annotation, enabling annotators to quickly evaluate multiple lines of evidence (e.g., sequence similarity, RNA-Seq, gene predictions, repeats). However, creating genome browsers requires extensive computational skills; lack of the expertise required remains a major barrier for many biomedical researchers and educators. To address these challenges, the Genomics Education Partnership (GEP; https://gep.wustl.edu/) has partnered with the Galaxy Project (https://galaxyproject.org) to develop G-OnRamp (http://g-onramp.org), a web-based platform for creating UCSC Assembly Hubs and JBrowse genome browsers. G-OnRamp can also convert a JBrowse instance into an Apollo instance for collaborative genome annotations in research and educational settings. G-OnRamp enables researchers to easily visualize their experimental results, educators to create Course-based Undergraduate Research Experiences (CUREs) centered on genome annotation, and students to participate in genomics research. Development of G-OnRamp was guided by extensive user feedback from in-person workshops. Sixty-five researchers and educators from over 40 institutions participated in these workshops, which produced over 20 genome browsers now available for research and education. 
For example, genome browsers for four parasitoid wasp species were used in a CURE that engaged 142 students taught by 13 faculty members and produced a total of 192 gene models. G-OnRamp can be deployed on a personal computer or on cloud computing platforms, and the genome browsers produced can be transferred to the CyVerse Data Store for long-term access.
