Reference-agnostic representation and visualization of pan-genomes

Background: The pan-genome of a species is the union of the genes and non-coding sequences present in all individuals (cultivars, accessions, or strains) within that species.

Results: Here we introduce PGV, a reference-agnostic representation of the pan-genome of a species based on the notion of consensus ordering. Our experimental results demonstrate that PGV enables an intuitive, effective, and interactive visualization of a pan-genome by providing a genome browser that can elucidate complex structural genomic variations.

Conclusions: The PGV software can be installed via conda or downloaded from https://github.com/ucrbioinfo/PGV. The companion PGV browser at http://pgv.cs.ucr.edu can be tested using example BED tracks available from the GitHub page.
