PanWeb: A web interface for pan-genomic analysis

With the increase in genomic data production since the advent of next-generation sequencing (NGS), new bioinformatics tools and fields of study, such as comparative genomics, have become necessary. In comparative genomics, the genetic material of one organism is compared directly with that of another to better understand the species involved. Moreover, the exponentially growing number of deposited prokaryotic genomes has enabled the investigation of genomic characteristics intrinsic to particular species. This led to a new approach to comparative genomics, termed pan-genomics, in which multiple organisms of the same species or genus are compared. Many tools can currently perform pan-genomic analyses, including PGAP (Pan-Genome Analysis Pipeline), Panseq (Pan-Genome Sequence Analysis Program) and PGAT (Prokaryotic Genome Analysis Tool). Of these, PGAP is written in Perl; it must be run from a UNIX terminal and requires an extensive, heavily parameterized command line, which can be a barrier for users without prior computational experience. The aim of this study was therefore to develop PanWeb, a web application that serves as a graphical interface for PGAP. In addition, PanWeb uses the output files of the PGAP pipeline to generate graphics with custom scripts written in the R programming language. PanWeb is freely available at http://www.computationalbiology.ufpa.br/panweb.
