PGAP: pan-genomes analysis pipeline

Summary: With the rapid development of DNA sequencing technology, the growing volume of bacterial genome data enables biologists to mine the evolutionary and genetic information of prokaryotic species from a pan-genome perspective. High-efficiency pipelines for pan-genome analysis are therefore much needed. We have developed a new pan-genome analysis pipeline (PGAP), which performs five analytic functions with a single command: cluster analysis of functional genes, pan-genome profile analysis, genetic variation analysis of functional genes, species evolution analysis and function enrichment analysis of gene clusters. PGAP's performance has been evaluated on 11 Streptococcus pyogenes strains.

Availability: PGAP is developed in Perl on the Linux platform, and the package is freely available from http://pgap.sf.net.

Contact: junyu@big.ac.cn; xiaojingfa@big.ac.cn

Supplementary information: Supplementary data are available at Bioinformatics online.
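As an illustration of what a pan-genome profile analysis computes, the following minimal Python sketch counts pan-genome and core-genome sizes from gene clusters and averages them over every combination of n strains. It is not PGAP's Perl implementation and makes no claim about PGAP's internal algorithm; the function names, cluster identifiers and strain labels are hypothetical toy data chosen for the example.

```python
from itertools import combinations


def pan_genome_profile(clusters, strains):
    """Return (pan_size, core_size) for the chosen strains.

    clusters maps a cluster id to the set of strains that carry a member gene.
    """
    chosen = set(strains)
    # Pan-genome: clusters present in at least one chosen strain.
    pan = sum(1 for members in clusters.values() if members & chosen)
    # Core genome: clusters present in every chosen strain.
    core = sum(1 for members in clusters.values() if chosen <= members)
    return pan, core


def profile_curve(clusters, all_strains):
    """Average pan and core sizes over all combinations of n strains."""
    curve = {}
    for n in range(1, len(all_strains) + 1):
        results = [
            pan_genome_profile(clusters, combo)
            for combo in combinations(all_strains, n)
        ]
        mean_pan = sum(p for p, _ in results) / len(results)
        mean_core = sum(c for _, c in results) / len(results)
        curve[n] = (mean_pan, mean_core)
    return curve


# Hypothetical toy input: four gene clusters across three strains.
clusters = {
    "c1": {"A", "B", "C"},
    "c2": {"A", "B"},
    "c3": {"C"},
    "c4": {"A", "B", "C"},
}
strains = ["A", "B", "C"]
curve = profile_curve(clusters, strains)

for n, (mean_pan, mean_core) in curve.items():
    print(n, round(mean_pan, 2), round(mean_core, 2))
```

In this toy input the pan-genome size grows and the core-genome size shrinks as strains are added, which is the trend a pan-genome profile is meant to expose. Enumerating all combinations is only practical for small strain counts; larger datasets would need sampling.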
