Bayesian analysis of gene essentiality based on sequencing of transposon insertion libraries

MOTIVATION: Next-generation sequencing enables efficient analysis of transposon insertion libraries, which can be used to identify essential genes in bacteria. To analyse these high-resolution data, we present a formal Bayesian framework for estimating the posterior probability of essentiality for each gene, using the extreme-value distribution to characterize the statistical significance of the longest region lacking insertions within a gene. We describe a sampling procedure based on the Metropolis-Hastings algorithm to calculate posterior probabilities of essentiality while simultaneously integrating over unknown internal parameters.

RESULTS: Using a sequence dataset from a transposon library for Mycobacterium tuberculosis, we show that this Bayesian approach predicts essential genes that correspond well with genes shown to be essential in previous studies. Furthermore, we show that by using the extreme-value distribution to characterize genomic regions lacking transposon insertions, this method can identify essential domains within genes. This approach can be used for analysing transposon libraries in other organisms and for augmenting essentiality predictions with statistical confidence scores.
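To make the extreme-value idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes that each of `n` candidate sites independently carries an insertion with probability `theta`; the function names `gumbel_tail` and `exact_tail` and all parameter names are hypothetical. The first function uses the standard extreme-value (Gumbel-type) approximation for the longest run of sites lacking insertions; the second computes the same tail probability exactly by dynamic programming so the approximation can be checked.

```python
import math


def gumbel_tail(k, n, theta):
    """Approximate P(longest run of non-insertions >= k) across n sites.

    Assumption (not from the source): sites are independent and each has
    an insertion with probability theta. About n * theta runs start, and
    each reaches length k with probability (1 - theta) ** k.
    """
    q = 1.0 - theta
    return 1.0 - math.exp(-n * theta * q ** k)


def exact_tail(k, n, theta):
    """Exact P(longest run of non-insertions >= k) under the same model."""
    if k <= 0:
        return 1.0
    q = 1.0 - theta
    # state[j] = P(current trailing run has length j and no run of
    # length k has been seen so far)
    state = [0.0] * k
    state[0] = 1.0
    for _ in range(n):
        new = [0.0] * k
        new[0] = theta * sum(state)      # an insertion resets the run
        for j in range(k - 1):
            new[j + 1] = q * state[j]    # a non-insertion extends the run
        # probability q * state[k - 1] reaches length k and is dropped
        state = new
    return 1.0 - sum(state)


if __name__ == "__main__":
    # Hypothetical numbers: 100 sites, half of them hit on average.
    for k in (5, 10, 15):
        print(k, gumbel_tail(k, 100, 0.5), exact_tail(k, 100, 0.5))
```

Under this simplified model, a small tail probability for an observed gap means the gap is unlikely to have arisen by chance in a non-essential region. The full framework described above additionally treats the unknown parameters as random and integrates over them by sampling.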
