Statistical methods for building random transposon mutagenesis libraries.

Constructing a random transposon mutagenesis library raises four statistical problems: (1) computing basic probability results for the number of open reading frames (ORFs) knocked out; (2) estimating how many new ORFs will be knocked out by the next set of clones; (3) estimating the number of essential ORFs; and (4) computing the probability that an ORF is essential given the observed distribution of insertions. This chapter examines these problems and evaluates three approaches to solving them: Efron and Thisted's estimator, Will and Jacobs's parametric bootstrap, and Blades and Broman's Gibbs sampler. It also provides guidance on solving these problems in the R statistical computing environment.
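
As a rough illustration of problem (1), the sketch below computes the expected number of distinct ORFs hit at least once after a given number of insertions. It is not taken from the chapter and is written in Python rather than R. It assumes a simple urn-style model in which insertions are independent, each lands uniformly at random along the genome, and every ORF can tolerate an insertion (no essential genes). The function name `expected_knockouts` and the toy lengths are hypothetical.

```python
def expected_knockouts(orf_lengths, genome_length, n_insertions):
    """Expected number of distinct ORFs hit at least once.

    Assumes independent insertions placed uniformly along the genome,
    so an ORF of length L is hit by a single insertion with
    probability p = L / genome_length.
    """
    total = 0.0
    for length in orf_lengths:
        p = length / genome_length
        # P(ORF is hit at least once) = 1 - P(missed by every insertion)
        total += 1.0 - (1.0 - p) ** n_insertions
    return total


# Toy example: two ORFs that together cover a 1000 bp genome.
toy_lengths = [500, 500]
print(expected_knockouts(toy_lengths, 1000, 2))  # 1.5
```

Under these assumptions the expected count grows quickly at first and then flattens as most nonessential ORFs have already been hit, which is why later sets of clones yield fewer new knockouts.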

[1]  J. W. Campbell, et al. Experimental Determination and System Level Analysis of Essential Genes in Escherichia coli MG1655, 2003, Journal of Bacteriology.

[2]  Finbarr Hayes, et al. Transposon-based strategies for microbial functional genomics and proteomics, 2003, Annual Review of Genetics.

[3]  O. White, et al. Global transposon mutagenesis and a minimal Mycoplasma genome, 1999, Science.

[4]  S. Ehrlich, et al. Essential Bacillus subtilis genes, 2003, Proceedings of the National Academy of Sciences of the United States of America.

[5]  B. Efron, et al. Estimating the number of unseen species: How many words did Shakespeare know?, 1976, Biometrika 63.

[6]  C. Fraser. A genomics-based approach to biodefence preparedness, 2004, Nature Reviews Genetics.

[7]  O. Will, et al. Estimating the Number of Essential Genes in Random Transposon Mutagenesis Libraries, 2006, q-bio/0608005.

[8]  Eric Haugen, et al. Comprehensive transposon mutant library of Pseudomonas aeruginosa, 2003, Proceedings of the National Academy of Sciences of the United States of America.

[9]  Karl W. Broman, et al. A postgenomic method for predicting essential genes at subsaturation levels of mutagenesis: Application to Mycobacterium tuberculosis, 2003, Proceedings of the National Academy of Sciences of the United States of America.

[10]  Andreas Krause, et al. The basics of S-Plus, 2002.

[11]  Ronald W. Davis, et al. Functional profiling of the Saccharomyces cerevisiae genome, 2002, Nature.

[12]  N. L. Johnson, et al. Urn models and their application: an approach to modern discrete probability theory, 1978.