Whole-Genome Comparison of Mycobacterium tuberculosis Clinical and Laboratory Strains

ABSTRACT Virulence and immunity are poorly understood in Mycobacterium tuberculosis. We sequenced the complete genome of the M. tuberculosis clinical strain CDC1551 and performed a whole-genome comparison with the laboratory strain H37Rv in order to identify polymorphic sequences with potential relevance to disease pathogenesis, immunity, and evolution. We found large-sequence and single-nucleotide polymorphisms in numerous genes. Polymorphic loci included a phospholipase C, a membrane lipoprotein, members of an adenylate cyclase gene family, and members of the PE/PPE gene family, some of which have been implicated in virulence or the host immune response. Several gene families, including the PE/PPE gene family, also had significantly higher synonymous and nonsynonymous substitution frequencies than the genome as a whole. We tested a large sample of M. tuberculosis clinical isolates for a subset of the large-sequence and single-nucleotide polymorphisms and found widespread genetic variability at many of these loci. We performed phylogenetic and epidemiological analysis to investigate the evolutionary relationships among isolates and the origins of specific polymorphic loci. A number of these polymorphisms appear to have occurred multiple times as independent events, suggesting that these changes may be under selective pressure. Together, these results demonstrate that polymorphisms among M. tuberculosis strains are more extensive than initially anticipated, and genetic variation may have an important role in disease pathogenesis and immunity.
