An Efficient Test for Comparing Sequence Diversity between Two Populations

We address the problem of comparing interindividual genomic sequence diversity between two populations. Although the methods are general, for concreteness we focus on comparing two populations infected with human immunodeficiency virus (HIV). Suppose that, from one or more viral isolates taken from each individual in a sample of persons from each population, one or more measurements are made on the genetic sequence of a coding region of HIV. Given a definition of genetic distance between sequences, the goal is to test whether the distribution of interindividual distances differs between the populations. If distances between all pairs of sequences within each group are used, then the dependencies in the data that arise from using multiple sequences per individual invalidate a standard two-sample test such as the t-test. Where this problem has been recognized, a typical solution has been to apply a standard test to a reduced dataset composed of one sequence or a consensus sequence from each patient. This procedure has two disadvantages: the conclusion of the test depends on the choice of sequences, which is often arbitrary, and excluding replicate sequences from the analysis may needlessly sacrifice statistical power. We present a new test free of these drawbacks, based on a statistic that linearly combines all possible standard test statistics calculated from independent sequence subsamples. We describe the statistical power advantages of the test and illustrate its use by application to nucleotide sequence distances measured from HIV-1 infected populations in southern Africa (GenBank accession numbers AF110959–AF110981) and North America/Europe. The test makes minimal assumptions, is maximally efficient and objective, and is broadly applicable.
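The idea of combining statistics over subsamples can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a Hamming-type distance on aligned sequences, takes the "standard" statistic to be the difference in mean pairwise distance between the two groups, uses equal weights in the linear combination, and does not attempt the variance estimate needed for an actual test. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations, product
from statistics import mean


def hamming(a, b):
    """Proportion of mismatched positions between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_distance(seqs):
    """Mean distance over all pairs in a one-sequence-per-person subsample."""
    return mean(hamming(a, b) for a, b in combinations(seqs, 2))


def combined_statistic(group1, group2):
    """Equal-weight average of the between-group difference in mean pairwise
    distance, taken over every subsample that uses exactly one sequence
    per individual (an assumed, simplified choice of weights and statistic).
    """
    diffs = []
    for pick1 in product(*group1):      # one sequence from each person, group 1
        for pick2 in product(*group2):  # one sequence from each person, group 2
            diffs.append(mean_pairwise_distance(pick1)
                         - mean_pairwise_distance(pick2))
    return mean(diffs)


# Toy data: each inner list holds the replicate sequences of one individual.
group_a = [["AAAA", "AAAT"], ["AATT"], ["ATTT"]]
group_b = [["AAAA"], ["AAAA", "AAAT"], ["AAAA"]]
print(combined_statistic(group_a, group_b))
```

Because every replicate enters some subsample, no sequence is discarded and the result does not depend on an arbitrary choice of one sequence per person. The number of subsamples grows multiplicatively with the number of replicates, so full enumeration as written is only practical for small datasets.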
