Use of average mutual information for studying changes in HIV populations

Average mutual information (AMI) has been used in a number of applications in bioinformatics. In this paper we present its use to study genetic changes in populations, in particular populations of HIV. Disease progression of HIV-1 infection in infants can be rapid, resulting in death within the first year, or slow, allowing the infant to survive beyond the first year. We study the development of rapidly and slowly progressing HIV populations using AMI charts based on the average mutual information among amino acids in the env gene, computed from a population of 1142 clones derived from seven infants with slowly progressing HIV-1 infection and four infants with rapidly progressing HIV-1 infection. The AMI charts indicate the relative homogeneity of the rapid progressor populations and the much greater heterogeneity of the slow progressor populations, especially in later samples. The charts also show the distinct regions of covariation between residues without the need to align the sequences. By examining the changes in AMI between populations, we can distinguish between clones obtained from rapid progressors and those obtained from slow progressors. A measure of this change can be used to enhance prediction of disease progression.
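As an illustration of the quantity involved, the following is a minimal sketch of a lag-based AMI profile for a single amino acid sequence. It assumes the common definition I(k) = sum over residue pairs (a, b) of p_k(a, b) log2[p_k(a, b) / (p(a) p(b))], where p_k(a, b) is the frequency with which residue a is followed by residue b at a distance of k positions. The function names `average_mutual_information` and `ami_profile` are hypothetical, and the exact construction of the AMI charts in the paper may differ from this sketch.

```python
from collections import Counter
from math import log2


def average_mutual_information(seq, lag):
    """AMI in bits between residues separated by `lag` positions in `seq`.

    Assumes the standard lag-based definition; probabilities are estimated
    from the observed pairs, so short sequences give noisy estimates.
    """
    # All (residue at i, residue at i + lag) pairs in the sequence.
    pairs = [(seq[i], seq[i + lag]) for i in range(len(seq) - lag)]
    if not pairs:
        return 0.0
    n = len(pairs)
    joint = Counter(pairs)                 # counts of each (a, b) pair
    left = Counter(a for a, _ in pairs)    # marginal counts, first residue
    right = Counter(b for _, b in pairs)   # marginal counts, second residue
    # p(a,b) / (p(a) p(b)) simplifies to c * n / (left[a] * right[b]).
    return sum(
        (c / n) * log2(c * n / (left[a] * right[b]))
        for (a, b), c in joint.items()
    )


def ami_profile(seq, max_lag):
    """AMI values for lags 1..max_lag; needs no alignment to other sequences."""
    return [average_mutual_information(seq, k) for k in range(1, max_lag + 1)]
```

Because the profile is computed from each sequence on its own, profiles from different clones can be compared directly, which is consistent with the alignment-free property noted above. Averaging or differencing such profiles across clones is one possible way to summarize a population, though that step is not specified here.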