A modified false discovery rate multiple‐comparisons procedure for discrete data, applied to human immunodeficiency virus genetics

To help design vaccines for acquired immune deficiency syndrome that protect broadly against many genetic variants of the human immunodeficiency virus (HIV), the mutation rates at 118 positions in HIV amino-acid sequences of subtype C versus those of subtype B were compared. The false discovery rate (FDR) multiple-comparisons procedure can be used to determine statistical significance. When the test statistics have discrete distributions, the FDR procedure can be made more powerful by a simple modification. The paper develops a modified FDR procedure for discrete data and applies it to the HIV data. The new procedure detects 15 positions with significantly different mutation rates, compared with 11 detected by the original FDR method. Simulations delineate conditions under which the modified FDR procedure confers large gains in power over the original technique. In general, FDR adjustment methods can be improved for discrete data by incorporating the proposed modification. Copyright 2005 Royal Statistical Society.
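
The abstract does not spell out the modification, so the following is only a minimal sketch of how such a procedure could look, under stated assumptions. `benjamini_hochberg` implements the standard step-up FDR procedure. `discrete_filtered_bh` is a hypothetical helper that assumes a Tarone-style filter: tests whose minimum achievable p-value is too large to ever reach significance are dropped before the FDR step, which shrinks the multiplicity penalty. The function names, the `min_pvalues` argument, the exact filtering rule, and the toy numbers are all illustrative assumptions, not the paper's confirmed algorithm or data.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float], alpha: float = 0.05) -> List[int]:
    """Return the indices rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    if m == 0:
        return []
    # Indices ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Largest 1-based rank k with p_(k) <= k * alpha / m (0 if none).
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            cutoff = rank
    return sorted(order[:cutoff])


def discrete_filtered_bh(
    pvalues: Sequence[float],
    min_pvalues: Sequence[float],
    alpha: float = 0.05,
) -> List[int]:
    """Assumed Tarone-style filter followed by Benjamini-Hochberg.

    min_pvalues[i] is the smallest p-value that test i could ever attain
    given its discrete distribution (for example, fixed table margins).
    """
    m = len(pvalues)
    retained = list(range(m))
    # Assumption: take the smallest k such that at most k tests have a
    # minimum achievable p-value below alpha / k, and keep only those tests.
    for k in range(1, m + 1):
        candidates = [i for i in range(m) if min_pvalues[i] < alpha / k]
        if len(candidates) <= k:
            retained = candidates
            break
    # Run the ordinary FDR step on the retained tests only.
    local = benjamini_hochberg([pvalues[i] for i in retained], alpha)
    return [retained[j] for j in local]


# Toy inputs (made up for illustration, not the HIV data).
toy_p = [0.001, 0.02, 0.03, 0.6, 1.0, 1.0, 1.0, 1.0]
toy_min_p = [0.0001, 0.001, 0.002, 0.01, 0.5, 1.0, 1.0, 1.0]
original = benjamini_hochberg(toy_p)
modified = discrete_filtered_bh(toy_p, toy_min_p)
```

On these toy inputs the unfiltered procedure rejects one hypothesis, while the filtered version rejects three, because the four tests that can never be significant no longer inflate the number of comparisons. This mirrors, in miniature, the kind of power gain the abstract reports.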
