A Sliding Window-Based Method to Detect Selective Constraints in Protein-Coding Genes and Its Application to RNA Viruses

Here we present a new sliding window-based method designed to detect selective constraints in specific regions of a multiple alignment of protein-coding sequences. In contrast to previous window-based procedures, our method uses a nonarbitrary statistical approach to find the appropriate codon-window size for testing deviations of synonymous (dS) and nonsynonymous (dN) nucleotide substitutions from the expectation. The probabilities of dN and dS are obtained from simulated data and used to detect significant deviations of dN and dS in a specific window region of the real sequence alignment. The nonsynonymous-to-synonymous rate ratio (w = dN/dS) is used to highlight selective constraints in any window wherein dS or dN is significantly different from the expectation. In these significant windows, w and its variance [V(w)] are calculated and used to test the neutral hypothesis. Computer simulations showed that the method is accurate even for highly divergent sequences. The main advantages of the new method are that it (i) uses a statistically appropriate window size to detect different selective patterns, (ii) is computationally less intensive than maximum likelihood methods, and (iii) detects saturation of synonymous sites, which can give rise to deviations from neutrality. Hence, it allows the analysis of highly divergent sequences as well as the testing of different alternative hypotheses. The application of the method to different genes of human immunodeficiency virus type 1 and of foot-and-mouth disease virus confirms the action of positive selection on previously described regions as well as on new regions.
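The window-scanning step can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that per-codon substitution counts are already available for the real alignment and for a set of simulated alignments, and the function names (`window_sums`, `tail_probability`, `significant_windows`), the fixed window size, and the two-sided empirical probability are all hypothetical choices made for illustration. The statistical selection of the window size and the calculation of w and V(w) are not shown.

```python
def window_sums(counts, size):
    """Sum per-codon counts over every contiguous window of `size` codons."""
    if size < 1 or size > len(counts):
        raise ValueError("window size must be between 1 and len(counts)")
    total = sum(counts[:size])
    sums = [total]
    # Slide the window one codon at a time, updating the running total.
    for i in range(size, len(counts)):
        total += counts[i] - counts[i - size]
        sums.append(total)
    return sums


def tail_probability(observed, null_values):
    """Two-sided empirical probability of `observed` under the simulated values."""
    n = len(null_values)
    upper = sum(v >= observed for v in null_values) / n
    lower = sum(v <= observed for v in null_values) / n
    return min(1.0, 2.0 * min(upper, lower))


def significant_windows(real_counts, simulated_counts, size, alpha=0.05):
    """Return (start_codon, window_sum, p) for windows deviating from expectation.

    `simulated_counts` is a list of per-codon count lists, one per simulated
    alignment (an assumed input format, not taken from the paper).
    """
    # Pool the window sums of all simulated alignments into one null distribution.
    null = [s for sim in simulated_counts for s in window_sums(sim, size)]
    hits = []
    for start, observed in enumerate(window_sums(real_counts, size)):
        p = tail_probability(observed, null)
        if p < alpha:
            hits.append((start, observed, p))
    return hits


if __name__ == "__main__":
    # Toy data: a burst of substitutions at codons 10-12 of a 23-codon gene.
    real = [0] * 10 + [3] * 3 + [0] * 10
    # Toy "simulations": one substitution every fourth codon, shifted per replicate.
    simulated = [[1 if (i + r) % 4 == 0 else 0 for i in range(23)] for r in range(20)]
    for start, observed, p in significant_windows(real, simulated, size=3):
        print(f"window starting at codon {start}: sum={observed}, p={p:.3f}")
```

In this sketch the same scan would be run separately on the nonsynonymous and synonymous counts, and only the windows flagged in either scan would be carried forward to the w-based test of neutrality.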
