Detecting Positively Selected Amino Acid Sites Using Posterior Predictive P-Values

Identifying positively selected amino acid sites is an important approach for making inferences about protein function: an amino acid site undergoing positive selection is likely to play a key role in the function of the protein. We present a new Bayesian method for identifying positively selected amino acid sites and apply it to a data set of hemagglutinin sequences from the influenza virus. We show that the results of the new method agree with those obtained using previous methods. More importantly, we demonstrate how the method can be used to make further inferences about the evolutionary history of the sequences. For example, we demonstrate that positively selected sites tend to have a preponderance of conservative amino acid substitutions.