The HIV Mutation Browser: A Resource for Human Immunodeficiency Virus Mutagenesis and Polymorphism Data

A huge research effort has been invested over many years to determine the phenotypes of natural or artificial mutations in HIV proteins; the interpretation of mutation phenotypes is an invaluable source of new knowledge. The results of this effort are recorded in the scientific literature, but it is difficult for virologists to find them rapidly. Manually locating data on phenotypic variation within the approximately 270,000 available HIV-related research articles, or the further 1,500 articles that are published each month, is a daunting task. Accordingly, the HIV research community would benefit from a resource cataloguing the available HIV mutation literature. We have applied computational text-mining techniques to parse and map mutagenesis and polymorphism information from the HIV literature, enriched the data with ancillary information, and developed a public, web-based interface through which the data can be intuitively explored: the HIV mutation browser. The current release of the HIV mutation browser describes the phenotypes of 7,608 unique mutations at 2,520 sites in the HIV proteome, resulting from the analysis of 120,899 papers. The mutation information for each protein is organised in a residue-centric manner, and each residue is linked to the relevant experimental literature. The importance of HIV as a global health burden warrants extensive effort to maximise the efficiency of HIV research, and the HIV mutation browser provides a valuable new resource for the research community. The HIV mutation browser is available at: http://hivmut.org.
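
To make the idea of parsing mutation mentions and organising them in a residue-centric manner concrete, the following is a minimal sketch only. It is not the pipeline behind the HIV mutation browser: the regular expression, the function names `extract_mutations` and `group_by_residue`, and the example sentence are all illustrative assumptions. The sketch recognises only the simplest point-mutation notation (wild-type residue, position, mutant residue, as in K103N) and does not map positions onto a reference protein sequence.

```python
import re
from collections import defaultdict

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical pattern for mentions such as "K103N":
# wild-type residue, 1-4 digit position, mutant residue.
MUTATION_PATTERN = re.compile(
    rf"\b([{AMINO_ACIDS}])(\d{{1,4}})([{AMINO_ACIDS}])\b"
)


def extract_mutations(text):
    """Return (wild_type, position, mutant) tuples in order of appearance."""
    return [
        (m.group(1), int(m.group(2)), m.group(3))
        for m in MUTATION_PATTERN.finditer(text)
    ]


def group_by_residue(mutations):
    """Group mutation labels by residue position (a residue-centric view)."""
    sites = defaultdict(set)
    for wild_type, position, mutant in mutations:
        sites[position].add(f"{wild_type}{position}{mutant}")
    return dict(sites)


if __name__ == "__main__":
    # Made-up example sentence, used for illustration only.
    sentence = "The K103N and K103R substitutions and M184V were analysed."
    found = extract_mutations(sentence)
    print(found)
    print(group_by_residue(found))
```

A pattern this simple will also match strings that are not mutations (for example, cell line or strain names with the same shape), so a real system would need to validate each candidate against the wild-type residue at that position in the relevant protein sequence.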
