HLA-VBSeq: accurate HLA typing at full resolution from whole-genome sequencing data

Background: Human leucocyte antigen (HLA) genes play an important role in determining the outcome of organ transplantation and are linked to many human diseases. Because HLA loci are highly diverse and polymorphic, HLA typing at high resolution is challenging even with whole-genome sequencing data.

Results: We have developed a computational tool, HLA-VBSeq, to estimate the most probable HLA alleles at full (8-digit) resolution from whole-genome sequence data. HLA-VBSeq simultaneously optimizes the alignment of reads to HLA allele sequences and the abundance of reads on HLA alleles by variational Bayesian inference. We demonstrate the effectiveness of the proposed method relative to other methods by predicting HLA types for HLA class I (HLA-A, -B and -C) and class II (HLA-DQA1, -DQB1 and -DRB1) loci from simulated data at various depths of coverage and from real sequencing data of human trio samples.

Conclusions: HLA-VBSeq is an efficient and accurate HLA typing method that uses high-throughput sequencing data without the need for primer design for HLA loci. Moreover, it does not assume any prior knowledge of HLA allele frequencies, and hence HLA-VBSeq is broadly applicable to human samples obtained from a genetically diverse population.
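
To make the idea of estimating read abundance on alleles by variational Bayesian inference more concrete, the following is a minimal sketch and not the HLA-VBSeq implementation. It assumes a standard mixture model with a symmetric Dirichlet prior over allele proportions, in which each read carries a likelihood for every allele it aligns to. The function names (`digamma`, `vb_allele_abundance`), the prior value `alpha0`, and the toy input are all hypothetical and chosen only for illustration.

```python
import math


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def vb_allele_abundance(read_likelihoods, n_alleles, alpha0=0.1, n_iter=200):
    """Variational Bayes for mixture proportions (illustrative assumption).

    read_likelihoods: one dict per read, mapping allele index to the
    likelihood of that read under an alignment to that allele.
    Returns the posterior mean proportion of reads per allele.
    """
    reads = [r for r in read_likelihoods if r]  # skip unaligned reads
    alpha = [alpha0 + len(reads) / n_alleles] * n_alleles
    for _ in range(n_iter):
        # Expected log-proportions under the current Dirichlet posterior.
        psi_sum = digamma(sum(alpha))
        weights = [math.exp(digamma(a) - psi_sum) for a in alpha]
        # Soft-assign each read to the alleles it aligns to.
        counts = [0.0] * n_alleles
        for lik in reads:
            resp = {k: l * weights[k] for k, l in lik.items()}
            total = sum(resp.values())
            for k, v in resp.items():
                counts[k] += v / total
        # Update the Dirichlet parameters with the expected read counts.
        alpha = [alpha0 + c for c in counts]
    norm = sum(alpha)
    return [a / norm for a in alpha]


# Toy input: allele 0 has unique reads plus reads shared with allele 1,
# allele 1 has no unique support, allele 2 has only unique reads.
toy_reads = [{0: 1.0}] * 10 + [{0: 1.0, 1: 1.0}] * 10 + [{2: 1.0}] * 10
theta = vb_allele_abundance(toy_reads, n_alleles=3)
print([round(t, 3) for t in theta])
```

In this toy input, the reads shared between alleles 0 and 1 are reassigned over the iterations to allele 0, which also has unique support, so allele 1 ends up with a proportion near zero. A prior value below 1 favours such sparse solutions, which is a common reason to prefer this kind of model when many candidate alleles share near-identical sequence.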
