VDJbase: an adaptive immune receptor genotype and haplotype database

Abstract

VDJbase is a publicly available database that offers easy searching of data describing the complete sets of gene sequences (genotypes and haplotypes) inferred from adaptive immune receptor repertoire sequencing datasets. It is designed as a resource that allows the scientific community to explore the genetic variability of the immunoglobulin (Ig) and T cell receptor (TR) gene loci, and it can also assist in investigating Ig- and TR-related genetic predispositions to disease. The database includes web-based query and online tools for visualizing and analyzing the genotype and haplotype data. These tools enable users to detect alleles and genes that are significantly over-represented in a particular population in terms of genotype, haplotype and gene expression. The website is freely accessible at https://www.vdjbase.org/, and no login is required. The data and code are released under Creative Commons licenses and are freely downloadable from https://bitbucket.org/account/user/yaarilab/projects/GPHP.
