The ability to generate a specific and long-lived antibody response is a key element of acquired immunity and is a necessary component for the prevention or resolution of disease caused by most viruses [1]. Specificity of antibody responses for particular pathogens is achieved by the development of a diverse repertoire of recombined antibody variable genes that encode antibodies that can recognize an enormous number of potential epitopes. Diversity in the antigen combining site of the B cell receptor repertoire (and thus also in the corresponding secreted antibody repertoire) is mediated by three principal mechanisms that are illustrated in Figure 1: (1) random pairing of heavy and light chains to form the antigen-binding site in the immunoglobulin molecule; (2) combinatorial diversity generated by V(D)J recombination, which together with heavy and light chain pairing results in approximately 2.3 × 10⁶ different possible combinations; (3) junctional diversity generated by P- and N-nucleotide addition or deletion at recombination sites during V(D)J processing by isoforms of the enzyme terminal deoxynucleotidyl transferase (TdT), which theoretically results in 10¹¹ different antibody specificities [2]. Somatic hypermutation, a fourth mechanism of diversification, introduces point mutations into the rearranged immunoglobulin variable domain after B cell activation. Additional functional diversity in secreted antibodies is conferred by differences between isotypes after class switching, since the Fc region of immunoglobulins determines the valency of the antibody combining sites and many functions, such as complement fixation and interaction with various Fc receptors or the polyimmunoglobulin receptor. Following diversification of the repertoire, longevity of particular B cells is mediated by complex regulatory functions.
Figure 1
Diversity in the antigen-combining site of the B cell receptor repertoire (and thus also in the corresponding secreted antibody repertoire) is mediated by three principal molecular mechanisms, illustrated in the three panels, left, middle, and right.
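The combinatorial arithmetic behind V(D)J recombination and chain pairing can be sketched in a few lines. The segment counts below are illustrative assumptions and are not taken from this article; published counts of functional segments vary with haplotype and annotation, so the result should be read as an order-of-magnitude estimate only.

```python
from math import prod

# Illustrative counts of functional gene segments per locus.
# These numbers are assumptions for the example, not values from the text.
SEGMENTS = {
    "heavy": {"V": 40, "D": 25, "J": 6},
    "kappa": {"V": 40, "J": 5},
    "lambda": {"V": 30, "J": 4},
}


def locus_combinations(segments):
    """Number of distinct segment combinations for one locus."""
    return prod(segments.values())


def paired_combinations(loci):
    """Heavy-chain combinations times the pooled kappa and lambda combinations."""
    heavy = locus_combinations(loci["heavy"])
    light = locus_combinations(loci["kappa"]) + locus_combinations(loci["lambda"])
    return heavy * light


total = paired_combinations(SEGMENTS)
print(f"{total:,} possible heavy/light combinations")
```

With these placeholder counts the product is 1,920,000, the same order of magnitude as the approximately 2.3 × 10⁶ combinations quoted in the text. The exact figure depends on which segments are counted as functional, and junctional diversity is not included in this calculation.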
In years past, immunologists understood diversification of B cell populations specific to particular foreign antigens to involve a burst of diversification within a clone of B cells in the activated germinal center, followed by selection for survival of the highest-affinity clone and drastic loss of related somatic variants with lower affinity. Although this “single winner” model did correctly describe the typical panel of B cell clones isolated from experimental studies using isolation of hybridomas and monoclonal antibodies (mAbs), the technical approach to isolation of mAbs likely biased such studies toward the isolation of only the most avidly binding antibodies. Emerging techniques using high-throughput DNA and RNA sequence analysis are increasingly revealing that this paradigm is not correct, and instead human B cell repertoires maintain very large populations of somatic variants within clones [3]; see Figure 2. It may seem metabolically wasteful and counter-intuitive that the immune system would allow hundreds or thousands of related clones to persist in circulation when many of those variants possess many fewer somatic mutations than the most mature clones, and thus by inference likely have lower affinity of binding for the inciting epitope. There may be method in this madness, however, if persisting diversity in the B cell repertoire allows the subject to respond to antigenic variation in the target, such as antigenic drift in acute infections like influenza or persistent escape by point mutations during chronic infections with viruses like HIV-1 or hepatitis C. Dealing with the enormous sequence and structural plasticity of the protective antigens of these viruses (such as influenza hemagglutinin, HIV-1 gp140, or hepatitis C virus envelope protein) likely requires an equivalent breadth of diversity of antigen combining sites in the responding B cell population.
Therefore, recent observations that human B cell repertoires engage pathogens with large clonal families of highly related combining sites, which we have termed “antibody swarms”, make sense from a strategic standpoint for the immune system. Studying the diverse antibody response to antigen as a swarming population instead of as a one-to-one, specific interaction informs our understanding of disease and immunity in a new way. In recent years, key studies have leveraged new technological advances in gene sequencing and microfluidics to provide evidence regarding the mechanisms of repertoire diversification, the size of the antibody repertoire and methods of repertoire regulation shared by different individuals. These studies are the foundation upon which further applications will be developed.
Figure 2
[A] Classical models of somatic hypermutation conceive of rapid generation of variants in the activated germinal center followed by a severe down-selection of number of variants, resulting in selection of only the clones with the most avidly binding B ...
Sequencing the antibody variable gene repertoire
Many next-generation sequencing techniques are available today; specifications for three of the most commonly used current techniques are detailed in Table 1. No doubt the capabilities and proprietary formats of these types of technologies will continue to evolve rapidly. These methods can be used to determine the sequence of recombined antibody variable genes amplified from primary cell or tissue samples, generating large sequence databases. Recombined genes can be amplified by PCR from genomic DNA, or transcribed genes can be amplified by PCR from cDNA made from mRNA by reverse transcription. The resulting amplicons are sequenced with high-throughput amplicon DNA sequencing technologies and then analyzed with specialized antibody variable gene analysis software. Several web-based software approaches to antibody gene analysis are available, such as IMGT V-QUEST, SoDA and JOINSOLVER, which identify the inferred V, D and J gene segments used during recombination and resolve P- and N-nucleotides, providing robust data for further study [4–6].
Table 1
Characteristics of three of the most commonly used current next-generation sequencing techniques
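To make the gene segment assignment step concrete, the toy sketch below assigns a read to the germline segment with the highest ungapped identity. It is not the algorithm used by IMGT V-QUEST, SoDA or JOINSOLVER, which rely on proper sequence alignment and statistical models; the segment names and sequences are made-up placeholders, and the function names are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions over the shared length of two sequences."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def assign_segment(read, germline):
    """Return (name, identity) for the germline segment that best matches the read."""
    return max(
        ((name, identity(read, seq)) for name, seq in germline.items()),
        key=lambda pair: pair[1],
    )


# Made-up placeholder sequences, not real germline alleles.
TOY_V_GERMLINE = {
    "V_A": "CAGGTGCAGCTGGTG",
    "V_B": "GAGGTTCAGCTCTTG",
}

# V_A with one substitution, followed by downstream bases.
read = "CAGGTGCAGCTCGTGCAGTCT"
name, score = assign_segment(read, TOY_V_GERMLINE)
print(name, round(score, 3))
```

In this toy case the mismatch against the best germline match is the kind of difference that, in real data, would be interpreted as a candidate somatic mutation or a sequencing error.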
Antibody heavy and light chain pairing is an important aspect of the diversification of the antibody repertoire, and it has been shown that antibody heavy chains are capable of pairing with many light chains [7]. Therefore, identifying the correct heavy and light chain pairing partners during repertoire sequencing will be of critical importance to future efforts to understand repertoire diversity. Currently, technical limitations prevent large-scale sequence analysis of naturally paired heavy and light chain genes. Two principal approaches are being pursued to pair heavy and light chain genes on a massive scale. The first approach aims to pair the heavy and light chain sequences from separately sequenced repertoires using informatic approximations, while the other approach aims to link the sequences during variable gene amplification by PCR, followed by sequence analysis of both chains in one amplicon.
Indexed sequencing protocols can be readily applied to barcode both the heavy and light chain sequences from a single sample, after which the heavy and light chain sequences can be paired. One study paired heavy and light chain variable gene sequences according to their relative frequencies within the repertoire, with a majority (21/27, or 78%) of the pairings tested generating antigen-specific antibodies [8]. A second study found that heavy and light chain pairs could often be identified using an evolution-based analysis, wherein coevolution of the heavy and light chains resulted in correlations between both the frequency and topology of the corresponding phylogenetic tree branches [7]. In either case, although these techniques may allow isolation of antigen-binding antibodies, they do not assuredly retain the endogenous heavy and light chain gene pairing information.
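The frequency-based pairing idea can be illustrated with a minimal sketch. It assumes the simplest possible reading of "pairing according to relative frequency": the most abundant heavy chain is matched with the most abundant light chain, the second with the second, and so on. The published procedure may differ in its details, and the read labels below are placeholders.

```python
from collections import Counter


def rank_by_frequency(sequences):
    """Return unique sequences ordered from most to least abundant."""
    return [seq for seq, _ in Counter(sequences).most_common()]


def pair_by_rank(heavy_reads, light_reads):
    """Pair heavy and light chains that share the same abundance rank."""
    return list(zip(rank_by_frequency(heavy_reads), rank_by_frequency(light_reads)))


# Placeholder labels standing in for variable gene sequences.
heavy_reads = ["H1"] * 5 + ["H2"] * 3 + ["H3"]
light_reads = ["L2"] * 3 + ["L1"] * 6 + ["L3"]

pairs = pair_by_rank(heavy_reads, light_reads)
print(pairs)
```

Rank matching of this kind yields candidate pairs for expression and testing. As noted above, it cannot guarantee that a candidate pair corresponds to the pairing found in the original B cell, and ties in abundance make the ranking arbitrary.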
Recently, techniques were developed to retain the endogenous pairing information by linking the heavy and light chains during gene amplification [9,10]. In one study, single B cells were lysed in isolation using a high-density microwell plate, after which mRNA transcripts were captured on magnetic beads for emulsion PCR amplification with linking primers [9]. This process joined the heavy and light chain complementarity determining region 3 (CDRH3 and CDRL3, respectively) sequences together into one amplicon for next-generation sequencing. A similar technique was employed by another study, which used advances in microfluidics to successfully accomplish on-chip single-cell RT-qPCR [10]. While published results are limited to 300 single-cell RT-qPCR measurements per run, the success of this protocol suggests that the chip could be scaled up to more than 1,000 measurements per chip. While these techniques likely highlight the future of antibody repertoire studies, the current read lengths of next-generation sequencing limit the application to only CDRH3:CDRL3 paired sequences. Longer read lengths will be required to identify full-length antibody variable gene sequences that can be used to synthesize cDNA encoding the native sequence of the original antibody including all six CDRs.
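Once heavy and light chain regions are physically linked in one amplicon, recovering the native pairs becomes a simple parsing task. The sketch below assumes that each read has the form heavy region, linker, light region. The linker sequence, the read layout and the function names are hypothetical and chosen only for illustration; real protocols define their own primer and linker designs.

```python
from collections import Counter

# Hypothetical linker sequence, used only for this example.
LINKER = "GGATCCGGTACC"


def split_linked_read(read, linker=LINKER):
    """Split a linked amplicon into (heavy, light); return None if it cannot be split cleanly."""
    if read.count(linker) != 1:
        return None  # linker missing or ambiguous
    heavy, light = read.split(linker)
    if not heavy or not light:
        return None  # one side of the linker is empty
    return heavy, light


def count_pairs(reads):
    """Count how often each (heavy, light) pair is observed across reads."""
    pairs = (split_linked_read(r) for r in reads)
    return Counter(p for p in pairs if p is not None)


# Placeholder reads: short stand-ins for CDRH3 and CDRL3 sequences.
reads = [
    "AAA" + LINKER + "CCC",
    "AAA" + LINKER + "CCC",
    "TTT" + LINKER + "GGG",
    "AAACCC",  # no linker, discarded
]
pair_counts = count_pairs(reads)
print(pair_counts.most_common())
```

Counting repeated observations of the same pair gives a rough measure of support for each pairing, which can help separate genuine pairs from artifacts.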
[1] George Georgiou, et al. High-throughput sequencing of the paired human immunoglobulin heavy and light chain repertoire. Nature Biotechnology, 2013.
[2] M. Egholm, et al. Measurement and Clinical Monitoring of Human Lymphocyte Clonality by Massively Parallel V-D-J Pyrosequencing. Science Translational Medicine, 2009.
[3] James E Crowe, et al. Epitope-Specific Human Influenza Antibody Repertoires Diversify by B Cell Intraclonal Sequence Divergence and Interclonal Convergence. The Journal of Immunology, 2011.
[4] R. Ahmed, et al. Immunological memory in humans. Seminars in Immunology, 2004.
[5] Jérôme Lane, et al. IMGT®, the international ImMunoGeneTics information system®. Nucleic Acids Res., 2004.
[6] Seung Hyun Kang, et al. Monoclonal antibodies isolated without screening by analyzing the variable-gene repertoire of plasma cells. Nature Biotechnology, 2010.
[7] Samuel Aparicio, et al. High-throughput microfluidic single-cell RT-qPCR. Proceedings of the National Academy of Sciences, 2011.
[8] Baoshan Zhang, et al. Mining the antibodyome for HIV-1–neutralizing antibodies with next-generation sequencing and phylogenetic pairing of heavy/light chains. Proceedings of the National Academy of Sciences, 2013.
[9] Thomas B. Kepler, et al. SoDA2: a Hidden Markov Model approach for identification of immunoglobulin rearrangements. Bioinform., 2010.
[10] Gérard Lefranc, et al. The Immunoglobulin FactsBook. 2001.
[11] Jordan R. Willis, et al. Frequency and genetic characterization of V(DD)J recombinants in the human peripheral blood antibody repertoire. Immunology, 2012.
[12] Samuel L. DeLuca, et al. Human Germline Antibody Gene Segments Encode Polyspecific Antibodies. PLoS Comput. Biol., 2013.
[13] B A McKinney, et al. High-throughput antibody sequencing reveals genetic evidence of global regulation of the naïve and memory repertoires that extends across individuals. Genes and Immunity, 2012.
[14] James E. Crowe, et al. Human Peripheral Blood Antibodies with Long HCDR3s Are Established Primarily at Original Recombination Using a Limited Subset of Germline Genes. PLoS ONE, 2012.
[15] Vincent Lombard, et al. Genome sequence of the model mushroom Schizophyllum commune. Nature Biotechnology, 2010.
[16] James E. Crowe, et al. Location and length distribution of somatic hypermutation-associated DNA insertions and deletions reveals regions of antibody structural plasticity. Genes and Immunity, 2012.
[17] C. Nusbaum, et al. High-Resolution Description of Antibody Heavy-Chain Repertoires in Humans. PLoS ONE, 2011.
[18] Michael W. McCormick, et al. Shaping of Human Germline IgH Repertoires Revealed by Deep Sequencing. The Journal of Immunology, 2012.
[19] P. Lipsky, et al. Characterization of the Human Ig Heavy Chain Antigen Binding Complementarity Determining Region 3 Using a Newly Developed Software Algorithm, JOINSOLVER. The Journal of Immunology, 2004.
[20] Sean A Beausoleil, et al. A proteomics approach for the identification and cloning of monoclonal antibodies from serum. Nature Biotechnology, 2012.