Higher Order Restrictions of the Immunoglobulin Repertoire in CLL: The Illustrative Case of Stereotyped Subsets #2 and #169

Stereotyped subset #2 (IGHV3-21/IGLV3-21) is the largest subset in CLL (~3% of all patients). Membership in subset #2 is clinically relevant since these patients experience an aggressive disease irrespective of the somatic hypermutation (SHM) status of the clonotypic immunoglobulin heavy variable (IGHV) gene. Low-throughput evidence suggests that stereotyped subset #169, a minor CLL subset (~0.2% of all CLL), resembles subset #2 at the immunogenetic level. More specifically: (i) the clonotypic heavy chain (HC) of subset #169 is encoded by the IGHV3-48 gene, which is closely related to the IGHV3-21 gene; (ii) both subsets carry VH CDR3s comprising 9 amino acids (aa) with a conserved aspartic acid (D) at VH CDR3 position 3; (iii) both subsets bear light chains (LC) encoded by the IGLV3-21 gene with a restricted VL CDR3; and (iv) both subsets have borderline SHM status.

Here we comprehensively assessed the ontogenetic relationship between CLL subsets #2 and #169 by analyzing their immunogenetic signatures. Utilizing next-generation sequencing (NGS), we studied the HC and LC gene rearrangements of 6 subset #169 patients and 20 subset #2 cases. In brief, IGHV-IGHD-IGHJ and IGLV-IGLJ gene rearrangements were amplified by RT-PCR using subgroup-specific leader primers as well as IGHJ and IGLC primers, respectively. Libraries were sequenced on the Illumina MiSeq instrument. IG sequence annotation was performed with IMGT/HighV-QUEST, and metadata analysis was conducted using an in-house, validated bioinformatics pipeline. Rearrangements with identical CDR3 aa sequences were herein defined as clonotypes, whereas clonotypes with different aa substitutions within the V-domain were defined as subclones.

For the HC analysis of subset #169, we obtained 894,849 productive sequences (mean: 127,836, range: 87,509-208,019).
On average, each analyzed sample carried 54 clonotypes (range: 44-68); the dominant clonotype had a mean frequency of 99.1% (range: 98.8-99.2%) and displayed considerable intraclonal heterogeneity with a mean of 2,641 subclones/sample (range: 1,566-6,533). For the LCs of subset #169, we obtained 2,096,728 productive sequences (mean: 299,533, range: 186,637-389,258). LCs carried a higher number of distinct clonotypes/sample compared to their partner HCs (mean: 148, range: 110-205); the dominant clonotype had a mean frequency of 98.1% (range: 97.2-98.6%). Intraclonal heterogeneity was also observed in the LCs, with a mean of 6,325 subclones/sample (range: 4,651-11,444), hence more pronounced than in their partner HCs. Viewing each of the cumulative VH and VL CDR3 sequence datasets as a single entity branching through diversification enabled the identification of common sequences. In particular, 2 VH clonotypes were present in 3/6 cases, while a single VL clonotype was present in all 6 cases, albeit at varying frequencies; interestingly, this VL CDR3 sequence was also detected in all subset #2 cases, underscoring the molecular similarities between the two subsets. 
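
The clonotype and subclone definitions used above, and the search for clonotypes shared between cases, can be illustrated with a minimal Python sketch. This is not the in-house pipeline, which is not described in detail here. The function names (`summarize_sample`, `shared_clonotypes`), the input layout (one `(cdr3_aa, v_domain_aa)` pair per productive read), and the toy data are all assumptions made for illustration. Counting subclones as the distinct V-domain aa sequences within the dominant clonotype is likewise one possible reading of the definition.

```python
from collections import Counter, defaultdict


def summarize_sample(reads):
    """Summarize one sample.

    reads: list of (cdr3_aa, v_domain_aa) pairs, one per productive read
    (hypothetical input layout, not the authors' format).
    """
    clonotype_counts = Counter()      # reads per CDR3 aa sequence (clonotype)
    variants = defaultdict(set)       # distinct V-domain aa sequences per clonotype
    for cdr3_aa, v_domain_aa in reads:
        clonotype_counts[cdr3_aa] += 1
        variants[cdr3_aa].add(v_domain_aa)

    total = sum(clonotype_counts.values())
    dominant, dominant_reads = clonotype_counts.most_common(1)[0]
    return {
        "n_clonotypes": len(clonotype_counts),
        "dominant_cdr3": dominant,
        "dominant_frequency": dominant_reads / total,
        # Assumption: subclones = distinct V-domain variants of the dominant clonotype
        "n_subclones_dominant": len(variants[dominant]),
    }


def shared_clonotypes(samples, min_samples):
    """Return {cdr3_aa: number of samples} for clonotypes seen in >= min_samples samples.

    samples: dict mapping a sample id to its list of reads.
    """
    seen_in = Counter()
    for reads in samples.values():
        for cdr3_aa in {cdr3 for cdr3, _ in reads}:  # count each sample once
            seen_in[cdr3_aa] += 1
    return {cdr3: n for cdr3, n in seen_in.items() if n >= min_samples}


# Toy data: opaque placeholder labels, not real sequences from the study.
toy_samples = {
    "P1": [("CDR3-a", "V-1")] * 8 + [("CDR3-a", "V-2")] + [("CDR3-b", "V-3")],
    "P2": [("CDR3-a", "V-1")] * 9 + [("CDR3-c", "V-4")],
}

summary_p1 = summarize_sample(toy_samples["P1"])
shared = shared_clonotypes(toy_samples, min_samples=2)
```

In this toy example, `summary_p1` reports 2 clonotypes with a dominant clonotype frequency of 0.9 and 2 subclones, and `shared` contains only the clonotype present in both samples.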
Focusing on SHM, the following observations were made: (i) the frequent 3-nucleotide (AGT) deletion evidenced in the VH CDR2 of subset #2 (leading to the deletion of one of 5 consecutive serine residues) was also detected in all subset #169 cases at the subclonal level (average: 6% per sample, range: 0.1-10.8%); of note, the 5-serine stretch is also present in the germline VH CDR2 of the IGHV3-48 gene; (ii) the R-to-G substitution at the VL-CL linker, a ubiquitous SHM in subset #2 and previously reported as critical for IG self-association leading to cell-autonomous signaling in this subset, was present in all subset #169 samples as a clonal event with a mean frequency of 98.3%; and, finally, (iii) the S-to-G substitution at position 6 of the VL CDR3, present in all subset #2 cases (mean: 44.2%, range: 6.3-87%), was also found in all subset #169 samples, representing a clonal event in 1 case (97.2% of all clonotypes) and a subclonal event in the remaining 5 cases (mean: 0.6%, range: 0.4-1.1%).

In conclusion, the present high-throughput sequencing data cement the immunogenetic relatedness of CLL stereotyped subsets #2 and #169, further highlighting the role of antigen selection throughout their natural history. These findings also argue for a similar pathophysiology for these subsets that could also be reflected in a similar clonal behavior, with implications for risk stratification.

Disclosures: Sutton: Abbvie: Honoraria; Gilead: Honoraria; Janssen: Honoraria. Stamatopoulos: Abbvie: Honoraria, Research Funding; Janssen: Honoraria, Research Funding. Chatzidimitriou: Janssen: Honoraria.