Single‐molecule real‐time sequencing of the M protein: Toward personalized medicine in monoclonal gammopathies

To the Editor: Each patient with a monoclonal gammopathy has a unique monoclonal (M) protein, whose sequence can be used as a tumoral fingerprint to track the presence of the B cell or plasma cell (PC) clone itself. Moreover, the M protein can directly cause potentially life-threatening organ damage, which is dictated by each patient's unique clonal light and/or heavy chain amino acid sequence, as in patients affected by immunoglobulin light chain (AL) amyloidosis. However, patient-specific M protein sequences remain mostly undefined, and the molecular mechanisms underlying M protein-related clinical manifestations are largely obscure. We combined the unbiased amplification of expressed immunoglobulin genes through inverse PCR from circularized, double-stranded cDNA, using primers annealing to the constant regions of immunoglobulin genes, with single-molecule, real-time, long-read DNA sequencing and with bioinformatics and immunogenetic analyses (Online Methods, Figures S1, S2, Table S1). The resulting methodology, termed Single-Molecule Real-Time Sequencing of the M protein (SMaRT M-Seq), identifies the full-length sequence of the variable region of expressed immunoglobulin genes and ranks the obtained sequences based on their relative abundance, thus enabling the identification of the full-length variable sequence of light and/or heavy chains from a high number of patients analyzed in parallel. SMaRT M-Seq has undergone appropriate technical validation (Table S2).
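The ranking step described above can be illustrated with a minimal Python sketch, which counts identical variable-region reads and orders them by relative abundance. This is only an illustration under simplifying assumptions: the function name `rank_by_abundance` and the toy reads are hypothetical, reads are compared by exact string identity, and the actual SMaRT M-Seq pipeline (Online Methods) is not reproduced here.

```python
from collections import Counter


def rank_by_abundance(reads):
    """Return (sequence, fraction_of_reads) pairs, most abundant first.

    Hypothetical helper: reads are treated as identical only when their
    strings match exactly, which ignores sequencing errors.
    """
    if not reads:
        return []
    counts = Counter(reads)
    total = len(reads)
    # Counter.most_common() yields entries sorted by descending count.
    return [(seq, n / total) for seq, n in counts.most_common()]


# Toy data (invented for illustration): one dominant sequence plus background.
toy_reads = ["CAGTCT"] * 8 + ["GACATC"] * 1 + ["GAAATT"] * 1
ranking = rank_by_abundance(toy_reads)
```

In this sketch the first entry of `ranking` would play the role of the candidate clonal sequence, with its fraction serving as a crude abundance estimate.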
Sequencing of contrived bone marrow (BM) samples generated through serial dilutions of κ- or λ-expressing PC lines into control BM, as well as sequencing of replicate, bona fide BM samples from AL patients and comparison with gold-standard techniques of immunoglobulin gene cloning and sequencing, showed: (i) 100% sequence accuracy at the individual base-pair level; (ii) high repeatability (coefficient of variation <0.8% for sequencing of pentaplicate BM samples) in defining the molecular clonal size (i.e., the fraction of total immunoglobulin sequences coinciding with the clonal sequence); (iii) a high sensitivity in identifying clonal immunoglobulin sequences (10 –10 3 when employing low-coverage sequencing on multiple, pooled samples) (Appendix S1, Figures S3–S5). To further extend the technical validation of the methodology and assess its throughput, we employed SMaRT M-Seq for the identification of clonal immunoglobulin sequences from BM mononuclear cells of a cohort of 89 consecutive patients with a diagnosis or a suspicion of systemic AL amyloidosis, analyzed in parallel in one sequencing round (Figure S6). In 6 of these patients, comparison with standard cloning and sequencing approaches confirmed 100% identity with the sequence obtained by SMaRT M-Seq (Figure S7). In addition, 3 of these patients were analyzed in duplicate with SMaRT M-Seq, and the sequence-based molecular clonal sizes of the two technical replicates were highly comparable (Figure 1). These results further confirm the accuracy and repeatability of the method even when the assay is used to analyze a higher number of samples in parallel. Of the 89 sequenced patients, a final diagnosis of systemic AL amyloidosis could be established in 84 patients, including 5 cases with undetectable M protein by means of conventional M protein studies (Figure S8, Table S3).
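To make the two repeatability quantities above concrete, the following Python sketch computes the molecular clonal size as the fraction of immunoglobulin reads matching the clonal sequence, and the coefficient of variation across replicates. The read counts are invented for illustration, the function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def molecular_clonal_size(n_clonal_reads, n_total_ig_reads):
    """Fraction of total immunoglobulin reads that match the clonal sequence."""
    if n_total_ig_reads <= 0:
        raise ValueError("total immunoglobulin read count must be positive")
    return n_clonal_reads / n_total_ig_reads


def coefficient_of_variation(values):
    """Coefficient of variation in percent (sample SD / mean; an assumption)."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical (clonal reads, total Ig reads) for five replicates of one sample.
replicate_counts = [(883, 1000), (880, 1000), (886, 1000), (881, 1000), (885, 1000)]
sizes = [molecular_clonal_size(c, t) for c, t in replicate_counts]
cv_percent = coefficient_of_variation(sizes)
```

With real data, `replicate_counts` would come from the per-replicate sequencing output; a small `cv_percent` indicates that the estimated clonal size is stable across replicates.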
Of note, SMaRT M-Seq identified a dominant immunoglobulin LC sequence in all 84 patients (but not in patients analyzed in parallel in whom a monoclonal gammopathy was eventually excluded, Figure S9). The median molecular clonal size was 88.3% (IQR: 70.7%–93%) (Figure 1) and showed a significant correlation with the percentage of BM-PC infiltrate and with serum free LC levels (p < 0.0001 in each case) (Figure S10). Patients' clonal sequences proved to be unique (Figure S11). Germline gene usage was in agreement with the expectations for a population of patients with AL amyloidosis (Figure S12) and correlated with selected clinical features (Figure S13). As an additional way to verify the accuracy of the methodology in identifying the clonal, expressed LC, we compared the sequencing results obtained with SMaRT M-Seq on BM samples with proteomics data from matched, amyloid-containing fat tissues for 4 patients. In all cases, the expected clonal LC variable sequence as assessed by SMaRT M-Seq was the potentially amyloidogenic protein with the highest sequence coverage and ranked first by a wide margin, in terms of unique peptides identified, compared with other published immunoglobulin LCs (Figure S14). Collectively, these data show that SMaRT M-Seq performed on a high number of BM samples from patients with monoclonal gammopathies analyzed in parallel can accurately and reproducibly identify a clonal immunoglobulin LC sequence in all instances, even in cases with low BM B cell/PC clonal burden and with undetectable M protein by means of conventional diagnostic techniques. We then investigated whether the full-length variable sequence information attainable at diagnosis using SMaRT M-Seq, together with inverse PCR coupled to short-read sequencing, might enable the detection of low-level, residual clonotypic sequences, as in the context of minimal residual disease (MRD) assessment.
Using contrived BM samples mimicking progressively smaller plasma cell clones, a

Received: 2 March 2022; Revised: 3 August 2022; Accepted: 5 August 2022