Introduction: Mature normal and tumor B cells express a unique rearranged immunoglobulin (IG) gene that can be used as a marker of the clonal expansion of the cell. Somatic hypermutations (SHM) in the V(D)J region of IG genes are acquired in the germinal center and are a surrogate imprint of the cell of origin of lymphoid neoplasms. In chronic lymphocytic leukemia (CLL) and mantle cell lymphoma (MCL), the identification of SHM distinguishes tumor subtypes with different clinical and biological behavior. Although less studied, highly similar IG sequences (i.e., stereotyped IG), specific light chain (LC) rearrangements, and class switch recombination (CSR) of the heavy chain constant region further sub-classify patients into potentially distinct clinico-biological subgroups. The rearranged IG gene is currently analyzed by specific Sanger sequencing (SSeq) or next-generation sequencing protocols. Whole-genome sequencing (WGS) data of B-cell neoplasms should contain the information needed to reconstruct the entire rearranged IG gene [heavy chain (IGH) and kappa or lambda LC (IGK, IGL)]. However, the high genomic complexity and homology of these regions have prevented the analysis of rearranged IG genes in WGS data with standard bioinformatics pipelines.
Aim: To assess the use of WGS data to fully characterize the rearranged IG gene in B-cell neoplasms.
Methods: We developed IgCaller, a fast, easy-to-run program that uses already aligned WGS data to dissect the rearranged IGH V(D)J genes and the IGK and IGL VJ genes, and to detect CSR of the heavy chain constant region. IgCaller also determines the homology of the rearranged sequences to the patient's germ line or to the reference genome. We assessed the accuracy of IgCaller using WGS data of 331 B-cell neoplasms [240 CLL (152 from cohort 1, C1-CLL; 88 from an independent cohort 2, C2-CLL), 61 MCL, and 30 multiple myeloma (MM)] and compared the results with SSeq of the IGH V(D)J and/or LC and with isotype expression.
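As an illustration of the homology measure described above, the following minimal Python sketch computes the percent identity between a rearranged sequence and its germ line counterpart and applies the 98% cutoff used in the Results. It is not IgCaller's code: the function names `percent_identity` and `ighv_status` are hypothetical, the two sequences are assumed to be already aligned to the same length, and the direction of the cutoff (identity of 98% or higher called unmutated) follows the common CLL convention and is an assumption here. The toy sequences are invented.

```python
def percent_identity(rearranged: str, germline: str) -> float:
    """Percent identity over aligned positions, skipping gaps ('-') and 'N'.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(rearranged) != len(germline):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = 0
    matches = 0
    for r, g in zip(rearranged.upper(), germline.upper()):
        if r in "-N" or g in "-N":
            continue  # uninformative position, not counted
        compared += 1
        if r == g:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def ighv_status(identity: float, cutoff: float = 98.0) -> str:
    """Assumed convention: identity >= cutoff is 'unmutated', otherwise 'mutated'."""
    return "unmutated" if identity >= cutoff else "mutated"


if __name__ == "__main__":
    # Invented 100-nt toy sequences with a single mismatch at the first base.
    germline = "ACGT" * 25
    rearranged = "C" + germline[1:]
    identity = percent_identity(rearranged, germline)
    print(f"{identity:.1f}% identity: {ighv_status(identity)}")
```

A case whose identity sits close to 98% can change class when only part of the V gene is available, since fewer positions are compared.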
Results: IgCaller identified a complete productive IGH rearrangement [V(D)J] in 133 (88%) C1-CLL, 80 (91%) C2-CLL, 61 (100%) MCL, and 21 (70%) MM. A partial (VJ) rearrangement was detected in 8 (5%) C1-CLL and 1 (3%) MM. SSeq of the V(D)J, or at least of the V gene, was available for 131 C1-CLL, 10 C2-CLL, and 60 MCL successfully characterized by WGS, and revealed only one discordant V(D)J rearrangement. Minor discrepancies (disagreement in only the J or the V gene) occurred when the J (n=4) or V (n=1) gene assigned by SSeq based on homology (IMGT/V-QUEST tool) did not match the rearranged gene detected by WGS. In these cases the WGS-detected gene was the second-highest-scoring gene in IMGT/V-QUEST, suggesting that our non-homology-based WGS approach might be more accurate. Of note, IgCaller identified two distinct IGH subclones in 1 case. Next, we compared the percentage of homology of the rearranged sequence to the germ line between SSeq and WGS in the 131 C1-CLL and 60 MCL with a complete V gene by both methods. The two measurements showed high correlation and concordance in both cohorts [R>0.95, p<1e-30; Passing-Bablok regression: 0.05+1*SSeq (CLL), -0.19+1*SSeq (MCL)]. Using the 98% cutoff, only 1 CLL patient was classified differently as mutated or unmutated by SSeq (complete rearrangement) and WGS (partial rearrangement).
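The Passing-Bablok fits quoted above are written as intercept + slope*SSeq, so a slope of 1 with an intercept near 0 indicates that the two methods report nearly the same identity. The sketch below shows how such point estimates can be obtained: the slope is the shifted median of all pairwise slopes and the intercept is the median of the residual offsets. It is not the authors' analysis code; the function name `passing_bablok` is hypothetical, confidence intervals and the linearity test are omitted, ties in x are handled in a simplified way, and the data are invented.

```python
import math
from itertools import combinations
from statistics import median


def passing_bablok(x, y):
    """Return (intercept, slope) point estimates of a Passing-Bablok fit."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    slopes = []
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x2 - x1, y2 - y1
        if dx == 0 and dy == 0:
            continue  # identical points carry no slope information
        if dx == 0:
            # Simplification: vertical pairs get an infinite slope signed by dy.
            slopes.append(math.inf if dy > 0 else -math.inf)
            continue
        slope = dy / dx
        if slope == -1:
            continue  # slopes of exactly -1 are excluded by the method
        slopes.append(slope)
    if not slopes:
        raise ValueError("no usable pairwise slopes")
    slopes.sort()
    n = len(slopes)
    k = sum(1 for s in slopes if s < -1)  # offset that shifts the median
    upper = n // 2 + k
    if upper >= n:
        raise ValueError("too many slopes below -1 for a shifted median")
    if n % 2:
        b = slopes[upper]
    else:
        b = 0.5 * (slopes[upper - 1] + slopes[upper])
    a = median(yi - b * xi for xi, yi in zip(x, y))
    return a, b


if __name__ == "__main__":
    # Invented identity values (%): 'sseq' on the x axis, 'wgs' on the y axis.
    sseq = [91.0, 93.5, 95.0, 97.5, 99.0, 100.0]
    wgs = [91.5, 93.5, 95.5, 97.0, 99.0, 100.0]
    intercept, slope = passing_bablok(sseq, wgs)
    print(f"WGS = {intercept:.2f} + {slope:.2f}*SSeq")
```

Unlike ordinary least squares, this estimator does not assume that either method is error-free, which is why it is commonly used for method-comparison studies.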
A productive IGK or IGL rearrangement was found in 147 (97%) C1-CLL, 85 (97%) C2-CLL, 61 (100%) MCL, and 26 (87%) MM. These results fully agreed with the LC expression observed by flow cytometry (FC). Of clinical relevance, 26/232 (11%) CLL carried IGLV3-21. The presence of IGLV3-21 was associated with a shorter time to first treatment (TTFT) independently of IGHV status (p=0.045).
Finally, IgCaller identified the CSR matching the isotype expressed by FC analysis in 26/30 (87%) MM. One of the 4 discordant cases with IgM by WGS expressed only LC by FC. Overall, IgCaller could not identify the isotype switch in 3 (10%) MM expressing IgG or IgA. In CLL, CSR was observed in 47/240 (20%) cases. Notably, the presence of CSR identified 24% of patients with mutated IGHV CLL (M-CLL) who had a shorter TTFT than non-switched M-CLL (p=0.004), similar to that of unmutated IGHV cases (p=0.15).
Conclusions: IgCaller successfully characterized the entire IG gene in >90% of the B-cell neoplasms studied. The complete characterization of the rearranged IG gene based on WGS data, when available, could facilitate the analysis of LC rearrangements and CSR, and could replace traditional SSeq of the IG loci in both research and clinical settings.
No relevant conflicts of interest to declare.