Chronic lymphocytic leukemia (CLL) is a monoclonal B-cell disorder that derives from a mature lymphocyte and falls into two groups by IGHV mutation status: IGHV-unmutated (U-CLL) and IGHV-mutated (M-CLL). Because in the vast majority of patients the IGHV mutation status of the clinically appreciated clonal sequence remains the same over time when defined by Sanger sequencing, it has been assumed that IGHV-D-J intraclonal diversification does not occur. However, a few studies using Sanger sequencing have indicated that the leukemic clone is not monolithic and can display varying levels of intraclonal heterogeneity (Bagnara et al., 2006; Gurrieri et al., 2002; Volkheimer et al., 2007). Since IGHV-D-J intraclonal diversity is a function of the frequency of variants in relation to the main clone, we used a high-throughput deep sequencing technique to robustly assess intraclonal diversification and evolution of the CLL B lymphocyte. This approach also allowed us to identify and analyze the IGH repertoire of the remaining normal CD5-expressing B lymphocytes in the blood.

From the PBMCs of 37 CLL patients, CD5+ B cells were isolated, and full IGHV-D-J rearrangements were amplified from mRNA and sequenced on the Illumina MiSeq platform. Unique molecular identifiers (UMIs) were incorporated before PCR amplification to eliminate the effects of PCR errors, thereby providing more quantitative data and rigorous error correction. The latter yields high-quality sequences that permit accurate study of the somatic mutations indicative of intraclonal diversification. Raw reads were processed with a custom pipeline built with pRESTO (Vander Heiden et al., 2014); IGHV, IGHD, and IGHJ gene attribution was performed with IMGT/HighV-QUEST; clonal families were identified with Change-O (Gupta et al., 2015); and data were analyzed with custom R scripts. For clarity, we refer to the IGHV-D-J rearrangement identified at diagnosis by Sanger sequencing as the "CLL clonotype" and to the precise nucleotide sequence of this IGHV-D-J rearrangement as the "CLL clone". IGHV-D-J rearrangements other than the leukemic one are referred to as "non-leukemic clonotypes".
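
The idea behind UMI-based error correction can be illustrated with a minimal Python sketch. It is not the pRESTO pipeline used in the study: the function name `umi_consensus`, the `min_reads` cutoff, and the assumption that reads sharing a UMI are already aligned to the same start position are all hypothetical simplifications. Reads carrying the same UMI derive from one mRNA molecule, so a per-position majority vote removes most PCR and sequencing errors.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_reads=2):
    """Collapse reads that share a UMI into one consensus sequence per UMI.

    reads: iterable of (umi, sequence) pairs. Sequences under one UMI are
    assumed to be pre-aligned (hypothetical simplification).
    min_reads: UMI groups with fewer reads are discarded (illustrative cutoff).
    """
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few reads to outvote an error
        length = min(len(s) for s in seqs)  # compare only the shared prefix
        # Majority vote at each position across all reads of this UMI.
        consensus[umi] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


reads = [
    ("AAAC", "ACGT"),
    ("AAAC", "ACGT"),
    ("AAAC", "ACTT"),  # one read with an error at the third base
    ("GGGT", "TTTT"),  # singleton UMI, dropped by min_reads=2
]
result = umi_consensus(reads)
```

Because each UMI contributes a single consensus sequence, counts of consensus sequences approximate counts of original transcripts rather than PCR copies, which is what makes variant frequencies quantitative.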

We observed that CLL clonotypes studied ex vivo without any stimulation exhibited substantial intraclonal diversity of the leukemic IGHV-D-J rearrangement, indicating ongoing somatic mutation. This was the case for both U-CLL and M-CLL clones. Moreover, 84% of M-CLL cases evolved from a precursor differing in IGHV mutational status. Another indication of clonal evolution was the branching complexity of the phylogenetic trees within the evolving clone. In ~40% of the studied cases, the subclonal variants exhibited a simple branching structure, while in the remaining ~60% a number of subclones formed a complex branching structure. Although this intraclonal diversity was substantial, in most instances the CLL clone considerably outnumbered the variants. However, in ~30% of CLL cases, we identified at least 1 expanded subclonal variant of the CLL clone that represented ≥1% of the leukemia-related IGHV-D-J sequences. Notably, these CLL patients experienced the shortest times to first treatment (P < 0.01).
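
The ≥1% criterion for an expanded subclonal variant can be sketched as follows. This is an illustration under stated assumptions, not the study's R code: the function name `expanded_subclones` is hypothetical, the input is taken to be one UMI-corrected sequence per transcript from the leukemic clonotype, and the CLL clone is approximated as the most abundant sequence (in the study it is defined by Sanger sequencing at diagnosis).

```python
from collections import Counter


def expanded_subclones(sequences, threshold=0.01):
    """Return the dominant sequence and the variants at or above `threshold`.

    sequences: list of IGHV-D-J nucleotide sequences, one per transcript,
    all belonging to the leukemic clonotype (assumed input format).
    threshold: minimum fraction of leukemia-related sequences (0.01 = 1%).
    """
    counts = Counter(sequences)
    total = sum(counts.values())
    if total == 0:
        return None, {}
    # Assumption: the CLL clone is the most abundant exact sequence.
    main_clone = counts.most_common(1)[0][0]
    expanded = {
        seq: n / total
        for seq, n in counts.items()
        if seq != main_clone and n / total >= threshold
    }
    return main_clone, expanded


# Toy data: 190 copies of the main clone, one variant at 4%, two at 0.5%.
seqs = ["ACGT"] * 190 + ["ACGA"] * 8 + ["ACGC"] + ["ACGG"]
main_clone, expanded = expanded_subclones(seqs)
```

In this toy example only `ACGA` passes the 1% threshold; the two singleton variants contribute to intraclonal diversity but do not count as expanded subclones.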

Among the CD5+ B cells with non-leukemic clonotypes, we identified a significant expansion of one or more clonotypes. In ~23% of CLL cases, the single most expanded non-CLL clonotype represented ≥1% of the total. Moreover, these expanded clonotypes could represent up to 40% of the total IGHV-D-J transcripts sequenced. Patients in whom the CD5+ B cell-derived, non-leukemic clonotypes represented ≥0.75% of the total transcripts had a longer time to first treatment (P < 0.02). Finally, the CD5+ B cell-derived, non-leukemic clonotypes matched known CLL stereotyped IGHV-D-J sequences at a frequency 10 times higher than that found in normal PBMCs.

In summary, we document that CLL clones diversify in vivo, accumulating IGHV-D-J mutations that generate a degree of clonal complexity and subclonal expansion not previously appreciated. This diversification occurs in both U-CLL and M-CLL, correlates directly with time to first treatment, and may serve as an indicator of DNA aberrations occurring genome-wide ("genetic instability"). It likely reflects the action of the DNA mutator activation-induced deaminase, which can also lead to genomic changes outside V-gene loci. Finally, the common presence of secondary expanded clones unrelated to the CLL clone also has prognostic significance and may give insight into the development of CLL.


No relevant conflicts of interest to declare.

Author notes


Asterisk with author names denotes non-ASH members.