Tumor tag sequences selected with regard to the probability of generating the same sequence in an independent recombination event are likely to be useful in tracking mature B cell malignancies where the IGH locus has been deleted. We considered the use of IGK and IGL for this purpose, but found that 2-4% of these rearrangements were shared between different individuals. The relatively high frequencies of such public rearrangements at IGK and IGL prompted us to search for other forms of sequence diversity that could be used as private clonotypic tags. Somatic hypermutations (SHM) may serve this function. Our analysis of IGK and IGL suggests that tumor tracking sequences for detecting minimal residual disease (MRD) should be selected with care, and that these loci may be best suited to lymphoid malignancies characterized by high levels of SHM.

Tracking minimal residual disease for B cell malignancies is an established technology, traditionally using either flow cytometry or a custom quantitative PCR assay for each patient. Recent technical developments in the massively parallel sequencing of somatically rearranged IG loci allow a standard assay to be applied to screen for residual tumor burden in all patients: the clonal IG rearrangements tagging the tumor are first identified in an index sample taken from the patient during active disease, and follow-up samples are then screened for these tumor tagging sequences. A crucial assumption in these tagging strategies is that the tumor tagging sequences are idiosyncratic to the tumor, and unlikely to be generated independently in a recurrent rearrangement.

To screen for recurrent sequences between two healthy individuals, we generated IG heavy and light chain libraries from 100,000 antigen-experienced B cells (CD19+CD27+) isolated from whole blood by FACS. Reads of 130 bp were collected, starting within the J segment and extending across the CDR3 into the V segment. Unique sequences were compared between individuals to assess the frequency of nucleotide-identical, “public” rearrangements shared between individuals. Less than 0.01% of unique IGH sequences overlapped between individuals, so the risk of a false positive MRD result from recurrent recombination at IGH is minimal. However, 4.3% and 1.9% of unique sequences at IGK and IGL, respectively, were shared between individuals. The shared sequences had significantly higher average copy numbers than unshared sequences, accounting for 20% of total sequences at IGK and 12% of total sequences at IGL. These data suggest that B cells carrying public sequences undergo higher levels of clonal expansion, are recurrently produced, or both.
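The two overlap measures described above (the share of unique sequences that are public, and the share of total reads they account for) can be illustrated with a minimal Python sketch. The function name `public_overlap`, the toy reads, and the choice to compute both fractions relative to the first individual are assumptions made for illustration; the abstract does not state which denominator was used.

```python
from collections import Counter

def public_overlap(reads_a, reads_b):
    """Return (unique_fraction, total_fraction) for individual A.

    unique_fraction: share of A's unique sequences also seen in B.
    total_fraction:  share of A's total reads carried by those shared sequences.
    Both denominators are taken from A; this is an assumption of the sketch.
    """
    counts_a = Counter(reads_a)
    counts_b = Counter(reads_b)
    if not counts_a:
        raise ValueError("reads_a is empty")
    # Nucleotide-identical sequences present in both repertoires.
    shared = counts_a.keys() & counts_b.keys()
    unique_fraction = len(shared) / len(counts_a)
    total_fraction = sum(counts_a[s] for s in shared) / sum(counts_a.values())
    return unique_fraction, total_fraction

# Toy reads (not real data): one expanded sequence is shared between donors.
donor_a = ["AAA"] * 4 + ["CCC", "GGG", "TTT"]
donor_b = ["AAA", "AAA", "ACG"]
unique_fraction, total_fraction = public_overlap(donor_a, donor_b)
```

In this toy case one of four unique sequences is shared (0.25), yet it carries four of seven reads, which mirrors how a small set of public sequences with high copy numbers can account for a much larger share of total sequences.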

Public sequences carried by B cell malignancies are likely to be of limited utility as tumor-tagging sequences, as it may be impossible to distinguish between low-level residual disease and benign, recurrent rearrangements in the patient. Therefore, we assessed whether we could predict if a given sequence in the memory repertoire would be public using solely information derived from that sequence. We used logistic regression to screen for variables that predict the likelihood that a given sequence is public, and identified a number of expected variables as significant predictors, including the identity of the V and J segments, the length of the non-templated insertion at the junction, and the number of somatic hypermutations within the V or J segment. By far the most important of these factors was the number of SHM events in the clone; consequently, the most useful light chains for tumor tracking will be those with significant SHM.
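As a rough illustration of this kind of model, the sketch below fits a logistic regression by plain gradient descent on two of the named predictors, SHM count and insertion length. Everything in it is assumed for illustration: the feature encoding, the learning rate, the helper names `fit_logistic` and `predict_public`, and the synthetic training rows, which are not study data. The abstract does not describe the actual fitting procedure or software.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.05, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_public(w, b, x):
    """Modelled probability that a sequence with features x is public."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

# Synthetic rows (hypothetical): (SHM count, insertion length); label 1 = public.
X = [(0, 0), (0, 1), (1, 0), (1, 2), (2, 1),
     (5, 3), (6, 4), (8, 2), (10, 5), (12, 6)]
y = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

w, b = fit_logistic(X, y)
p_unmutated = predict_public(w, b, (0, 0))   # germline-like sequence
p_mutated = predict_public(w, b, (10, 5))    # heavily mutated sequence
```

On this synthetic set the fitted model assigns a higher public probability to the unmutated sequence than to the heavily mutated one. A categorical predictor such as V or J segment identity would need one-hot encoding before it could be added to the feature rows.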

We continue to explore factors contributing to the public IG repertoire. Particularly at IGK, there is an unexpectedly narrow range of CDR3 lengths, and we are determining whether this is attributable to low diversity in the primary repertoire or to positive selection in favor of this length in the mature naïve or mature repertoires. In conclusion, a high frequency of public IG light chain sequences in the antigen-experienced peripheral B cell repertoire suggests that naïve application of light chain clones for tracking MRD can generate false positive results, but that careful selection of tumor tracking sequences with SHMs can minimize this risk.


Carlson: Adaptive Biotechnologies: Consultancy, Equity Ownership, Patents & Royalties. Howie: Adaptive Biotechnologies: Employment, Equity Ownership.

Author notes


Asterisk with author names denotes non-ASH members.