Low-burden TP53 mutations in CLL: clinical impact and clonal evolution within the context of different treatment options

Patients with chronic lymphocytic leukemia (CLL) with TP53 mutations with a >10% variant allele frequency (VAF) are often refractory to chemotherapy and benefit from targeted therapy. Malcikova and colleagues correlated TP53 mutations with <10% VAF with clinical outcomes. Low-level TP53 mutations also have a negative association with survival, with data suggesting that chemotherapy, but not targeted therapy, enables clonal expansion of TP53-mutant clones. If confirmed in prospective studies, this suggests that low-burden TP53-mutant CLL should be considered for up-front targeted therapy.


Pre-treatment cohort
The patient cohort included all CLL patients entering first-line treatment with chemo/immunotherapy between the years 2008-2017 with available material (n=513). Two patients receiving targeted therapy as first-line treatment were excluded from further analyses. The median time between sampling and treatment initiation was 5 days (range 0-1806; 91% of samples taken <6 months before treatment).
Follow-up period ended in February 2020. Median follow-up from treatment initiation was 3.8 years, for living patients 4.4 years. The study was primarily designed to assess differences in event-free survival (EFS) after treatment between cases harboring a wild-type TP53 gene and cases harboring low-burden TP53mut subclones. Based on our pilot study including patients treated between the years 2008-2010, the prevalence of low-burden TP53-mutated subclones in CLL was 16.9% (20/118). Assuming that small TP53 mutated subclones occur in at least 15% of the CLL patients and an EFS event occurs in 75% of all cases, we estimated that 500 patients would allow detecting at least a 50% difference in median survival time between patients harboring a wild-type TP53 gene and patients harboring low-burden TP53 mutated subclones with power 0.8 and two-sided alpha 0.05.
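The sample size reasoning above can be approximated with Schoenfeld's formula for the log-rank test. The sketch below is not the calculation stated by the authors. It assumes exponential survival, so that a 50% difference in median survival corresponds to a hazard ratio of 1.5, and it takes the 15% prevalence and the 75% event fraction from the text. The function and parameter names are illustrative.

```python
import math
from statistics import NormalDist


def schoenfeld_sample_size(hazard_ratio, prevalence, event_fraction,
                           alpha=0.05, power=0.8):
    """Approximate total sample size for a two-sided log-rank test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Required number of events (Schoenfeld approximation) for two groups
    # of unequal size: prevalence vs. (1 - prevalence).
    events = (z_alpha + z_beta) ** 2 / (
        prevalence * (1 - prevalence) * math.log(hazard_ratio) ** 2
    )
    # Inflate by the expected fraction of patients who experience an event.
    return events / event_fraction


# Assumed inputs: HR 1.5 (our reading of "50% difference in median survival"),
# 15% low-burden TP53mut prevalence, 75% of patients with an EFS event.
n_patients = schoenfeld_sample_size(hazard_ratio=1.5, prevalence=0.15,
                                    event_fraction=0.75)
print(math.ceil(n_patients))
```

Under these assumptions the estimate lands close to the 500 patients quoted above; a different reading of the effect size (for example, a halved median, i.e. a hazard ratio of 2) would give a considerably smaller number.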
A sub-analysis was performed on diagnostic samples. In 168 patients (32.9%), the pre-therapy sample corresponded to a diagnostic sample because therapy was initiated shortly after diagnosis. Among the remaining patients, whose pre-therapy samples were taken >6 months from diagnosis, diagnostic samples were available for 94 patients. The median time between diagnosis and sampling was 1.3 months (range: 0-7.1). In these patients, the median time to first treatment (TTFT) was 26.5 months (range: 7.9-141.6 months), which corresponded to the TTFT of the whole cohort (median: 22.2 months; range: 0-246.6 months); however, owing to the unavailability of samples from patients diagnosed before 2003, it was significantly shorter than in patients with no diagnostic sample available (median: 42.1 months; range: 7.1-246.5 months; p=0.0020).

Relapsed cohort treated with novel agents
The cohort included 170 relapsed/refractory (R/R) patients entering treatment with BCR or BCL2 inhibitors between 07/2013 and 07/2019. In all patients with available material, NGS of the TP53 gene was performed. Sample purity was assessed using flow cytometry measurements of CD5, CD19 and CD45 expression and exceeded 95% of CLL cells in 98% of tested samples.
To isolate residual malignant cells from samples obtained during disease remission, we employed immunomagnetic separation using the Whole Blood Anti-ROR1 MicroBead Kit (Miltenyi Biotec) 1.
In relapsed samples with low lymphocyte counts (<1x10^9/l) and in samples obtained during targeted [...]

Genome-wide detection of copy number gains and losses and of copy-neutral loss of heterozygosity (cnLOH) was carried out using CytoScan HD arrays (ThermoFisher Scientific). The limit of detection of the method was estimated at ~10% occurrence of the aberration. Array results were evaluated following 3,4.

TP53 mutation analysis by deep next-generation sequencing
TP53 mutation analysis was performed on 30 ng of DNA isolated from purified B-lymphocytes using amplicon ultra-deep NGS of exons 2-11, including splice sites (Supplementary Table 2) [...]
Overlapping paired-end reads were merged (with default settings, minimum score 20) and trimmed (with default settings, quality limit 0.05, number of nucleotides between 15 and 500). Read alignment was performed on the unmasked human reference genome build GRCh37, patch release 9, with default parameters and a similarity fraction of 0.8. Local realignment and quality-based variant calling were applied.
Variants below quality score 30 (Q<30) were filtered out. Further details regarding CLC settings are available upon request. ANNOVAR software was used to annotate variants with gene and exonic function, exon number, position in cDNA and amino acid change (RefSeq) 8. The reference sequence used for annotation was NM_000546.5. Recurrent sequencing and alignment errors and common exon polymorphisms were removed from further analysis by manual curation.

Specificity and Sensitivity of the TP53 NGS Approach
Eight primary CLL samples containing eight TP53 variants (Supplementary Table 3) were mixed with DNA isolated from peripheral blood leukocytes of a young, healthy donor to reach a VAF of ~1-5%. The prepared sample was serially diluted 1:1 four times into donor DNA, yielding 50%, 25%, 12.5% and 6.25% of the original mixture. Ten aliquots (replicates) from each dilution were used for library preparation and sequenced on MiSeq (MiSeq v2 300 cy). To reach sufficient coverage (≥10 000 per exonic and splice-site base), the libraries were sequenced in two runs: 1st run, undiluted sample (100%) and the 25%, 12.5% and 6.25% dilutions; 2nd run, the 50% dilution. No difference between the runs was observed.
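As a small worked example of the dilution arithmetic, the sketch below lists the VAF expected at each step of a 1:1 serial dilution. It assumes that the donor DNA is wild-type at the variant position and that both DNAs are mixed at equal concentration; the starting VAF of 2% is a hypothetical value within the ~1-5% range given above.

```python
def expected_vafs(start_vaf, n_dilutions=4):
    """Expected VAF (%) of the undiluted mixture and after each 1:1 dilution."""
    # Each 1:1 dilution into wild-type DNA halves the variant fraction.
    return [start_vaf / 2 ** step for step in range(n_dilutions + 1)]


# Percent of the original mixture at each step: 100, 50, 25, 12.5, 6.25.
mixture_percent = [100 / 2 ** step for step in range(5)]

# Hypothetical variant starting at 2% VAF.
vafs = expected_vafs(2.0)
for percent, vaf in zip(mixture_percent, vafs):
    print(f"{percent:>6}% of mixture -> expected VAF {vaf}%")
```

With a 1-5% starting VAF, the last dilution step is therefore expected to fall roughly between 0.06% and 0.3% VAF, which is the range where the detection limit is being probed.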
The results were analyzed with the pipeline used routinely. If a variant was not observed above the 0.05% threshold, VAF was set to 0 as variants with VAF below 0.05% were not reported in the pipeline output file.
The median and the 1% and 99% quantiles were calculated from the ten replicates for each variant and each dilution and plotted (Supplementary Figure 2A). The median allelic frequency value at which the 1% quantile curve fell below 0.1% VAF was then determined. For all tested variants, this value was ≤0.3%.
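The replicate summary described above can be sketched as follows. The VAF values are invented for illustration, and the quantile interpolation rule (Python's "inclusive" method) is an assumption because the text does not name one. For simplicity, the sketch reports the first dilution step whose 1% quantile drops below 0.1% rather than interpolating along the curve.

```python
from statistics import median, quantiles

REPORT_THRESHOLD = 0.05  # % VAF; variants below this are absent from the pipeline output
DETECTION_LINE = 0.1     # % VAF; reference level for the 1% quantile


def summarize_replicates(vafs):
    """Return (median, 1% quantile, 99% quantile) of replicate VAFs in %."""
    # Unreported variants are counted as VAF 0.
    cleaned = [v if v >= REPORT_THRESHOLD else 0.0 for v in vafs]
    cuts = quantiles(cleaned, n=100, method="inclusive")  # 99 cut points
    return median(cleaned), cuts[0], cuts[98]


# Hypothetical replicate VAFs (%) for one variant, keyed by % of the original mixture.
series = {
    100: [2.10, 1.95, 2.05, 1.90, 2.00, 2.15, 1.85, 2.00, 2.05, 1.95],
    50: [1.05, 0.95, 1.10, 0.90, 1.00, 1.00, 0.85, 1.15, 0.95, 1.05],
    25: [0.55, 0.45, 0.50, 0.40, 0.60, 0.50, 0.35, 0.65, 0.45, 0.55],
    12.5: [0.30, 0.20, 0.25, 0.15, 0.35, 0.25, 0.08, 0.40, 0.22, 0.28],
    6.25: [0.15, 0.10, 0.12, 0.03, 0.20, 0.13, 0.00, 0.18, 0.09, 0.14],
}

summary = {dilution: summarize_replicates(v) for dilution, v in series.items()}

# Dilution steps whose 1% quantile falls below the 0.1% VAF line.
below = [d for d, (_, q01, _) in summary.items() if q01 < DETECTION_LINE]
first_below = max(below, default=None)  # least diluted step that fails
if first_below is not None:
    print(first_below, summary[first_below][0])
```

In this invented series the 12.5% step is the first to fail, so its median VAF would be taken as the approximate level below which a variant can no longer be reliably separated from 0.1%.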
Supplementary Figure 2B shows the variability within the groups of 10 replicates, expressed as the interquartile range divided by the median allelic frequency. The variability increases with decreasing median allelic frequency. This indicates that the allelic frequency of low-VAF variants was affected by stochastic events, which makes it impossible to precisely assess an increase or decrease in VAF from two samplings only.
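The variability measure, interquartile range divided by the median, can be illustrated with a short sketch. The two replicate groups are hypothetical, and the quartile interpolation rule is again an assumption.

```python
from statistics import median, quantiles


def relative_iqr(vafs):
    """Interquartile range divided by the median; None when the median is zero."""
    q1, _, q3 = quantiles(vafs, n=4, method="inclusive")
    med = median(vafs)
    if med == 0:
        return None  # ratio undefined; such groups cannot be plotted
    return (q3 - q1) / med


# Hypothetical replicate VAFs (%) around 2% and around 0.125%.
high_vaf = [2.10, 1.95, 2.05, 1.90, 2.00, 2.15, 1.85, 2.00, 2.05, 1.95]
low_vaf = [0.15, 0.10, 0.12, 0.06, 0.20, 0.13, 0.07, 0.18, 0.09, 0.14]

print(relative_iqr(high_vaf), relative_iqr(low_vaf))
```

In these invented data the relative spread at the low VAF is several times larger than at 2%, which is the pattern the text describes.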
Further, we confirmed that the approach we used, i.e. re-testing the affected exon for variants below 1% VAF, allowed us to avoid false-positive results. For each dilution, we recorded variants ≥0.1% other than those listed in Supplementary Table 3. In total, three such variants were identified (c.1106T>C p.Leu369Pro 0.11%; c.19G>T p.Asp7Tyr 0.11%; c.626G>C p.Arg209Thr 0.11%). None of the variants was identified twice in the set of 10 replicates, leading to the conclusion that such false-positive variants would be excluded by repeated analysis with high probability.
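The logic of excluding non-reproducible calls can be sketched as a simple recurrence filter. This is an illustration, not the authors' pipeline: the data layout, the function name and the replicate in which each unexpected variant appears are hypothetical, and `c.743G>A` is only a placeholder for the spiked-in variants of Supplementary Table 3.

```python
from collections import Counter


def recurrent_variants(replicate_calls, expected, min_vaf=0.1, min_replicates=2):
    """Return unexpected variants seen at >= min_vaf in >= min_replicates replicates."""
    counts = Counter()
    for calls in replicate_calls:
        for variant, vaf in calls.items():
            if variant not in expected and vaf >= min_vaf:
                counts[variant] += 1
    return {variant for variant, n in counts.items() if n >= min_replicates}


expected = {"c.743G>A"}  # hypothetical placeholder for the spiked-in variants

# Ten replicates; each unexpected variant from the text appears in one replicate only.
replicates = [{"c.743G>A": 0.30} for _ in range(10)]
replicates[0]["c.1106T>C"] = 0.11
replicates[3]["c.19G>T"] = 0.11
replicates[7]["c.626G>C"] = 0.11

confirmed = recurrent_variants(replicates, expected)
seen_once = recurrent_variants(replicates, expected, min_replicates=1)
print(confirmed, seen_once)
```

Requiring a second independent observation removes all three spurious calls here, while a single-replicate rule would have reported them.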
Overall, it was not possible to assess a general detection limit, as this value is position- and variant-specific.
We presume that, with the exception of several recurrent sequencing and alignment errors (e.g. indels not affecting common hotspot regions), our methodological approach allowed us to avoid false-positive and false-negative results above ≈0.1% and ≈0.3% VAF, respectively, for most variants.

Whole exome sequencing
Raw sequencing data in FASTQ format were processed using the bcbio pipeline manager version 1.2.3 9.
The pipeline consists of read trimming, performed by the Atropos tool 10; read alignment to the human reference genome GRCh38, performed with bwa mem 11, samtools 12 and sambamba 13; and somatic variant calling, performed by the mutect2 14, strelka2 15 and vardict 16 variant callers. The resulting variants were annotated using the VEP 17 annotation software version 100.2. The resulting annotated VCF files were converted to table format using an in-house conversion script.
All detected somatic variants were manually filtered and inspected in the respective BAM files using IGV 18 software. Using the k-means algorithm, mutations were clustered into groups based on their VAFs at the given time points and the VAF differences between consecutive time points. The clonal composition and subclone proportions were visualized as "fish plots" with the fishplot R package 19.
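The clustering step can be illustrated with a minimal k-means sketch on the feature vector described above (VAFs at each time point plus differences between consecutive time points). The implementation, the number of clusters, the absence of feature scaling and the mutation data are all assumptions for illustration; the text does not specify these details. Initial centroids are simply the first k points so that the example is deterministic.

```python
import math


def trajectory_features(vafs):
    """VAFs at each time point plus differences between consecutive time points."""
    diffs = [b - a for a, b in zip(vafs, vafs[1:])]
    return list(vafs) + diffs


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm; initial centroids are the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical VAFs (%) at three time points.
mutations = {
    "mut_A": [45.0, 44.0, 46.0],  # stable
    "mut_B": [2.0, 15.0, 40.0],   # expanding
    "mut_C": [20.0, 8.0, 1.0],    # shrinking
    "mut_D": [43.0, 45.0, 44.0],  # stable
    "mut_E": [1.0, 12.0, 38.0],   # expanding
    "mut_F": [22.0, 10.0, 2.0],   # shrinking
}

points = [trajectory_features(v) for v in mutations.values()]
labels = kmeans(points, k=3)
print(dict(zip(mutations, labels)))
```

Mutations with similar trajectories end up with the same label, and each cluster can then be treated as one candidate subclone when drawing a fish plot.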

Statistical analyses
All statistical analyses were performed in R 20. P values <0.05 were considered statistically significant. The Cox model and the log-rank test were applied to assess differences in patient survival; survival was visualized using Kaplan-Meier curves 21. Event-free survival (EFS) was measured from treatment initiation to any of the following events: progression, therapy change, or death from any cause. Overall survival (OS) was measured from treatment initiation to death from any cause. The Benjamini-Hochberg correction was used to adjust for multiple comparisons in pairwise comparisons of survival between groups 22. The Fisher exact test was used to assess associations between categorical variables. The Mann-Whitney test was used to assess associations between categorical and continuous variables.
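For readers unfamiliar with the Benjamini-Hochberg adjustment, the sketch below shows how adjusted p-values are obtained from a set of pairwise comparisons. The analyses in the study were run in R; this Python version only illustrates the procedure, and the six p-values are hypothetical (for example, all pairwise log-rank tests among four groups).

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from six pairwise comparisons.
raw = [0.001, 0.04, 0.03, 0.20, 0.012, 0.5]
adjusted = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

In this example, two comparisons that are nominally significant (raw p of 0.03 and 0.04) no longer pass the 0.05 level after adjustment.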
Odds ratios with 95% confidence intervals were visualized by forest plots 23.

CIT - chemoimmunotherapy; CR - complete response; PR - partial response; SD - stable disease; PD - progressive disease
‡ All groups were mutually compared. Only significant comparisons are shown.

Supplementary Figures
Supplementary Figure 1. Schematic visualization of the patient cohorts analyzed in the study.

Supplementary Figure 2. Sequencing of the dilution series. A) Dilution curves for individual tested variants.
For a description of the experiment see Supplementary methods above. Blue line connects medians calculated from the dilution series. Grey lines: 1% and 99% quantiles. Red curve shows 0.1% VAF. The median allelic frequency value at which the 1% quantile curve fell below 0.1% is indicated in pink. B) The variability within the groups of 10 replicates, expressed as the interquartile range divided by the median allelic frequency. X-axis: median allelic frequency (% VAF); y-axis: variability. Two groups of data, with median VAF equal to zero and first quartile equal to zero, respectively, are not shown in the graph.

Frequency of mutations is depicted as % of all variants in the high-burden and low-burden subgroups, respectively. All comparisons were statistically evaluated and no significant differences were found.

Supplementary Figure 6. Event-free survival from first-line treatment initiation. A) Stratification including del(17p). B) Stratified according to the 10% VAF threshold. C) Stratified according to the 1%, 5% and 10% VAF thresholds. D) In IGHV-mutated patients.

Supplementary Figure 7.
Overall survival from first-line treatment for A) the whole cohort stratified according to the 10% VAF threshold, and B-E) the cohort with patients receiving targeted treatment or undergoing allo-HSCT in later stages of the disease excluded from the analysis: B) Stratification including del(17p). C) Stratified according to the 10% VAF threshold. D) Stratified according to the 1%, 5% and 10% VAF thresholds. E) In IGHV-mutated patients.