Chronic lymphocytic leukemia (CLL) is characterized by a highly heterogeneous natural history and treatment response. Indeed, 50% of patients with hypermutated immunoglobulin heavy chain variable region (IgHV) genes have excellent progression-free survival (PFS) after chemoimmunotherapy. Conversely, 25% of patients treated with FCR (fludarabine, cyclophosphamide and rituximab) relapse within 24 months (high-risk CLL). Recent studies have shown that a complex karyotype, with or without TP53 disruption, predicts relapse after BCL2-targeted therapy and BTK inhibitors. However, TP53 is the only marker for which routine testing is available. Overall, nearly 80% of patients relapsing after frontline FCR do not carry a known poor-risk genomic marker. Additional candidate genomic predictors of poor outcome have been reported in CLL, including mutations in coding regions of NOTCH1, SF3B1 and RPS15, in non-coding regions of NOTCH1 and in enhancer regions of PAX5, as well as telomere length, IgHV status, and germline mutations in DNA damage repair (DDR) genes including TP53 and ATM. The roles of mutational signatures and regions of kataegis in progressive CLL also merit further investigation. Evaluating all candidate predictors requires complex, time-consuming, multi-modality testing that is outside the scope of routine clinical diagnostic practice; in isolation, however, each predictor has low predictive value. Here, we show preliminary data on a novel patient stratification method based on whole genome sequencing (WGS) data that incorporates multiple genomic features in a single test.

Patients and Methods

Tumor (peripheral blood) and germline (saliva) samples were collected from 321 patients enrolled in 6 UK trials via the Genomics England CLL pilot: ARCTIC (n=61), AdMIRe (n=64), CLL 210 (n=30), CLEAR (n=12), RIAltO (n=88) and FLAIR (n=66). WGS was performed on the HiSeq X (Illumina). After read alignment, somatic variants were detected using Strelka 2.4.7 for small variants (SNVs and indels), Manta 0.28.0 for structural variants (SVs), and Canvas 1.3.1 for copy number variants (CNVs) (all Illumina). Non-coding regions were annotated with information from primary CLL, CLL cell line and B-cell ENCODE datasets. Mutational signatures and putative regions of kataegis were identified following Alexandrov et al. (Nature, 2013) and Lawrence et al. (Nature, 2013). Telomere lengths were estimated using Telomerecat. Data aggregation was performed using contingency tables combined with non-negative matrix factorization.
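
As an illustration of how putative kataegis regions can be flagged from somatic SNV positions, the following minimal sketch scans a single chromosome for runs of consecutive mutations with a small mean inter-mutation distance. The thresholds (at least six consecutive mutations with a mean spacing of at most 1,000 bp) follow the commonly cited definition from Alexandrov et al. and are assumptions here; the function name and input format are hypothetical and do not represent the pipeline used in this study.

```python
from typing import List, Tuple


def find_kataegis(
    positions: List[int],
    min_mutations: int = 6,             # assumed threshold (Alexandrov et al.)
    max_mean_distance: float = 1000.0,  # assumed threshold, in base pairs
) -> List[Tuple[int, int, int]]:
    """Return (start, end, n_mutations) for putative foci on one chromosome."""
    if min_mutations < 2:
        raise ValueError("min_mutations must be at least 2")
    pos = sorted(positions)
    n = len(pos)
    foci = []
    i = 0
    while i <= n - min_mutations:
        j = i + min_mutations - 1
        # Mean inter-mutation distance of the window = span / number of gaps.
        if (pos[j] - pos[i]) / (j - i) <= max_mean_distance:
            # Greedily extend the focus while the mean spacing stays small.
            while j + 1 < n and (pos[j + 1] - pos[i]) / (j + 1 - i) <= max_mean_distance:
                j += 1
            foci.append((pos[i], pos[j], j - i + 1))
            i = j + 1
        else:
            i += 1
    return foci
```

In practice the function would be applied per chromosome and per sample; the greedy extension is one simple design choice among several, and segmentation-based approaches are also used.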

Results

Mean coverage was 94.2X for tumor and 28.5X for germline samples. Across the 321 patients, we found a median of 9172 SNVs/sample (after filtering) and 2348 indels/sample. High-risk CLL was enriched for genomic complexity and poor-prognosis mutations. The most frequently mutated genes were SF3B1 (17%), TP53 (13%), NOTCH1 (12%), IGLL5 (12%), and ATM (11%). Analysis of non-coding regions using DNA methylation markers, ATAC-seq and Hi-C revealed candidate regions potentially associated with early relapse. Using copy number alteration (CNA) and SV data, we identified patterns of genomic complexity and structural variation, including a trend towards enrichment of del(8p) in relapsed/refractory patients and FCR non-responders. We also investigated mutational signatures and kataegis across coding and non-coding regions of the genome. In addition, we correlated germline variants in exonic regions of DDR genes with clinical outcomes and extended this analysis to genes mutated in both tumor and germline data, termed germline-tumor double-hits.
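
The germline-tumor double-hit definition above amounts to a per-patient intersection of gene sets. The sketch below shows one way this could be expressed; the function name, the dictionary-based input format and the example patient identifiers are hypothetical, and the example gene lists are invented purely for illustration.

```python
from typing import Dict, Set


def double_hit_genes(
    germline: Dict[str, Set[str]],
    tumor: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Map patient ID -> genes carrying variants in both germline and tumor."""
    hits: Dict[str, Set[str]] = {}
    for patient, germline_genes in germline.items():
        # A patient without tumor calls contributes an empty set.
        shared = germline_genes & tumor.get(patient, set())
        if shared:
            hits[patient] = shared
    return hits


# Illustrative (made-up) input: patient IDs and gene lists are not study data.
example_germline = {"P1": {"ATM", "BRCA2"}, "P2": {"TP53"}}
example_tumor = {"P1": {"ATM", "SF3B1"}, "P2": {"NOTCH1"}}
example_hits = double_hit_genes(example_germline, example_tumor)
```

Here only the hypothetical patient P1 would be reported, with ATM as the shared gene.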

We examined the relationship between the Alexandrov hypermutation signature, IgHV status (determined by percentage homology to the reference genome) and PFS. We also combined mutational density at the Ig locus with the mutational signature, with the aim of predicting IgHV status.
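
To make the homology-based definition of IgHV status concrete, the sketch below computes percentage identity between a pre-aligned patient sequence and its reference, then applies a cutoff. The 98% cutoff is the conventional clinical threshold and is an assumption here, since the abstract does not state the value used; skipping gap columns and the function names are likewise illustrative choices.

```python
def percent_identity(query: str, reference: str) -> float:
    """Percentage identity between two pre-aligned, equal-length sequences."""
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for q, r in zip(query.upper(), reference.upper()):
        if q == "-" or r == "-":
            continue  # assumption: gap columns are excluded from the count
        compared += 1
        if q == r:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def ighv_status(identity: float, cutoff: float = 98.0) -> str:
    """Classify IgHV status; the 98% cutoff is an assumed convention."""
    return "unmutated" if identity >= cutoff else "mutated"
```

For example, a 100-nucleotide aligned sequence with three mismatches has 97% identity and would be classified as mutated under this cutoff.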

Finally, we encoded the genomic features in a binary contingency matrix and clustered the samples using non-negative matrix factorization. This approach highlighted patient groups with shared genomic profiles.
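
The clustering step can be sketched as follows, assuming a samples-by-features binary matrix and the standard Lee and Seung multiplicative-update rule for non-negative matrix factorization. The abstract does not specify which NMF algorithm, rank or assignment rule was used, so the update rule, the choice of k, and assigning each sample to its largest factor loading are all assumptions made for illustration.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def nmf(v: Matrix, k: int, n_iter: int = 500, seed: int = 0,
        eps: float = 1e-9) -> Tuple[Matrix, Matrix]:
    """Factorize v (n x m) as w (n x k) times h (k x m), all non-negative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


def assign_clusters(w: Matrix) -> List[int]:
    """Assign each sample to the factor with its largest loading (assumed rule)."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]
```

A call such as `assign_clusters(nmf(matrix, k=2)[0])` would return one cluster label per sample, where each row of `matrix` is a sample and each column a binary feature (for example, presence of a driver mutation). In practice the rank k would need to be chosen by a model-selection criterion, and results can vary with the random initialization.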

Conclusions

We present preliminary data on a patient stratification method derived from WGS of 321 paired germline and CLL trial samples. Our predictive signature includes driver gene mutations, CNAs, IgHV status, genomic complexity, telomere length, overall mutation burden and genes with germline-tumor double-hits. This comprehensive, NGS-based stratification approach aims to predict patient outcome from a single sequencing run.

Disclosures

Becq: Illumina: Employment. He: Illumina: Employment. Ross: Illumina: Employment. Bentley: Illumina: Employment. Pettitt: Celgene: Research Funding; Gilead: Research Funding; Roche: Research Funding; GSK/Novartis: Research Funding; Napp: Research Funding; AstraZeneca: Research Funding; Chugai: Research Funding. Hillmen: Novartis: Research Funding; Gilead Sciences, Inc.: Honoraria, Research Funding; Alexion Pharmaceuticals, Inc: Consultancy, Honoraria; F. Hoffmann-La Roche Ltd: Research Funding; Celgene: Research Funding; Acerta: Membership on an entity's Board of Directors or advisory committees; Abbvie: Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding; Pharmacyclics: Research Funding; Janssen: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding. Schuh: Giles, Roche, Janssen, AbbVie: Honoraria.
