Background: Diffuse large B-cell lymphoma (DLBCL) is the most common form of adult lymphoma. Although it has been investigated comprehensively, it remains a markedly heterogeneous disease. The standard of care for DLBCL is the combination chemo-immunotherapy regimen R-CHOP. The 3-year progression-free survival (PFS) rate is 70%. Relapses normally occur within 2-3 years, and treatment options for relapsed and refractory DLBCL are limited, with only 10% of patients achieving a 3-year PFS. Global mRNA expression analysis has been used to classify DLBCL into molecularly distinct subtypes (GCB and ABC) that respond remarkably differently to current therapeutics, with GCB having the better prognosis. Multiple attempts at further restratification of DLBCL cases based on next-generation sequencing have only confirmed the striking heterogeneity of the disease, providing little direction for clinical practice. Further analysis of DLBCL with the intent to stratify patients by clinical response is therefore needed.

We used whole-exome sequencing to reveal genomic alterations associated with relapse and chemotherapeutic resistance in DLBCL. The current literature has treated DLBCL as a single entity, without stratification into relapsed/refractory and de novo disease, and published studies have focused on diagnostic samples. By discovering and prioritizing somatic mutations associated with relapse and chemotherapeutic resistance, we aim to provide a novel analysis of the mutational heterogeneity of DLBCL and to identify new biomarkers for improved stratification of patients for clinical treatment.

Method: DLBCL samples from 37 cases (45 biopsies) were collected with matched normal blood (germline). Of these, 17 cases never relapsed (median observation time 3.9 years), whereas 20 cases relapsed after R-CHOP or related therapy (median observation time 4.8 years). Among the relapsed cases, 15 biopsies were taken at diagnosis and 13 at relapse, including 8 cases with serial biopsies. We performed whole-exome capture and paired-end sequencing on the Illumina HiSeq2000 platform. Reads from each sample were mapped with BWA-MEM to the human reference genome (build b37 with an added decoy contig). Duplicates were marked with Picard tools; GATK tools were used for two-step local realignment and base-quality recalibration. Somatic single nucleotide variant (SNV) detection was performed with MuTect and Strelka. Strelka alone was used for insertion/deletion (InDel) detection.
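
The abstract does not state how the SNV calls from the two callers were combined. As a minimal sketch only, assuming each caller's output has already been parsed into (chromosome, position, reference allele, alternate allele) tuples, the following partitions the calls into those made by both callers and those unique to one. The `Variant` type, the `combine_calls` function, and the example coordinates are hypothetical illustrations, not part of the study's pipeline.

```python
from typing import Dict, Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """A single-nucleotide variant keyed by its genomic coordinates and alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def combine_calls(
    mutect: Iterable[Variant], strelka: Iterable[Variant]
) -> Dict[str, Set[Variant]]:
    """Partition SNV calls by which caller(s) reported them (hypothetical helper)."""
    mutect_set, strelka_set = set(mutect), set(strelka)
    return {
        "consensus": mutect_set & strelka_set,     # reported by both callers
        "mutect_only": mutect_set - strelka_set,   # reported by the first caller only
        "strelka_only": strelka_set - mutect_set,  # reported by the second caller only
    }


# Toy coordinates for illustration only; these are not variants from the cohort.
mutect_calls = [Variant("1", 100, "A", "T"), Variant("2", 200, "G", "C")]
strelka_calls = [Variant("1", 100, "A", "T"), Variant("3", 300, "C", "G")]
partition = combine_calls(mutect_calls, strelka_calls)
```

Whether to keep only the consensus set or the union is a precision versus sensitivity trade-off that the abstract leaves unspecified.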

Results: We achieved 288X mean coverage of targeted exonic regions in tumor samples and 125X in normal blood. After application of our bioinformatics pipeline, we identified 44,297 variants in the cohort. The mean SNV burden was 944 and the mean InDel burden was 40 variants per case. In serial biopsies, the mean mutational burden was higher at relapse than at diagnosis (1001 vs. 772). After filtering out synonymous, UTR, non-coding, and low-confidence variant calls, we identified 11,743 coding variants corresponding to 6,399 genes (Figure 1). To define potential recurrent driver mutations in DLBCL associated with relapse and survival, we applied the Oncodrive algorithm (IntOgen) to the total cohort and to subgroups classified by relapse status, survival and subtype. We also applied the Oncocluster algorithm (IntOgen) to the total cohort to identify genes with significant hotspots and clustering, which may suggest potential cancer driver mutations. Using a q-value threshold of < 0.1, we identified 102 potential driver mutations. Among these, we identified many genes previously associated with DLBCL, including MLL2, B2M, CD58, MEF2B, FOXO1, TP53, PIM1, SOCS1, MYC, GNA13, SGK1, TNFAIP3, MYD88, PRDM1, CDKN2A, EZH2 and CIITA. However, as previously reported, our gene list does not extensively overlap with earlier DLBCL exome sequencing reports (Figure 2). This may be due in part to differing library preparations, sequencing methods, bioinformatic pipelines and criteria for gene selection, yet it also reflects the heterogeneity of DLBCL as a neoplastic entity. Given this underlying heterogeneity, the gene variants identified by focusing on the relapse and survival status of our cohort will provide a novel view of the DLBCL mutational landscape from the perspective of relapse and chemoresistance.
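
The q-value cutoff above implies a multiple-testing correction across genes, but the abstract does not name the method. The sketch below assumes the widely used Benjamini-Hochberg procedure, which may differ from what the Oncodrive and Oncocluster tools apply internally. The function names, gene labels, and p-values are hypothetical and serve only to show how a q < 0.1 cutoff would be applied.

```python
from typing import Dict, List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def significant_genes(gene_pvalues: Dict[str, float], threshold: float = 0.1) -> List[str]:
    """Return the genes whose q-value is strictly below the threshold."""
    genes = list(gene_pvalues)
    qvalues = benjamini_hochberg([gene_pvalues[g] for g in genes])
    return [g for g, q in zip(genes, qvalues) if q < threshold]


# Placeholder labels and p-values, not results from the study.
example = {"GENE_A": 0.01, "GENE_B": 0.04, "GENE_C": 0.03, "GENE_D": 0.5}
selected = significant_genes(example)
```

With these toy inputs, the first three placeholder genes fall below the 0.1 cutoff and the fourth does not.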


No relevant conflicts of interest to declare.
