Intensive genome sequencing efforts over the past decade have identified >100 driver genes recurrently mutated in one or more subtypes of myeloid neoplasms, which collectively account for the pathogenesis of >90% of cases. However, approximately 10% of cases have no alterations in known drivers, and their pathogenesis remains unclear. One possible explanation is the presence of alterations in non-coding regions that are not detected by conventional exome or panel sequencing; mutations and complex structural variations (SVs) affecting these regions have been shown to deregulate expression of relevant genes in a variety of solid cancers. To date, however, no study has analyzed a large cohort of myeloid malignancies using whole genome sequencing (WGS) to identify the full spectrum of non-coding alterations, even though the efficacy of this approach has been demonstrated in many solid cancers. In this study, we performed WGS in a large cohort of pan-myeloid cancers and comprehensively analyzed both coding and non-coding lesions.

Patients and methods

A total of 338 cases of myeloid malignancies, including 212 with MDS, 70 with AML, 17 with MDS/MPN, 23 with t-AML/MDS, and 16 with MPN, were analyzed by WGS, of which 173 were also analyzed by transcriptome sequencing. Tumor samples were obtained from patients' bone marrow (N=269) or peripheral blood (N=69), while normal controls were derived from buccal smears (N=263) or peripheral T cells (N=75). Targeted panel sequencing of 86 genes was also performed for all samples. Sequencing data were processed using in-house pipelines optimized for the detection of complex structural variations (SVs) and abnormalities in non-coding sequences.

Results

WGS identified a median of 586,612 single nucleotide variants (SNVs) and 124,863 short indels per genome. NMF-based decomposition of the variants disclosed three major mutational signatures, characterized by age-related C>T transitions at CpG sites (Sig. A), C>T transitions at CpT sites (Sig. B), and T>C transitions in an ApTpN context (Sig. C). Among these, Sig. C showed a prominent strand bias and corresponds to COSMIC signature 16, which has recently been implicated in alcohol consumption. Significant clustering of SNVs and short indels was interrogated either across the whole genome divided into windows of different sizes (1 kbp, 10 kbp, 100 kbp) or with the targets confined to coding exons and known regulatory regions, such as promoters, enhancers/super-enhancers, and DNase I hypersensitive sites. Recapitulating previous findings, SNVs in coding exons were significantly enriched in known drivers, including TP53, TET2, ASXL1, DNMT3A, SF3B1, RUNX1, EZH2, and STAG2. We also detected significant enrichment of SNVs in CpG islands and in promoters/enhancers. In addition, we detected a total of 8,242 SVs, with a median of 15 SVs per sample, indicating that SVs are more prevalent than expected from conventional karyotype analysis. Focal clusters of complex rearrangements compatible with chromothripsis were found in 8 cases, of which 7 carried biallelic TP53 alterations. NMF-based signature analysis of SVs revealed that large (>1 Mb) deletions, inversions, tandem duplications, and translocations clustered together and were strongly associated with TP53 mutations, while smaller deletions and tandem duplications, but not inversions, constituted another cluster. As expected, FLT3-ITD (N=15) and MLL-PTD (N=12) were among the most frequent SVs.
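The window-based clustering scan described above could be sketched as follows. This is a minimal illustration, not the in-house pipeline: the abstract does not state which statistical test was used, so the uniform Poisson background model, the Bonferroni correction, and the function names `poisson_sf` and `clustered_windows` are all assumptions, and the toy coordinates are invented for demonstration only.

```python
import math
from collections import Counter


def poisson_sf(k, lam):
    """Upper tail P(X >= k) for X ~ Poisson(lam), summed term by term."""
    if k <= 0:
        return 1.0
    if lam <= 0:
        return 0.0
    # Start from P(X = k), computed in log space to avoid overflow.
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    i = k
    # Add successive terms until they no longer change the sum.
    while term > 0.0 and (total == 0.0 or term > total * 1e-16):
        total += term
        i += 1
        term *= lam / i
    return min(1.0, total)


def clustered_windows(positions, genome_length, window, alpha=0.05):
    """Return (window_start, count, p_value) for windows with excess variants.

    Assumption: variants are uniformly distributed under the null, so the
    count in each fixed-size window is Poisson with the genome-wide mean.
    """
    n_windows = math.ceil(genome_length / window)
    counts = Counter(pos // window for pos in positions)
    lam = len(positions) / n_windows   # expected variants per window
    threshold = alpha / n_windows      # Bonferroni correction over windows
    hits = []
    for idx, count in sorted(counts.items()):
        p = poisson_sf(count, lam)
        if p < threshold:
            hits.append((idx * window, count, p))
    return hits


# Toy data (hypothetical): sparse background plus one dense cluster.
background = list(range(500, 100_000, 2_000))
cluster = list(range(42_000, 42_960, 80))
hits = clustered_windows(background + cluster, genome_length=100_000, window=1_000)

for start, count, p in hits:
    print(f"window starting at {start}: {count} variants, p = {p:.2e}")
```

In this toy run, only the window containing the artificial cluster passes the corrected threshold. Repeating the call with `window` set to 10,000 or 100,000 mirrors the multi-scale scan; a realistic analysis would additionally need a background model that accounts for regional mutation-rate variation, which a uniform Poisson rate ignores.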
Unexpectedly, in addition to known SVs associated with t(8;21) (RUNX1-RUNX1T1) (N=6) and t(3;21) (RUNX1-MECOM) (N=1), as well as non-synonymous SNVs within coding exons (N=30), we detected frequent non-coding alterations affecting RUNX1, including SVs (N=15) and SNVs around splice acceptor sites (N=5). These findings suggest that RUNX1 is affected by multiple mechanisms, with as many as 38% of RUNX1 lesions explained by non-coding alterations. Other recurrent targets of non-coding lesions included ASXL1, NF1, and ETV6.

Conclusions

WGS revealed a comprehensive catalog of genetic alterations in pan-myeloid cancers. Non-coding alterations affecting known driver genes were more common than expected, underscoring the importance of detecting non-coding abnormalities in diagnostic sequencing.

Disclosures

Nakagawa: Sumitomo Dainippon Pharma Co., Ltd.: Research Funding. Usuki: Mochida Pharmaceutical: Speakers Bureau; Astellas Pharma Inc.: Research Funding; Sanofi K.K.: Research Funding; GlaxoSmithKline K.K.: Research Funding; Otsuka Pharmaceutical Co., Ltd.: Research Funding; Kyowa Hakko Kirin Co., Ltd.: Research Funding; Daiichi Sankyo: Research Funding; Celgene Corporation: Research Funding, Speakers Bureau; SymBio Pharmaceuticals Limited.: Research Funding; Shire Japan: Research Funding; Janssen Pharmaceutical K.K.: Research Funding; Boehringer-Ingelheim Japan: Research Funding; Sumitomo Dainippon Pharma: Research Funding, Speakers Bureau; Pfizer Japan: Research Funding, Speakers Bureau; Novartis: Speakers Bureau; Nippon Shinyaku: Speakers Bureau; Chugai Pharmaceutical: Speakers Bureau; Takeda Pharmaceutical: Speakers Bureau; Ono Pharmaceutical: Speakers Bureau; MSD K.K.: Speakers Bureau. Chiba: Bristol Myers Squibb, Astellas Pharma, Kyowa Hakko Kirin: Research Funding. Miyawaki: Otsuka Pharmaceutical Co., Ltd.: Consultancy; Novartis Pharma KK: Consultancy; Astellas Pharma Inc.: Consultancy.

Author notes


Asterisk with author names denotes non-ASH members.