AML can develop after an antecedent myeloid malignancy (secondary AML [sAML]) or arise de novo (primary AML [pAML]). Genome sequencing technologies have illuminated the complexity of the genomic abnormalities that drive AML and contribute to its phenotype. Because identifying a single mutated gene or a pair of co-mutated genes is unlikely to explain how these mutations define disease biology and phenotype, an unbiased approach is needed to study the relationship of these abnormalities to each other and to AML biology.

Here, we study these associations using an unbiased approach analogous to the recommender systems used by Netflix or Amazon, in which a customer who buys A and B is likely to buy C (by analogy, the co-occurrence of mutations A, B, and C is likely to be associated with pAML or sAML).

We performed targeted mutational analysis of 468 patients with rigorously defined pAML and sAML. The association between mutations and disease phenotype was investigated with the Apriori market basket analysis algorithm. Association rule mining is a machine learning method that identifies relationships between variables in a large dataset. The clonal architecture of driver vs. subclonal mutations was evaluated using the allele fractions of point mutations in samples with 2 or more mutations and statistically significant clonal heterogeneity.
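As an illustration of the market basket idea, the following is a minimal Python sketch of an Apriori-style search, not the implementation used in this study. Each patient is treated as a "basket" containing the mutated genes plus the phenotype label. For a rule X -> Y, support is the fraction of patients carrying all items in X and Y, confidence is support(X and Y) / support(X), and lift is confidence / support(Y). The patient baskets, the function names, and the thresholds below are hypothetical and chosen only to make the example small.

```python
from itertools import combinations


def frequent_itemsets(baskets, min_support):
    """Level-wise Apriori search: return {itemset: support} for frequent itemsets."""
    n = len(baskets)
    items = sorted({item for basket in baskets for item in basket})
    current = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates of size k that reach the support threshold.
        level = {}
        for candidate in current:
            support = sum(1 for basket in baskets if candidate <= basket) / n
            if support >= min_support:
                level[candidate] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates and prune any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(subset) in level for subset in combinations(union, k - 1)
            ):
                candidates.add(union)
        current = list(candidates)
    return frequent


def rules_for(frequent, consequent, min_confidence):
    """Return (antecedent, confidence, lift) for rules antecedent -> consequent."""
    target = frozenset([consequent])
    rules = []
    for itemset, support in frequent.items():
        if consequent in itemset and len(itemset) > 1:
            antecedent = itemset - target
            confidence = support / frequent[antecedent]
            lift = confidence / frequent[target]
            if confidence >= min_confidence:
                rules.append((tuple(sorted(antecedent)), confidence, lift))
    return sorted(rules, key=lambda rule: (-rule[1], -rule[2], rule[0]))


# Hypothetical toy cohort (NOT study data): genes mutated per patient + phenotype.
toy_patients = [
    {"NPM1", "FLT3", "DNMT3A", "pAML"},
    {"NPM1", "DNMT3A", "pAML"},
    {"NPM1", "FLT3", "pAML"},
    {"RUNX1", "pAML"},
    {"ASXL1", "TET2", "SRSF2", "sAML"},
    {"ASXL1", "TET2", "U2AF1", "sAML"},
    {"TP53", "sAML"},
    {"TET2", "pAML"},
]

toy_frequent = frequent_itemsets(toy_patients, min_support=0.25)
toy_saml_rules = rules_for(toy_frequent, "sAML", min_confidence=0.8)
```

In this toy cohort, TET2 alone does not yield a high-confidence rule for sAML because it also appears in a pAML basket, whereas the combination of ASXL1 and TET2 does. This mirrors the reasoning that combinations, rather than single genes, carry the phenotype signal.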

Of the 468 patients (pts) included in the final analysis, 247 had pAML and 221 had sAML. The median age for the entire cohort was 64 years (range, 18-100) and 222 pts (47.4%) had normal karyotype (NK). Compared to pts with pAML, those with sAML were older (68 vs 60 years, P <.001), had a lower WBC count at presentation (3.85 vs 13 ×10^9/L, P <.001), were less likely to have NK cytogenetics (35.7 vs 57.9%, P <.001), and were more likely to have unfavorable cytogenetics (36.7 vs 23.9%, P=.002). The most commonly mutated genes in pAML were NPM1 (27%), DNMT3A (25%), TET2 (25%), FLT3 (ITD/TKD, 21%), and RUNX1 (10%), and in sAML were TET2 (17%), ASXL1 (16%), SRSF2 (11%), U2AF1 (11%), and DNMT3A (10%). Association rules identified mutations in NPM1, FLT3, and DNMT3A as commonly associated with each other, predominantly in pAML. DNMT3A was associated with NPM1 in 30 pts (12%) with pAML (21% with NK), with FLT3 in 10% of pAML (17% of NK and 16% with unfavorable karyotype), and occurred without NPM1 or FLT3 in 25 pts (10%). DNMT3A co-occurred with both NPM1 and FLT3 in 16 pts (6%) with pAML (11% of NK). Among pts with wild-type (WT) DNMT3A, NPM1, and FLT3 (triple negative), RUNX1 was the most commonly mutated gene, in 18 pts (7%), followed by TET2 in 13 pts (5%). Clonal architecture analysis showed that DNMT3A commonly occurred as a driver mutation while NPM1 +/- FLT3 were subclonal. In pts with triple negative disease, RUNX1 or TET2 were commonly found as driver mutations.
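The driver vs. subclonal distinction can be illustrated with a simple allele-fraction comparison. The sketch below is a simplified stand-in, not the statistical test of clonal heterogeneity used in this study: it labels the mutation(s) with the highest variant allele fraction (VAF) in a sample as dominant and any mutation whose VAF is lower by at least a fixed gap as subclonal. The function names, the `min_gap` threshold of 0.05, and the example VAFs are all hypothetical.

```python
from collections import Counter


def classify_clonal_hierarchy(vafs, min_gap=0.05):
    """Label each mutated gene in one sample as 'dominant' or 'subclonal'.

    vafs maps gene name -> variant allele fraction (0 to 1).
    min_gap is an assumed fixed cutoff, standing in for a formal
    statistical test of clonal heterogeneity.
    """
    if len(vafs) < 2:
        raise ValueError("need at least two mutations to compare allele fractions")
    top = max(vafs.values())
    return {
        gene: ("dominant" if top - vaf < min_gap else "subclonal")
        for gene, vaf in vafs.items()
    }


def dominant_gene_counts(samples, min_gap=0.05):
    """Count, across samples, how often each gene is labeled dominant."""
    counts = Counter()
    for vafs in samples:
        labels = classify_clonal_hierarchy(vafs, min_gap)
        counts.update(gene for gene, label in labels.items() if label == "dominant")
    return counts


# Hypothetical VAFs for illustration only (NOT study data).
toy_samples = [
    {"DNMT3A": 0.46, "NPM1": 0.31, "FLT3": 0.12},
    {"DNMT3A": 0.44, "NPM1": 0.42},
    {"RUNX1": 0.40, "NRAS": 0.10},
]
toy_counts = dominant_gene_counts(toy_samples)
```

A fixed gap ignores sequencing depth and copy-number changes, so an actual analysis would compare allele fractions with a statistical test, as described in the methods.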

Association rules identified the combination of ASXL1 and TET2 with one of the spliceosome mutations (SRSF2, U2AF1, or ZRSR2) as highly specific for the sAML phenotype (6% of pts with sAML). Among pts with sAML who did not have any of these mutations, TP53 was mutated in 16 pts (7%), DNMT3A in 14 pts (6%; commonly with wild-type NPM1 and FLT3), and NRAS in 11 pts (5%). In sAML, 22 pts had TP53 mutations, 18 of whom (82%) had an unfavorable karyotype. Clonal architecture analysis showed that ASXL1 and TET2 were commonly co-mutated in the driver clone along with the spliceosome mutations.

Association rules can identify combinations of genomic abnormalities that define AML phenotype. This study also shows that defining AML phenotype by mutations in one or two genes understates the complexity of genomic abnormalities in AML. Unbiased analysis of genomic combinations using the recommender system approach may help to make sense of the complexity of genomic information in AML.


Sekeres: Celgene: Membership on an entity's Board of Directors or advisory committees. Advani: Pfizer: Consultancy; Takeda/Millennium: Research Funding. Gerds: CTI BioPharma: Consultancy; Incyte: Consultancy.
