Fusion genes (FGs) are major molecular abnormalities in acute leukemia and serve as molecular markers for the diagnosis, classification, risk stratification and targeted therapy of leukemia. We previously reported that common FGs were present in approximately 29% of acute lymphoblastic leukemia (ALL) cases (Chen X et al., Leuk Res 2018). The rapid development of sequencing technology and falling sequencing costs in recent years have made whole transcriptome sequencing (WTS) more accessible. WTS can not only detect known FGs but also has unique advantages in identifying previously unknown, rare and variant FGs. We aimed to identify novel fusion transcripts with clinical relevance and to delineate a comprehensive map of FGs in ALL using WTS in a large cohort.


We studied 350 consecutively diagnosed ALL patients using WTS, including 285 cases of B-ALL and 65 cases of T-ALL. Fifty normal bone marrow (BM) samples from healthy donors served as controls. Written informed consent was obtained from all patients and healthy donors or their guardians in accordance with the Declaration of Helsinki.

WTS was performed on RNA extracted from the BM samples using the HiSeq 2500 platform. Reads were mapped and processed with Arriba (v1.0.1) to detect gene fusions. Only high-confidence in-frame fusions were retained. We classified FGs using a four-tier system as follows: (A) pathogenic: well-known FGs or new members of common fusion gene families (FG-FMs) with definite pathogenicity in hematological malignancies or other tumors; (B) likely pathogenic: rarely reported FGs or new members of rare FG-FMs in hematological malignancies or other tumors, without functional verification; (C) uncertain significance: novel FGs in which neither partner gene has been reported in tumors; (D) non-pathogenic: FGs detected in normal samples.
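The four-tier rule can be sketched in code to make its decision order explicit. This is a minimal illustration, not the pipeline used in the study: the `Fusion` class, the `assign_tier` function, the reference sets and the placeholder gene names are all hypothetical, and the precedence of the checks (normal-sample detection first, then tier A, then tier B) is an assumption that the text does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fusion:
    gene5: str  # 5' partner gene (assumed naming convention: 5'-3')
    gene3: str  # 3' partner gene

    @property
    def name(self) -> str:
        return f"{self.gene5}-{self.gene3}"


def assign_tier(fusion, *, seen_in_normals, known_pathogenic,
                common_family_genes, rarely_reported, rare_family_genes):
    """Return 'A', 'B', 'C' or 'D' for one fusion (hypothetical rule order)."""
    genes = {fusion.gene5, fusion.gene3}
    # Assumption: detection in normal samples overrides every other criterion.
    if fusion.name in seen_in_normals:
        return "D"
    # Tier A: well-known fusion, or a partner from a common fusion gene family.
    if fusion.name in known_pathogenic or genes & common_family_genes:
        return "A"
    # Tier B: rarely reported fusion, or a partner from a rare fusion gene family.
    if fusion.name in rarely_reported or genes & rare_family_genes:
        return "B"
    # Tier C: novel fusion with neither gene previously reported in tumors.
    return "C"


# Illustrative reference sets only; placeholder names, not the study's curated lists.
example_refs = dict(
    seen_in_normals={"GENE_N1-GENE_N2"},
    known_pathogenic={"BCR-ABL1"},
    common_family_genes={"ABL1", "KMT2A"},
    rarely_reported={"GENE_R1-GENE_R2"},
    rare_family_genes={"GENE_R3"},
)

print(assign_tier(Fusion("BCR", "ABL1"), **example_refs))      # A
print(assign_tier(Fusion("GENE_X", "GENE_Y"), **example_refs))  # C
```

In practice the reference sets would come from curated literature and database review, which is the step that requires expert judgment and cannot be reduced to a lookup.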


Our analysis identified 309 high-confidence in-frame FGs in the 350 ALL cases. We classified these FGs into four tiers based on pathogenicity, and the 198 tier A and 48 tier B FGs were included in the final FG list for further analysis (Figure 1).

The 246 tier A and tier B FGs were identified in 208 (59%) samples (mean, 1.2 per positive sample) and comprised 115 distinct FGs. Tier A FGs were detected in 184 (53%) cases, while 24 (7%) cases had tier B FGs without tier A FGs. The remaining 142 (41%) cases had neither tier A nor tier B FGs. We identified 33 cases in which at least two different FGs co-existed, accounting for 9% of all cases enrolled in this study and 16% of all positive cases (Figure 2). Multiplex-nested RT-PCR designed to detect 41 common FGs (all belonging to tier A) was performed in all 350 cases, and only 106 (30%) cases were positive.
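As a quick consistency check, the proportions reported above can be recomputed from the raw counts. The counts are taken from the text; the variable names and the `pct` helper are illustrative only.

```python
# Counts as reported in the text.
total_cases = 350
positive_cases = 208       # cases with at least one tier A or tier B FG
tier_a_cases = 184
tier_b_only_cases = 24     # tier B FGs without tier A FGs
negative_cases = 142
multi_fg_cases = 33        # at least two different FGs in one case
rt_pcr_positive = 106
tier_a_fgs, tier_b_fgs = 198, 48
total_fgs = tier_a_fgs + tier_b_fgs


def pct(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# The case groups partition the cohort.
assert tier_a_cases + tier_b_only_cases == positive_cases
assert positive_cases + negative_cases == total_cases

print(total_fgs)                                # 246
print(pct(positive_cases, total_cases))         # 59
print(pct(tier_a_cases, total_cases))           # 53
print(pct(tier_b_only_cases, total_cases))      # 7
print(pct(negative_cases, total_cases))         # 41
print(pct(multi_fg_cases, total_cases))         # 9
print(pct(multi_fg_cases, positive_cases))      # 16
print(pct(rt_pcr_positive, total_cases))        # 30
print(round(total_fgs / positive_cases, 1))     # 1.2
```

Note that the mean of 1.2 FGs per sample is obtained by dividing by the 208 positive cases, not by the full cohort of 350.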

We found 22 recurrent FGs, each occurring at least twice, including 20 tier A and 2 tier B FGs. The 2 tier B FGs (ZNF292-PNRC1 and C21orf33-ZADH2) were detected in 7 and 2 cases, respectively, and have not been previously reported. Furthermore, we classified the 115 distinct FGs found in the 208 cases according to FG-FMs, defined as groups of FGs that share one protagonist gene with multiple fusion partners. More than half (53%) of the distinct FGs could be classified into 17 FG-FMs, such as ABL1-FM, ETV6-FM, ZNF384-FM, KMT2A-FM, MEF2D-FM, PAX5-FM and TCF3-FM. The other 54 distinct FGs, such as P2RY8-CRLF2, CBFA2T3-SLC7A5 and C21orf33-ZADH2, could not be classified into any family. Most (94%) of the FGs that could not be clustered into FG-FMs occurred only once. Overall, 76% of the 246 tier A and tier B FGs could be classified into FG-FMs; the remaining 24% mainly belonged to tier B and rarely recurred in different samples. Among tier A FGs, 95% could be clustered into FG-FMs, and only 5% could not be classified into any FG-FM.
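The family-based grouping described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the `group_by_family` function is hypothetical, the protagonist list contains only the seven families named in the text (out of 17), and the family members in the example input are well-known fusions chosen for illustration rather than the cohort's actual calls. Only the three unclassified examples come from the text.

```python
from collections import defaultdict

# The seven protagonist genes named in the text; the full list of 17 is not given.
PROTAGONISTS = ["ABL1", "ETV6", "ZNF384", "KMT2A", "MEF2D", "PAX5", "TCF3"]


def group_by_family(fusions, protagonists=PROTAGONISTS):
    """Split fusion names into {family: [fusions]} and a list of unclassified fusions."""
    families = defaultdict(list)
    unclassified = []
    for name in fusions:
        genes = name.split("-")
        # Assumption: a fusion joins the family of the first listed protagonist it contains.
        hit = next((g for g in protagonists if g in genes), None)
        if hit is None:
            unclassified.append(name)
        else:
            families[f"{hit}-FM"].append(name)
    return dict(families), unclassified


# Illustrative input: family members are examples, not the study's data.
example_fusions = [
    "BCR-ABL1", "NUP214-ABL1", "ETV6-RUNX1", "TCF3-PBX1", "TCF3-HLF",
    "P2RY8-CRLF2", "CBFA2T3-SLC7A5", "C21orf33-ZADH2",
]
families, unclassified = group_by_family(example_fusions)
print(families)
print(unclassified)

# Consistency check on the reported figures: 115 distinct FGs, 54 unclassified.
classified_distinct = 115 - 54
print(classified_distinct, round(100 * classified_distinct / 115))  # 61 53
```

A fusion that contains two protagonist genes would need an explicit tie-breaking rule; the sketch simply takes the first match in list order.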


We described the map of FGs detected in a large cohort of ALL and revealed FGs with clinical relevance that have not been previously recognized. Classifying FGs according to FG-FMs can improve understanding of their pathological significance and may suggest new classification patterns of acute leukemia. WTS is a valuable tool and should be widely used in the routine diagnostic workup of ALL.


No relevant conflicts of interest to declare.
