Abstract

To determine whether gene expression profiling could improve outcome prediction in children with acute lymphoblastic leukemia (ALL) at high risk for relapse, we profiled pretreatment leukemic cells in 207 uniformly treated children with high-risk B-precursor ALL. A 38-gene expression classifier predictive of relapse-free survival (RFS) could distinguish 2 groups with differing relapse risks: low (4-year RFS, 81%, n = 109) versus high (4-year RFS, 50%, n = 98; P < .001). In multivariate analysis, the gene expression classifier (P = .001) and flow cytometric measures of minimal residual disease (MRD; P = .001) each provided independent prognostic information. Together, they could be used to classify children with high-risk ALL into low- (87% RFS), intermediate- (62% RFS), or high- (29% RFS) risk groups (P < .001). A 21-gene expression classifier predictive of end-induction MRD effectively substituted for flow MRD, yielding a combined classifier that could distinguish these 3 risk groups at diagnosis (P < .001). These classifiers were further validated on an independent high-risk ALL cohort (P = .006) and retained independent prognostic significance (P < .001) in the presence of other recently described poor prognostic factors (IKAROS/IKZF1 deletions, JAK mutations, and kinase expression signatures). Thus, gene expression classifiers improve ALL risk classification and allow prospective identification of children who respond to or fail current treatment regimens. These trials were registered at http://clinicaltrials.gov under NCT00005603.

Introduction

Through the optimization and progressive intensification of standard chemotherapeutic regimens, remarkable advances have been achieved in the treatment of pediatric acute lymphoblastic leukemia (ALL).1-3  In parallel, laboratory investigations have provided remarkable insights into the biologic and genetic heterogeneity of this disease with the characterization of several recurring genetic abnormalities (hyperdiploidy, hypodiploidy, t[12;21][ETV6-RUNX1], t[1;19][TCF3-PBX1], t[9;22][BCR-ABL1], and translocations involving 11q23[MLL]) that are associated with distinct therapeutic outcomes and clinical phenotypes.2  Detailed risk classification schemes, incorporating pretreatment clinical characteristics (such as age, sex, and presenting white blood cell [WBC] count), the presence or absence of recurring cytogenetic abnormalities, and measures of minimal residual disease (MRD) at the end of induction therapy, are now used to tailor the intensity of therapy to a child's relative relapse risk (categorized as low, standard/intermediate, high, or very high).4-6  Yet, despite refinements in risk classification and improvements in overall survival, the second most common cause of cancer-related mortality in children in the United States remains relapsed ALL.7  Whereas relapses are more frequent in children with very high-risk disease, associated with BCR-ABL1 or hypodiploidy, relapses occur within all currently defined risk groups.1,7  Indeed, the majority of relapses occur in children initially assigned to the standard/intermediate- or high-risk categories.7  Thus, a primary challenge in pediatric ALL is to prospectively identify those children with higher-risk disease who do not benefit from therapeutic intensification and who require the development of new therapies for cure.7 

In this study, we determined whether gene expression profiling could be used to improve risk classification and outcome prediction in high-risk pediatric ALL, a risk category largely defined by pretreatment clinical characteristics (age > 10 years and presenting WBC > 50 000/μL) and the absence of genetic abnormalities associated with low (hyperdiploidy, t[12;21][ETV6-RUNX1]) or very high (hypodiploidy, t[9;22][BCR-ABL1]) risk disease.4  More than 25% of children diagnosed with ALL are initially classified as high-risk. Outcomes in this form of ALL remain poor with high rates of relapse and relapse-free survival (RFS) of only 45% to 60%.7  Furthermore, the underlying genetic features associated with this form of ALL have not been well characterized. Thus, gene expression profiling and other comprehensive genomic technologies, such as assessment of genome copy number abnormalities or DNA sequencing, have the potential to resolve the underlying genetic heterogeneity of this form of ALL and to capture genetic differences that impact treatment response that can be exploited for improved risk classification and the identification of novel therapeutic targets.8-15 

From the gene expression profiles obtained in the pretreatment leukemic cells of 207 uniformly treated children with high-risk ALL, we used supervised learning algorithms and extensive cross-validation techniques to build a 42-probe-set (38-gene) expression classifier predictive of RFS. In multivariate analysis, the best predictive model for RFS was this gene expression classifier combined with either flow cytometric measures of MRD determined at the end of induction therapy (day 29), or, a 23-probe-set (21-gene) molecular classifier derived from pretreatment samples that could predict levels of end-induction flow MRD at initial diagnosis. The application of these classifiers separated children with high-risk ALL into 3 distinct risk groups with significantly different survivals in the initial patient cohort used for modeling and in a second independent cohort of high-risk ALL patients used for validation. The gene expression classifier for RFS alone and combined with flow MRD also retained independent prognostic significance in the presence of other genetic abnormalities (IKAROS/IKZF1 deletions,16 JAK mutations,17  and gene expression signatures reflective of activated tyrosine kinases16,18 ) that we and others have recently discovered and determined to be associated with a poor outcome in pediatric ALL. Thus, gene expression classifiers significantly enhance outcome prediction and risk classification in high-risk ALL and, in particular, identify a group of children most likely to fail current therapeutic approaches and for whom novel therapies must be developed for cure.

Methods

Patient selection

Patient samples and clinical and outcome data for this study were obtained from the Children's Oncology Group (COG) Clinical Trial P9906. COG P9906 enrolled 272 eligible high-risk B-precursor ALL patients between March 15, 2000 and April 25, 2003; all patients were treated uniformly with a modified augmented Berlin-Frankfurt-Münster Study Group (BFM) regimen.6,19  This trial targeted a subset of newly diagnosed high-risk ALL patients who had experienced a poor outcome (44% RFS at 4 years) in prior studies.5,20  Patients with central nervous system disease or testicular leukemia were eligible for the trial regardless of age or WBC count at diagnosis. Patients with very high-risk features (BCR-ABL1 or hypodiploidy) were excluded, whereas those with low-risk features (trisomies of chromosomes 4 or 10; t[12;21][ETV6-RUNX1]) were excluded unless they had central nervous system disease or testicular leukemia. The majority of patients had MRD assessed by flow cytometry, as previously described; cases were defined as MRD positive or MRD negative at the end of induction therapy (day 29) using a threshold of 0.01%.6  For this study, previously cryopreserved residual pretreatment leukemia specimens were available on a representative cohort of 207 of the 272 (76%) registered patients. With the exception of differences in presenting WBC count, these 207 patients were highly similar in all other clinical and outcome parameters to all 272 patients accrued to this trial (see supplemental Table 1, available on the Blood website; see the Supplemental Materials link at the top of the online article). For validation of the performance of the classifiers, an independent set of 84 children with high-risk ALL, previously treated on COG Trial 1961, was used as a validation cohort14  (supplemental Section 2 provides detailed patient characteristics of the validation cohort). 
Treatment protocols were approved by the National Cancer Institute and participating institutions through their institutional review boards. Informed consent for clinical trial registration, sample submission, and participation in these research studies was obtained from all patients or their guardians in accordance with the Declaration of Helsinki.

Microarray analyses

RNA was purified from 207 pretreatment diagnostic samples with more than 80% blasts (131 bone marrow, 76 peripheral blood) and hybridized to HG_U133A_Plus2.0 oligonucleotide microarrays (Affymetrix) after RNA quantification, cDNA preparation, and labeling (supplemental Section 3). Signals were scanned (Affymetrix GeneChip Scanner) and analyzed with Affymetrix Microarray Suite (MAS 5.0). The expression signal matrix used for outcome analyses corresponded to a filtered list of 23 775 probe sets (supplemental Section 4). This gene expression dataset may be accessed via the National Cancer Institute caArray site (https://array.nci.nih.gov/caarray/) or at Gene Expression Omnibus (http://www.ncbi.nih.gov/geo) under accession number GSE11877.

Statistical analyses

RFS was calculated from the date of trial enrollment to either the date of first event (relapse) or last follow-up. Patients in clinical remission, or with a second malignancy, or with a toxic death as a first event were censored at the date of last contact. As described in detail in supplemental Sections 4C and 5 to 9, a Cox score was used to rank genes based on their association with RFS, and a Cox proportional hazards model–based supervised principal components analysis21  was used to build the gene expression classifier for RFS from the rank-ordered gene list. Similarly, for the development of the gene expression classifier predictive of end-induction MRD, a modified t test was used to rank genes expressed in pretreatment cells according to their association with day 29 flow MRD, defined as positive or negative at a threshold of 0.01%.6  Diagonal linear discriminant analysis22,23  was then used to build a prediction model and the classifier for MRD from the top-ranked genes. The likelihood-ratio test (LRT) score and the prediction error rate were used in the model construction and evaluation. To avoid overfitting, extensive cross-validation was used to determine the numbers of top-ranked genes to be included.23  Nested cross-validations provided predictions for individual cases as well as overall measures of the selected models' performance.22,23 
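As an illustration only, the diagonal linear discriminant analysis step described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the code used in the study: the function names `fit_dlda` and `predict_dlda` and the toy expression values are hypothetical, and the gene ranking by modified t test and the cross-validated choice of gene number are omitted.

```python
from math import log

def fit_dlda(samples, labels):
    """Fit diagonal LDA: per-class feature means plus one pooled variance per feature."""
    classes = sorted(set(labels))
    n_features = len(samples[0])
    means, counts = {}, {}
    for c in classes:
        rows = [x for x, y in zip(samples, labels) if y == c]
        counts[c] = len(rows)
        means[c] = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
    # Pooled within-class variance, estimated separately for each feature;
    # covariances between features are ignored, hence "diagonal".
    dof = len(samples) - len(classes)
    pooled = []
    for j in range(n_features):
        ss = sum((x[j] - means[y][j]) ** 2 for x, y in zip(samples, labels))
        pooled.append(ss / dof)
    priors = {c: counts[c] / len(samples) for c in classes}
    return means, pooled, priors

def predict_dlda(model, x):
    """Assign x to the class with the largest diagonal discriminant score."""
    means, pooled, priors = model
    def score(c):
        dist = sum((xj - mj) ** 2 / vj for xj, mj, vj in zip(x, means[c], pooled))
        return -0.5 * dist + log(priors[c])
    return max(means, key=score)

# Toy data (invented for illustration): 2 probe sets, 3 MRD-negative and 3 MRD-positive cases.
toy_samples = [[1.0, 5.0], [1.2, 4.8], [0.8, 5.1], [3.0, 2.0], [3.1, 2.2], [2.9, 1.9]]
toy_labels = ["neg", "neg", "neg", "pos", "pos", "pos"]
toy_model = fit_dlda(toy_samples, toy_labels)
pred_a = predict_dlda(toy_model, [1.1, 5.0])
pred_b = predict_dlda(toy_model, [3.0, 2.1])
```

Because each feature contributes independently to the score, the model has only one mean per class and one variance per probe set to estimate, which is one reason such classifiers are often preferred when the number of genes far exceeds the number of patients.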

For the first multivariate analysis, which tested the predictive power of the gene expression classifier for RFS relative to flow cytometric measures of MRD and to other clinical and genetic variables, a multivariate Cox proportional hazards regression analysis was performed with the risk score (determined by the gene expression classifier for RFS), WBC (on a log scale), and flow cytometric measures of MRD as explanatory variables. The LRT was performed to determine whether the risk score defined by the gene expression classifier for RFS was a significant predictor of time to relapse, adjusting for WBC and MRD.
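The logic of the LRT can be sketched in a few lines. The sketch assumes that the two nested Cox models (with and without the risk score) have already been fitted and that they differ by a single parameter; the function name and the log partial likelihood values are hypothetical and are not taken from the study.

```python
from math import erfc, sqrt

def likelihood_ratio_test_1df(loglik_reduced, loglik_full):
    """LRT for two nested models that differ by exactly one parameter."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    if stat < 0:
        raise ValueError("the full model cannot fit worse than the nested reduced model")
    # Survival function of a chi-square variable with 1 degree of freedom:
    # P(X > s) = erfc(sqrt(s / 2)).
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical log partial likelihoods: reduced model (WBC + MRD) vs full model (+ risk score).
lrt_stat, lrt_p = likelihood_ratio_test_1df(-100.0, -98.0)
```

With more than one added covariate, the statistic would instead be compared against a chi-square distribution with degrees of freedom equal to the number of added parameters.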

To determine whether the gene expression classifier for RFS and the combined classifier (with flow cytometric measures of MRD) retained prognostic importance in the presence of new ALL-associated genetic abnormalities associated with a poor outcome that we and others have recently described, we accessed our recently published data reporting IKZF1/IKAROS deletions16  and JAK mutations17  in ALL, as these studies were performed using DNA samples from the same cohort of patients with high-risk ALL (COG P9906) reported in this study. The primary DNA copy number variation data reporting IKZF1 deletions16  may be accessed at http://target.cancer.gov/data. The JAK mutation data17  may be accessed at http://www.pnas.org/content/suppl/2009/05/22/0811761106.DCSupplemental/0811761106SI.pdf. A multivariate Cox proportional hazards regression analysis was performed with each expression classifier and included IKZF1/IKAROS deletions, JAK mutations, and kinase gene expression signatures as additional explanatory variables. An LRT was then performed to determine whether the classifiers retained independent prognostic significance after adjusting for the effects of all covariates. All statistical analyses used Stata Version 9 and R.

Results

Patients and clinical risk factors

The median age of the 207 high-risk B-precursor ALL patients registered to COG Trial P9906 was 13 years (range: 1-20 years; Table 1). Whereas 23 of the 207 ALL patients had a t(1;19)(TCF3-PBX1) and 21 had various translocations involving MLL, the remaining 163 high-risk cases had no other known recurring cytogenetic abnormalities (Table 1). RFS in these 207 patients was 66.3% at 4 years (95% CI, 59%-73%; Figure 1A). Day 29 MRD, measured using flow cytometric techniques (end-induction flow MRD), was detected in 35% (67 of 191) of cases (Table 1).6  Among pretreatment clinical variables (age, sex, and central nervous system involvement), the presence of recurrent cytogenetic abnormalities (TCF3-PBX1 and MLL), and measures of MRD, only end-induction flow MRD and increasing WBC count were significantly associated with decreased RFS, and both retained significance in multivariate analysis (LRT based on Cox regression, P < .001; Table 1). A trend toward declining RFS was also observed among the 25% of children with Hispanic/Latino ethnicity (P = .049; Table 1).
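For readers less familiar with how a figure such as a 4-year RFS is obtained from censored follow-up, the Kaplan-Meier product-limit estimate can be sketched as below. The function name and the toy follow-up times are hypothetical; the study's own estimates were produced with Stata and R, not with this code.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct relapse time.

    times  -- follow-up time for each patient
    events -- True if the patient relapsed at that time, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        relapses = 0  # relapses observed exactly at time t
        leaving = 0   # patients leaving the risk set at time t (relapsed or censored)
        while i < len(data) and data[i][0] == t:
            relapses += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if relapses > 0:
            survival *= 1.0 - relapses / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Toy follow-up in years (invented): 3 relapses, 2 censored observations.
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [True, False, True, False, True])
```

Censored patients never lower the curve directly; they only shrink the risk set used as the denominator at later relapse times.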

Table 1

Association of RFS with clinical and genetic features in the high-risk ALL cohort

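| Characteristic | Value | Hazard ratio for RFS* | P* |
| --- | --- | --- | --- |
| Age, y | | | |
| 10 or older | 132 | | |
| Younger than 10 | 75 | 1.152 | .561 |
| Median | 13 | | |
| Range | 1-20 | 0.995 | .817 |
| Sex | | | |
| Male | 137 | | |
| Female | 70 | 0.769 | .320 |
| WBC, K/μL | | | |
| Median | 62.3 | | |
| Range | 1-959 | 1.003 | < .001 |
| MRD at day 29 | | | |
| Negative | 124 | | |
| Positive | 67 | 2.805 | < .001 |
| Race | | | |
| Hispanic or Latino | 51 | 1.644 | .049 |
| Other | 156 | | |
| MLL | | | |
| Positive | 21 | 1.061 | .881 |
| Negative | 186 | | |
| E2A/PBX1 | | | |
| Positive | 23 | 0.704 | .409 |
| Negative | 184 | | |
| CNS | | | |
| No blasts | 160 | | |
| Fewer than 5 blasts | 26 | 1.078 | .826 |
| At least 5 blasts | 21 | 0.670 | .392 |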

CNS indicates central nervous system.

* Hazard ratio and corresponding P value are based on Cox regression.

Only 191 of 207 patients in the high-risk ALL cohort had flow MRD results at end induction.

Figure 1

Performance of the 42-probe-set (38-gene) gene expression classifier for prediction of RFS. (A-B) Kaplan-Meier survival estimates of RFS in the full cohort of 207 patients (A) and in the low- versus high-risk groups distinguished with the gene expression classifier for RFS (B). HR is the hazard ratio estimated using Cox regression. (C) A gene expression heatmap is shown with the rows representing the 42 probe sets (containing 38 unique genes) composing the gene expression classifier for RFS. The columns represent patient samples sorted from left to right by time to relapse or last follow-up. Red indicates high expression relative to the mean; green, low expression relative to the mean; R, relapse; and C, continuous remission.

A gene expression classifier predictive of survival

Gene expression profiles were obtained from pretreatment leukemic samples in each of the 207 high-risk ALL patients. To develop a gene expression–based classifier predictive of RFS, each of the 23 775 informative probe sets on the gene expression microarrays was ranked based on strength of association with RFS (Cox score).21  As detailed in supplemental Sections 4C, 5, and 8, a Cox proportional hazards model–based supervised principal component analysis was used to build the expression classifier for RFS, which was optimized by performing 20 iterations of 5-fold cross-validation.21  The final model incorporated the top 42 Affymetrix microarray probe sets corresponding to 38 unique genes (see supplemental Table 4 for the gene list; false discovery rate = 8.45%, significance analysis of microarrays [SAM]).24  The predicted gene expression classifier–based risk score for relapse for a given patient was computed via nested leave-one-out cross-validation (LOOCV) over the full model-building procedure (supplemental Sections 5 and 8). With a threshold of zero, the gene expression classifier–derived risk scores significantly separated the 207 high-risk ALL patients into low (4-year RFS, 81%; 95% CI, 72%-87%; n = 109) versus high (4-year RFS, 50%; 95% CI, 39%-60%; n = 98) risk groups (Figure 1B-C). Increased expression of BMPR1B, CTGF (CCN2), TTYH2, IGJ, NT5E (CD73), CDC42EP3, and TSPAN7, and decreased expression of NR4A3 (NOR-1), RGS1-2, and BTG3 were observed in the high gene expression risk group with the poorest outcome (Figure 1C). In a multivariate Cox regression analysis, the LRT revealed that the gene expression classifier for RFS provided significant independent information for outcome prediction, even after adjusting for flow MRD and WBC count (P = .001).
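The essential point of the nested leave-one-out procedure is that the held-out case plays no part in any step of model building, including gene ranking. The sketch below illustrates that structure only. `toy_build_model` is a hypothetical stand-in (a single-feature midpoint rule), not the supervised principal components method used in the study, and treating scores above zero as high risk is an assumption about the direction of the score.

```python
def nested_loocv_scores(samples, outcomes, build_model):
    """Score each case with a model rebuilt from scratch without that case."""
    scores = []
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = outcomes[:i] + outcomes[i + 1:]
        # build_model must repeat the full procedure (ranking, tuning, fitting).
        score_fn = build_model(train_x, train_y)
        scores.append(score_fn(samples[i]))
    return scores

def risk_group(score, threshold=0.0):
    """Assumed convention: scores above the threshold indicate high risk."""
    return "high" if score > threshold else "low"

def toy_build_model(train_x, train_y):
    """Hypothetical stand-in: pick the feature whose class means differ most."""
    n_features = len(train_x[0])
    def class_mean(j, label):
        vals = [x[j] for x, y in zip(train_x, train_y) if y == label]
        return sum(vals) / len(vals)
    diffs = [class_mean(j, 1) - class_mean(j, 0) for j in range(n_features)]
    best = max(range(n_features), key=lambda j: abs(diffs[j]))
    midpoint = (class_mean(best, 1) + class_mean(best, 0)) / 2.0
    sign = 1.0 if diffs[best] >= 0 else -1.0
    return lambda x: sign * (x[best] - midpoint)

# Toy data (invented): outcome 1 = relapse, 0 = continuous remission.
toy_x = [[2.0, 0.1], [1.8, 0.3], [2.2, 0.2], [0.5, 0.2], [0.4, 0.1], [0.6, 0.3]]
toy_y = [1, 1, 1, 0, 0, 0]
toy_scores = nested_loocv_scores(toy_x, toy_y, toy_build_model)
toy_groups = [risk_group(s) for s in toy_scores]
```

Selecting genes on the full cohort and cross-validating only the final fit would leak outcome information into the held-out predictions, which is the bias this design avoids.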

Improving risk classification and outcome prediction by combining the gene expression classifier and flow cytometric measures of MRD

Flow cytometric measures of MRD (flow MRD), measured at the end of induction therapy (day 29), were also capable of distinguishing 2 groups of patients with significantly different outcomes within the high-risk ALL cohort (Figure 2A).6  However, the independent prognostic impact of the gene expression–based classifier for RFS could further split both the flow MRD-negative patients (Figure 2B) and flow MRD-positive patients (Figure 2C) into 2 distinct patient groups with significantly different RFS (P = .001 and P = .005, respectively). It was particularly striking that the application of the gene expression classifier to the flow MRD-negative patients (Figure 2B) distinguished a group of high-risk ALL patients who did extremely well in the COG P9906 clinical trial (87% RFS at 4 years; 95% CI, 77%-93%). Similarly, applying the gene expression classifier to the flow MRD-positive patients distinguished a group of patients who did relatively well (68% RFS at 4 years; 95% CI, 47%-82%) from those who had an extremely poor outcome (Figure 2C). As both the gene expression classifier for RFS and flow MRD provided independent prognostic information in a multivariate Cox regression analysis (each P = .001), we built a combined risk classifier using these 2 variables; this combined classifier was capable of distinguishing 4 distinct prognostic groups within this cohort of high-risk ALL patients (Figure 2D). The 72 patients in the lowest risk group (38% of cases in the cohort; Table 2), who had low-risk gene expression classifier scores and negative end-induction flow MRD, showed significantly better RFS than the other groups (P < .001). Whereas all 20 cases with a t(1;19)(TCF3-PBX1) were contained within this lowest risk group (Figure 2D-E), it is of interest that another 52 patients lacking known recurring cytogenetic abnormalities were also assigned to this risk group (Table 2). 
Similarly, the 38 patients in the highest risk group (20% of cohort), who had high gene expression classifier risk scores and positive end-induction flow MRD, displayed significantly worse RFS (29% RFS at 4 years; 95% CI, 14%-46%, which continued to decline at 5 years; P < .001; Figures 2C-E; Table 2). No significant survival differences (P = .57) were observed among those with discordant predictors, either those patients with low gene expression classifier risk scores and positive end-induction flow MRD (28 of 191, 15% of cohort) or those with high gene expression classifier risk scores and negative end-induction flow MRD (52 of 191, 27% of cohort). These 2 groups were thus combined into an intermediate-risk group (Figure 2E). Figure 2E provides the Kaplan-Meier survival estimates for the 3 groups defined by the combined classifier and highlights the significant differences in RFS. These 3 risk groups varied significantly in age and in the presence of the known recurring cytogenetic abnormalities (Table 2). Whereas the 17 patients with MLL translocations were distributed within the low- and intermediate-risk groups, all 20 cases with t(1;19)(TCF3-PBX1) were in the lowest risk group, as discussed above (Table 2; Figure 2E). Interestingly, of the 8 relapses that occurred in the lowest risk group, all 8 were ALL cases with t(1;19)(TCF3-PBX1). Children in each of the 3 risk groups had similar proportions of relapse within the bone marrow or isolated to the central nervous system (Table 2).
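The combination rule described above reduces to a small decision table: concordant low-risk results give the low group, concordant high-risk results give the high group, and either discordant pairing is assigned to the intermediate group. A minimal sketch follows; the function name and the string labels are hypothetical.

```python
def combined_risk_group(expression_group, mrd_positive):
    """Combine the expression classifier call with end-induction flow MRD status.

    expression_group -- "low" or "high", from the gene expression classifier for RFS
    mrd_positive     -- True if day 29 flow MRD is positive
    """
    if expression_group not in ("low", "high"):
        raise ValueError("expression_group must be 'low' or 'high'")
    high_expression = expression_group == "high"
    if high_expression and mrd_positive:
        return "high"
    if not high_expression and not mrd_positive:
        return "low"
    # The two discordant combinations are merged into one group.
    return "intermediate"

all_combinations = {
    (group, mrd): combined_risk_group(group, mrd)
    for group in ("low", "high")
    for mrd in (False, True)
}
```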

Figure 2

Kaplan-Meier estimates of RFS based on the gene expression classifier for RFS and end-induction (day 29) MRD. (A) Day 29 flow cytometric measures of MRD separated patients into 2 groups with significantly different RFS. (B-C) After dividing patients by their end-induction flow MRD status, an independent effect of the gene expression classifier for RFS is observed among both the flow MRD-negative (< 0.01% blasts; B) and flow MRD-positive (> 0.01% blasts; C) patients. (D-E) Combining the risk scores determined from the gene expression classifier and flow MRD yields 4 distinct outcome groups (D); the 2 discordant groups show no significant difference in RFS (P = .572) and are therefore collapsed into an intermediate-risk group for RFS prediction (E). (E) The hazard ratios (HR) and corresponding P values are based on the Cox regression (intermediate-risk vs low-risk, HR = 3.73, P = .001; high-risk vs intermediate-risk, HR = 2.27, P = .002). The P value reported in the lower left corner corresponds to the test for differences among all groups.

Table 2

Clinical and genetic features of the three risk groups determined by the combined application of the gene expression classifier for RFS and flow cytometric measures of MRD

Columns: Characteristic; combined risk group (Low, Intermediate, High); Total cohort; P (Fisher exact)
RFS at 4 y 87% 62% 29% 61% < .001 
Number of cases 72 81 38 191  
Age, y      
    10 or older 56 (78%) 40 (49%) 29 (76%) 125 (65%) < .001 
    Younger than 10 16 (22%) 41 (51%) 9 (24%) 66 (35%)  
    Median 14.02 9.82 13.91 13.31  
    5th-95th percentiles 2.64-18.27 1.43-17.82 1.99-18.25 1.78-18.16  
Sex      
    Female 25 28 11 64  
    Male 47 53 27 127 .83 
WBC, K/μL      
    50 or more 30 50 19 99 .42 
    Less than 50 42 31 19 92  
WBC count      
    Median 37.25 92.7 51.55 62.3  
    5th-95th percentiles 2.3-246.4 3-314.8 2.3-478 2.3-314.8  
Race      
    Hispanic and Latino 17 16 13 46 .242 
    Others 54 64 25 143  
MLL      
    Negative 65 71 38 174 .057 
    Positive 10 17  
t(1;19)(TCF3-PBX1)      
    Negative 52 81 38 171 < .001 
    Positive 20 20  
CNS      
    No blasts 57 57 32 146 .457 
    Less than 5 blasts 14 25  
    5 or more blasts 10 20  
Relapse site      
    Isolated CNS* 15 23 .095 
    Marrow 13 17 35  

Only 191 of the 207 patients in the high-risk ALL cohort had flow MRD results at end induction; hence, this table reports on 191 total patients. Flow MRD results were available on only 17 of 21 MLL and 20 of 23 t(1;19)(TCF3-PBX1) patients.

CNS indicates central nervous system.

* No association was seen between patients with isolated CNS relapse and those with CNS blasts at diagnosis (χ2 test, P = .93).

To ensure that the gene expression classifier could improve outcome prediction in high-risk ALL patients lacking known recurring cytogenetic abnormalities, we built a second gene expression classifier for RFS using a subset of 163 of the original 207 COG P9906 high-risk ALL patients, excluding those cases with MLL (n = 21) or E2A-PBX1 translocations (n = 23), again using a Cox proportional hazards model–based supervised principal component analysis with extensive cross-validation (see supplemental Section 10). The resulting classifier for RFS contained 32 probe sets (29 unique genes; list provided in supplemental Table 8) and had a high degree of overlap (84%) with the genes in the initial classifier (supplemental Table 4). With a threshold of zero, the risk scores derived from this second classifier also significantly separated the 163 ALL cases into low- (4-year RFS, 76%; 95% CI, 64%-84%; n = 88) versus high- (4-year RFS, 52%; 95% CI, 40%-64%; n = 75) risk groups (P = .001; Figure 3A). Flow cytometric measures of end-induction MRD were also capable of distinguishing 2 risk groups within these 163 high-risk ALL cases (Figure 3B), and application of the gene expression classifier further divided both the flow MRD-negative (Figure 3C) and flow MRD-positive (Figure 3D) patients into distinct risk groups with significantly different outcomes. Combining this second classifier for RFS with end-induction flow MRD yielded 4 distinct risk groups with significantly different outcomes (P < .001; Figure 3E). As no significant survival differences were observed between the 2 groups with discordant predictors, these groups were combined into an intermediate-risk group (Figure 3F). 
As shown in Figure 3F, the Kaplan-Meier survival estimates for the 3 risk groups defined by this second combined classifier demonstrated highly significant differences in RFS (low [83% 4-year RFS; 95% CI, 70%-90%], intermediate [60% 4-year RFS; 95% CI, 44%-72%], and high [35% 4-year RFS; 95% CI, 19%-44%]; P < .001). These results demonstrate that gene expression classifiers significantly refine risk classification in high-risk ALL cases lacking known cytogenetic abnormalities.

Figure 3

Kaplan-Meier estimates of RFS based on the gene expression classifier for RFS modeled on high-risk ALL cases lacking known recurring cytogenetic abnormalities and end-induction (day 29) MRD. (A) The second gene expression classifier modeled only on those high-risk ALL cases (n = 163; supplemental Table 8) from the COG P9906 ALL cohort lacking recurring cytogenetic abnormalities resolves 2 distinct risk groups of patients with significantly different RFS. (B) Day 29 flow MRD status separated these 163 ALL cases into 2 groups with significantly different RFS. (C-D) After dividing patients by their end-induction flow MRD status, an independent effect of the gene expression classifier for RFS is observed among both the flow MRD-negative (< 0.01% blasts; C) and flow MRD-positive (> 0.01% blasts; D) patients. (E-F) Combining the risk scores determined from the gene expression classifier and flow MRD yields 4 distinct outcome groups (E); the 2 discordant groups show no significant difference in RFS and are therefore collapsed into an intermediate-risk group for RFS prediction (F). (F) The hazard ratios (HR) and corresponding P values are based on the Cox regression (high-risk vs intermediate-risk, HR = 2.26, P = .007; intermediate-risk vs low-risk, HR = 2.77, P = .008). The P value reported in the lower left corner corresponds to the test for differences among all groups.

A gene expression classifier predictive of end-induction flow MRD

The clinical application of a combined classifier using the gene expression classifier for RFS and day 29 flow MRD would require waiting until the end of induction therapy, precluding earlier intervention in patients who are destined to ultimately fail therapy. To develop a gene expression classifier predictive of end-induction MRD in diagnostic pretreatment specimens, 23 775 informative probe sets from the 191 patients (of 207) who had day 29 MRD results available were ranked on their association with MRD (supplemental Sections 6 and 9). Using a threshold of 1% for the false discovery rate, SAM identified 352 probe sets significantly associated with positive end-induction flow MRD (supplemental Table 6). A diagonal linear discriminant analysis model22,23  predicting MRD was built and optimized by performing 100 iterations of 10-fold cross-validation. The final model incorporated the top 23 probe sets (21 unique genes; supplemental Table 5), which separated the patients into 2 groups with significantly different outcomes (log rank test, P = .014). Figure 4A shows the receiver operating characteristic (ROC) curve for the nested LOOCV predictions of the classifier. The 23 probe sets in the gene expression classifier predictive of end-induction MRD (Figure 4B) include the genes BAALC, P2RY5, TNFSF4, E2F8, IRF4, CDC42EP3, and KLF4, and 2 probe sets each for EPB41L2 and PARP15. When the gene expression classifier predictive of MRD was substituted for the day 29 flow MRD data and then combined with the expression classifier for RFS, 3 distinct risk groups were resolved that had significantly different RFS at 4 years (low-, 82%; intermediate-, 63%; and high-risk, 45%; Figure 4C). Although still highly statistically significant (P < .001), the combined classifier using the gene expression classifier for RFS and the gene expression classifier predicting end-induction MRD (Figure 4C) was slightly less discriminatory than the one combining the gene expression classifier for RFS and flow MRD (Figure 2E).
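For readers unfamiliar with diagonal linear discriminant analysis, the following is a minimal sketch of the general technique, not the implementation used in this study. It assumes equal class priors, omits probe-set selection and cross-validation, and uses invented toy values; the function names are hypothetical. Each sample is assigned to the class whose mean is nearest once every feature is scaled by its pooled within-class variance.

```python
from statistics import mean

def fit_dlda(X, y):
    """Fit diagonal LDA: per-class feature means and pooled per-feature variances."""
    classes = sorted(set(y))
    n_features = len(X[0])
    means = {}
    pooled_ss = [0.0] * n_features
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        means[c] = [mean(col) for col in zip(*rows)]
        for row in rows:
            for j, value in enumerate(row):
                pooled_ss[j] += (value - means[c][j]) ** 2
    dof = len(X) - len(classes)  # pooled within-class degrees of freedom
    variances = [ss / dof for ss in pooled_ss]
    return means, variances

def predict_dlda(model, x):
    """Assign x to the class with the smallest variance-scaled squared distance."""
    means, variances = model

    def distance(c):
        return sum((xj - mj) ** 2 / vj for xj, mj, vj in zip(x, means[c], variances))

    return min(means, key=distance)

# Toy expression values for 2 hypothetical probe sets (not study data).
X = [[1.0, 5.0], [1.2, 4.8], [0.8, 5.3], [3.0, 2.0], [3.2, 2.2], [2.9, 1.7]]
y = ["neg", "neg", "neg", "pos", "pos", "pos"]
model = fit_dlda(X, y)
print(predict_dlda(model, [1.1, 5.0]), predict_dlda(model, [3.1, 2.0]))
```

Ignoring the covariances between probe sets keeps the number of estimated parameters small, which is the usual motivation for the diagonal variant when there are far more probe sets than patients.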

Figure 4

Gene expression classifier for prediction of end-induction (day 29) flow MRD in pretreatment samples combined with the gene expression classifier for RFS. (A) The ROC curve shows the high accuracy of the 23-probe-set MRD classifier (LOOCV error rate of 24.61%; sensitivity 71.64%, specificity 77.42%) in predicting MRD. The area under the ROC curve (0.80) is significantly greater than that of an uninformative ROC curve (0.5; P < .001). (B) Heatmap of the 23-probe-set predictor of MRD (false discovery rate < .001%, SAM). The columns represent patient samples with positive or negative end-induction flow MRD, whereas the rows are the specific predictor genes. Red: high expression relative to the mean; green: low expression relative to the mean. (C) Kaplan-Meier estimates of RFS for the risk groups determined by combining the gene expression classifiers for RFS and MRD, analogous to Figure 2E, with the gene expression predictor for MRD replacing day 29 flow MRD. The 3 risk groups have significantly different RFS (log rank test, P < .001).
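The caption reports percentages but not the underlying counts. As a worked check, the sketch below searches for whole-number confusion tables that reproduce the reported error rate, sensitivity, and specificity. It assumes that all 191 patients with day 29 flow MRD results were scored by the classifier; the function name is hypothetical, and the result is a back-calculation rather than a reported figure.

```python
def consistent_confusion_tables(n, sens_pct, spec_pct, err_pct, tol=0.005):
    """Enumerate (positives, TP, negatives, TN) matching percentages rounded to 2 places."""
    matches = []
    for positives in range(1, n):
        negatives = n - positives
        for tp in range(positives + 1):
            if abs(100 * tp / positives - sens_pct) >= tol:
                continue  # sensitivity does not round to the reported value
            for tn in range(negatives + 1):
                if abs(100 * tn / negatives - spec_pct) >= tol:
                    continue  # specificity does not round to the reported value
                errors = (positives - tp) + (negatives - tn)
                if abs(100 * errors / n - err_pct) < tol:
                    matches.append((positives, tp, negatives, tn))
    return matches

tables = consistent_confusion_tables(191, 71.64, 77.42, 24.61)
print(tables)
```

Under that assumption the search returns a single table: 67 MRD-positive patients, of whom 48 were predicted correctly, and 124 MRD-negative patients, of whom 96 were predicted correctly, giving 47 misclassifications among 191 patients.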

Validation of the classifiers in an independent dataset

We next determined whether the gene expression classifiers were predictive of outcome in a second, independent cohort of 84 children with high-risk ALL treated on a different clinical trial (COG/Children's Cancer Group [CCG] 1961).14,19  In contrast to the initial COG 9906 high-risk ALL cohort, a WBC count of more than 50 000/μL (LRT, P = .014) and male sex (LRT, P = .018) were associated with a worse RFS in this cohort (supplemental Section 2).14,19  Flow MRD was not evaluated in the CCG 1961 trial. The initial 38-gene expression classifier for RFS (supplemental Table 4) that we developed from COG P9906 yielded risk scores for these 84 patients that were significantly associated with RFS (Cox proportional hazards regression, P = .006), even after adjusting for sex and WBC count (multivariate Cox regression, P = .01). The gene expression classifier risk scores split the 84 children from CCG 1961 into high-risk (n = 28) and low-risk (n = 56) groups (Figure 5A). Unlike in our initial cohort, the proportion of children with WBC counts > 50 000/μL was significantly greater in the high-risk group (82%, 23 of 28) than in the low-risk group (55%, 31 of 56; Fisher exact test, P = .017) defined by the expression classifier. Similar to the COG 9906 cohort, all children with t(1;19)(TCF3-PBX1) were in the low-risk group, although this cytogenetic abnormality by itself did not predict RFS. We next tested the effect of the combined gene expression classifiers for RFS and MRD and were able to resolve 3 distinct risk groups with significantly different outcomes (Figure 5B), demonstrating that these classifiers were capable of resolving distinct risk groups in an independent cohort of children with high-risk ALL.
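The Fisher exact comparison of WBC counts between the 2 classifier-defined groups can be reproduced from the counts given above. The sketch below is a plain standard-library implementation of the conventional two-sided test (summing all tables that are no more probable than the observed one); the function name is hypothetical, and the software and exact variant used in the original analysis are not stated here.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, row1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell with margins fixed.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    k_min = max(0, row1 - (total - col1))
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in map(prob, range(k_min, k_max + 1)) if p <= p_obs * (1 + 1e-9))

# Rows: high-risk (23 of 28) and low-risk (31 of 56) groups.
# Columns: WBC > 50 000/uL, WBC <= 50 000/uL.
p_value = fisher_exact_two_sided(23, 5, 31, 25)
print(f"P = {p_value:.3f}")
```

The result is approximately .017, in line with the value reported in the text.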

Figure 5

Kaplan-Meier estimates of RFS using the combined gene expression classifiers for RFS and MRD in an independent cohort of 84 children with high-risk ALL. (A) The gene expression classifier for RFS separates children into low- and high-risk groups in an independent cohort of 84 children with high-risk ALL treated on COG Trial 1961.14,16  (B) Application of the combined gene expression classifiers for RFS and MRD shows significant separation of 3 risk groups: low (47 of 84, 56%), intermediate (22 of 84, 26%), and high (15 of 84, 18%), similar to our initial cohort (Figure 3C).

Gene expression classifiers retain independent prognostic significance in the presence of new genetic factors associated with a poor outcome in pediatric ALL

We and others have recently identified new genetic features in pediatric ALL that are associated with a poor outcome, including IKAROS/IKZF1 deletions,16 JAK mutations,17  and gene expression signatures reflective of activated tyrosine kinase signaling pathways (termed kinase signatures).16,18  Two of these studies16,18  first reported the discovery of ALL cases that lacked a classic BCR-ABL1 translocation, but that had gene expression profiles reflective of tyrosine kinase activation. Our more recent work17  has determined that the majority of these cases have activating mutations of the JAK family of tyrosine kinases. We thus wished to determine whether the gene expression classifier for RFS, or the combined classifier, retained independent prognostic significance in the presence of these genetic abnormalities. As detailed in “Statistical analyses,” our studies reporting IKAROS/IKZF1 deletions,16  activated kinase signatures,16  and JAK mutations17  used samples from the same COG 9906 high-risk ALL cohort; thus, we could readily perform this multivariate analysis.

As shown in Table 3, activated kinase signatures, JAK family mutations, and IKAROS/IKZF1 deletions were each significantly associated with the highest risk group as defined by the gene expression classifier for RFS in the COG 9906 high-risk ALL cases. Not only did the gene expression classifier for RFS assign all 38 cases with a kinase signature to the highest risk group, it also assigned another 60 cases to this risk group (Table 3). Similarly, whereas all cases with JAK mutations were assigned to the highest risk group by the gene expression classifier for RFS, an additional 74 cases lacking these mutations were also assigned to this high-risk group (Table 3). The gene expression classifier also refined risk classification in the presence of IKAROS/IKZF1 deletions (Table 3). In a multivariate Cox regression analysis, only the gene expression classifier for RFS (P = .005) and IKAROS/IKZF1 deletions (P = .003) retained prognostic significance (Table 4). An LRT showed that the gene expression classifier for RFS retained independent prognostic significance (P = .014) when adjusting for all other covariates. We also examined the association of the risk groups defined by the combined gene expression classifier for RFS and end-induction flow MRD (the combined classifier) with kinase signatures, JAK family mutations, and IKAROS/IKZF1 deletions (Table 5; Figure 6). Again, significant associations between each of these variables and the 3 risk groups (low, intermediate, and high) defined by the combined classifier were seen (Table 5). As shown in Figure 6, the application of the combined classifier refined risk classification and distinguished patient groups with significantly different RFS in the presence or absence of a kinase signature (Figure 6A-B), in the presence or absence of JAK mutations (Figure 6C-D), and in the presence or absence of IKAROS/IKZF1 deletions (Figure 6E-F).
In a multivariate Cox regression analysis (Table 6), only the combined classifier retained independent prognostic significance for outcome prediction. The LRT revealed that the combined classifier retained independent prognostic significance after adjusting for the effects of all other genetic abnormalities (P = .001).

Table 3

Association of kinase gene expression signatures, JAK mutations, and IKAROS/IKZF1 deletions with the low- versus high-risk groups defined by the gene expression classifier for RFS

Genetic feature | RFS classifier risk group: low | RFS classifier risk group: high | Total | P (Fisher exact)
Kinase signature
    Yes | 0 (0%) | 38 (39%) | 38 (18%) | < .001
    No | 109 (100%) | 60 (61%) | 169 (82%) |
    Total | 109 (100%) | 98 (100%) | 207 (100%) |
JAK1/JAK2 mutation
    Yes | 0 (0%) | 19 (20%) | 19 (10%) | < .001
    No | 105 (100%) | 74 (80%) | 179 (90%) |
    Total | 105 (100%) | 93 (100%) | 198 (100%) |
IKAROS/IKZF1 deletion
    Yes | 14 (13%) | 41 (44%) | 55 (28%) | < .001
    No | 91 (87%) | 52 (56%) | 143 (72%) |
    Total | 105 (100%) | 93 (100%) | 198 (100%) |

The gene expression classifier for RFS used in this analysis is the initial classifier developed with 42 probe sets (38 unique genes) provided in supplemental Table 4.

Table 4

Multivariate Cox regression analysis of the prognostic significance of the risk group determined by the gene expression classifier for RFS in the presence of genetic factors in ALL associated with a poor outcome

Covariate | Hazard ratio* (estimate) | 95% confidence interval | P
Gene expression classifier for RFS risk group
    High risk versus low risk | 2.380 | 1.306-4.338 | .005
IKAROS/IKZF1 deletions
    Positive versus negative | 2.237 | 1.316-3.803 | .003
JAK mutations
    Positive versus negative | 1.020 | .500-2.081 | .957
Kinase gene expression signature
    Positive versus negative | 1.094 | .590-2.030 | .774

The gene expression classifier for RFS used in this analysis is the initial classifier developed with 42 probe sets (38 unique genes) provided in supplemental Table 4.

*Hazard ratios and corresponding P values are based on Cox regression.
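The hazard ratios, confidence intervals, and P values in Table 4 can be checked against one another. The sketch below assumes that each interval is a Wald interval that is symmetric on the log-hazard scale, which is the usual output of Cox regression software but is not stated explicitly here; the function name is hypothetical. It recovers the standard error of the log hazard ratio from the interval width and then the two-sided Wald P value.

```python
from math import erfc, log, sqrt

def wald_p_from_hr_ci(hr, ci_low, ci_high, z_crit=1.959964):
    """Back out the log-scale standard error, z, and two-sided Wald P from an HR and 95% CI."""
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = log(hr) / se
    return se, z, erfc(abs(z) / sqrt(2))  # erfc(|z|/sqrt(2)) equals 2 * (1 - Phi(|z|))

# Rows copied from Table 4: (label, HR, lower limit, upper limit, reported P).
rows = [
    ("IKAROS/IKZF1 deletions", 2.237, 1.316, 3.803, 0.003),
    ("JAK mutations", 1.020, 0.500, 2.081, 0.957),
    ("Kinase signature", 1.094, 0.590, 2.030, 0.774),
]
results = {}
for label, hr, low, high, reported in rows:
    se, z, p = wald_p_from_hr_ci(hr, low, high)
    results[label] = p
    print(f"{label}: SE = {se:.3f}, z = {z:.2f}, Wald P = {p:.3f} (reported {reported})")
```

The recomputed values agree with the reported P values to within rounding of the published estimates, which supports reading the intervals as Wald intervals.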

Table 5

Association of kinase gene expression signatures, JAK mutations, and IKAROS/IKZF1 deletions with the three risk groups defined by the combined gene expression classifier for RFS and flow cytometric measures of MRD

Genetic feature | Combined risk group: low | Combined risk group: intermediate | Combined risk group: high | Total | P (Fisher exact)
Kinase signature
    Yes | 0 (0%) | 13 (16%) | 22 (58%) | 35 (18%) | < .001
    No | 72 (100%) | 68 (84%) | 16 (42%) | 156 (82%) |
    Total | 72 (100%) | 81 (100%) | 38 (100%) | 191 (100%) |
JAK1/JAK2 mutation
    Yes | 0 (0%) | 9 (12%) | 9 (24%) | 18 (10%) | < .001
    No | 69 (100%) | 67 (88%) | 28 (76%) | 164 (90%) |
    Total | 69 (100%) | 76 (100%) | 37 (100%) | 182 (100%) |
IKAROS/IKZF1 deletion
    Yes | 9 (13%) | 20 (26%) | 25 (68%) | 54 (30%) | < .001
    No | 60 (87%) | 56 (74%) | 12 (32%) | 128 (70%) |
    Total | 69 (100%) | 76 (100%) | 37 (100%) | 182 (100%) |

The gene expression classifier for RFS used in this analysis is the initial classifier developed with 42 probe sets (38 unique genes) provided in supplemental Table 4.

Figure 6

Kaplan-Meier estimates of RFS using the combined gene expression classifier for RFS and flow cytometric measures of MRD in the presence of kinase signatures, JAK mutations, and IKAROS/IKZF1 deletions. (A-B) Application of the original 42-probe-set (38-gene; supplemental Table 4) gene expression classifier for RFS combined with end-induction flow cytometric measures of MRD distinguishes 2 distinct risk groups in COG 9906 ALL patients with kinase signatures (A) and 3 risk groups in those patients lacking kinase signatures (B). (C-D) Application of the combined classifier also resolves 2 distinct risk groups with significantly different RFS in ALL patients with JAK mutations (C) and 3 risk groups in those patients lacking JAK mutations (D). (E-F) Application of the combined classifier distinguishes 3 risk groups with significantly different RFS in patients with (E) and without (F) IKAROS/IKZF1 deletions. The P value reported in the lower left corner corresponds to the log rank test for differences among all groups.

Table 6

Multivariate Cox regression analysis of the prognostic significance of the risk group determined by the combined gene expression classifier for RFS and flow cytometric measures of MRD in the presence of genetic factors in ALL associated with a poor outcome

Covariate | Hazard ratio* (estimate) | 95% confidence interval | P
Risk group determined by gene expression classifier for RFS and flow MRD
    Intermediate versus low risk | 3.366 | 1.569-7.222 | .002
    High versus low risk | 6.214 | 2.547-15.160 | < .001
IKAROS/IKZF1 deletions
    Positive versus negative | 1.684 | .923-3.072 | .089
JAK mutations
    Positive versus negative | .987 | .469-2.076 | .973
Kinase gene expression signature
    Positive versus negative | .988 | .506-1.929 | .972

The gene expression classifier for RFS used in this analysis is the initial classifier developed with 42 probe sets (38 unique genes) provided in supplemental Table 4.

*Hazard ratios and corresponding P values are based on Cox regression.

Discussion

Whereas gene expression–profiling studies in the acute leukemias have identified gene expression signatures associated with recurrent cytogenetic abnormalities8,25,26  and in vitro drug responsiveness,9-11,15  fewer studies have reported and validated gene expression classifiers predictive of survival.13,14  In this study, gene expression classifiers predictive of RFS and end-induction MRD were derived from the gene expression profiles obtained in the pretreatment samples of 207 children with B-precursor high-risk ALL. A 42-probe-set (containing 38 unique genes) expression classifier predictive of RFS was capable of resolving 2 distinct groups of patients with significantly different outcomes within the category of pediatric ALL patients traditionally defined as high risk. In multivariate analyses, only the gene expression–based classifier for RFS and flow cytometric measures of end-induction MRD provided independent prognostic information for outcome prediction. By combining the risk scores derived from the gene expression classifier for RFS with end-induction flow MRD, 3 distinct groups of patients with strikingly different treatment outcomes could be identified. Similar results were obtained when modeling only those high-risk ALL cases that lacked any known recurring cytogenetic abnormalities.

Perhaps most importantly, in terms of the future potential clinical utility of gene expression-based classifiers for risk classification, we further demonstrated that both the gene expression classifier for RFS and the combination of this classifier with end-induction flow MRD retained independent prognostic significance for outcome prediction in the presence of new genetic abnormalities that we and others have recently discovered and found to be associated with a poor outcome in pediatric ALL (IKAROS/IKZF1 deletions, JAK mutations, and kinase signatures). The combined classifier further refined outcome prediction in the presence of each of these mutations or signatures, distinguishing which cases with JAK mutations, kinase signatures, or IKAROS/IKZF1 deletions would have a good (low-risk), intermediate, or poor (high-risk) outcome (Table 5; Figure 6). Thus, whereas IKZF1 deletions and JAK mutations are exciting new targets for the development of novel therapeutic approaches in pediatric ALL, assessment of these genetic abnormalities alone may not be fully sufficient for risk classification or to predict overall outcome. As gene expression profiles reflect the full constellation and consequence of the multiple genetic abnormalities seen in each ALL patient and as measures of MRD are a functional biologic measure of residual or resistant leukemic cells, they may have an enhanced clinical utility for refinement of risk classification and outcome prediction.

The results reported in this study, as well as those of other recent studies,16-18  reveal the striking molecular and biologic heterogeneity among children who have traditionally been classified as having high-risk ALL. Unexpectedly, 72 of the 191 (38%) high-risk ALL patients in the COG 9906 ALL cohort with flow MRD results were found by the combined gene expression classifier for RFS and flow MRD to have a significantly better survival (87% RFS at 4 years) than the entire cohort (66% survival at 4 years). This group of patients, which included all 20 cases with t(1;19)(TCF3-PBX1) and an additional 52 cases whose underlying genetic abnormalities remain to be discovered, was characterized by high expression of the tumor suppressor genes and signaling proteins RGS2, NFKBIB, NR4A3, DDX21, and BTG3.27-30  Application of the combined classifier also identified 38 of these 191 (20%) patients in the COG 9906 cohort who had a dismal 4-year RFS of 29% (approaching 0% at 5 years). Highly expressed in this group of patients with the worst outcome were genes (BMPR1B, CTGF [CCN2], TTYH2, IGJ, PON2, CD73, CDC42EP3, TSPAN7, and SEMA6A) involved in adaptive cell signaling responses to transforming growth factor β, stem cell function, B-cell development and differentiation, and the regulation of tumor growth.27-45  These highest risk cases lacked expression of the genes (NR4A3, BTG3, RGS1, and RGS2) whose relatively high expression characterized the ALL cases with the best outcome. Not surprisingly, given that all cases with an activated kinase signature were assigned to the highest risk group by the gene expression classifier for RFS, 6 of the genes associated with our kinase signature (BMPR1B, ECM1, IGJ, PON2, SEMA6A, and TSPAN7) were contained within our gene expression classifier for RFS. The genes that characterize the risk groups defined by the combined classifier provide important clues to the multiple complex pathways and mechanisms of leukemic transformation in pediatric ALL.

The kinetics of early treatment response, best assessed by molecular or flow cytometric measures of MRD after the first 1 to 3 months of therapy, are a potent predictor of outcome in leukemia. Yet, MRD data are not available at initial diagnosis, and relapses occur in some pediatric ALL patients (such as those with t[1;19][TCF3-PBX1]) who have an excellent (negative) end-induction MRD response. Ideally, one would want to identify as early as possible those ALL patients who are most likely to fail therapy so that novel treatment interventions or alternative induction methods could be used. Using the combined gene expression classifier for RFS and end-induction flow MRD, we identified 38 patients in the initial cohort of 207 patients who were destined to ultimately fail intensified traditional therapy for ALL. We therefore built a 23-probe-set (21-gene) gene expression classifier predictive of day 29 flow MRD in diagnostic, pretreatment samples that could successfully replace end-induction flow MRD in our risk model. Among several interesting genes in the classifier predictive of end-induction MRD was BAALC, a novel marker of early progenitor cells that has been reported to confer a worse outcome and primary resistance in acute leukemia, including ALL and acute myeloid leukemia in adults.46,47  Given the relatively old age (mean = 13 years) of the children and adolescents in our ALL cohort and the presence of genes in our gene expression classifiers for RFS and MRD that have previously been associated with a poor outcome in adult ALL (such as CTGF43,44  and BAALC46,47 ), we hypothesize that the gene expression classifiers that we have developed for pediatric ALL may also be useful for risk classification and outcome prediction in adults with ALL. These studies are now in progress.

The results of our studies provide evidence that improved outcome prediction and risk classification can be achieved in ALL through the development of gene expression classifiers. The application of gene expression classifiers allows for the prospective identification of a significant subgroup of ALL patients with little chance for cure on contemporary chemotherapeutic regimens. Further analysis of these expression profiles, coupled with other comprehensive genomic studies, will hopefully lead to the continued identification of novel targets and more effective therapies for these children.

The online version of this article contains a data supplement.

The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

Acknowledgments

This work was supported by National Institutes of Health NCI U01 CA114762 Strategic Partnerships to Evaluate Cancer Gene Signatures Program (C.L.W.), NCI U10 CA98543 supporting the Children's Oncology Group and Statistics and Data Center (G.H.R.), and a subcontract to NCI U10 CA98543 in support of the National Cancer Institute TARGET Initiative. Additional funding was provided by a Leukemia & Lymphoma Society Specialized Center of Research Program Grant 7388-06 (C.L.W.). University of New Mexico Cancer Center Shared Resources: Keck-UNM Genomics Resource, Biostatistics, and Bioinformatics and Computational Biology, partially supported by NCI P30 CA118100 (C.L.W.), were also critical for this work.

Authorship

Contribution: H.K. performed statistical analyses, designed and developed classifiers, and prepared the manuscript; I.-M.C. and R.C.H. performed leukemia sample processing, gene expression arrays, and correlative data analysis; C.S.W. and W.W. performed data analysis and review and prepared the manuscript; E.J.B. performed statistical analyses and designed and developed classifiers; S.R.A. conducted statistical analyses, data analysis, and review; M.D. conducted COG clinical and statistical analyses, data review, and analysis; C.G.M. completed IKAROS, collaborated in JAK studies, and reviewed the data; X.W. performed statistical analyses, model building, data interpretation, and review; M.M. conducted statistical analyses and database and data warehouse development; K.A. technically performed RNA isolations/microarrays; M.J.B. performed research, data analysis, and review, and prepared the manuscript; W.P.B. designed COG studies and conducted data analysis and review; D.B. completed arrays on independent dataset and performed data analysis and review; W.L.C. designed COG studies and performed data analysis, review, and arrays from independent cohort; B.M.C. designed COG studies, performed data analysis and review, and prepared the manuscript; G.H.R. designed COG and CCG studies and performed data analysis and review; M.A.S. coordinated National Cancer Institute TARGET studies for JAK/IKAROS analyses and participated in data review; J.R.D. completed IKAROS, collaborated in JAK studies, and performed data review; S.P.H. designed COG studies, performed data analysis and review, and prepared the manuscript; and C.L.W. oversaw all aspects of this project, performed data analysis and review and statistical analysis review, and prepared the manuscript.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: Cheryl L. Willman, University of New Mexico Cancer Research Facility, 2325 Camino de Salud NE, Room G03, MSC08 4630 1 University of New Mexico, Albuquerque, NM 87131; e-mail: cwillman@salud.unm.edu.

References

1
Pui
 
CH
Evans
 
WE
Drug therapy: treatment of acute lymphoblastic leukemia.
N Engl J Med
2006
, vol. 
354
 (pg. 
166
-
178
)
2
Pui
 
CH
Robison
 
LL
Look
 
AT
Acute lymphoblastic leukemia.
Lancet
2008
, vol. 
371
 (pg. 
1030
-
1043
)
3
Pui
 
CH
Pei
 
DQ
Sandlund
 
JT
, et al. 
Risk of adverse events after completion of therapy for childhood acute lymphoblastic leukemia.
J Clin Oncol
2005
, vol. 
23
 (pg. 
7936
-
7941
)
4
Schultz
 
KR
Pullen
 
DJ
Sather
 
HN
, et al. 
Risk- and response-based classification of childhood B-precursor acute lymphoblastic leukemia: a combined analysis of prognostic markers from the Pediatric Oncology Group (POG) and Children's Cancer Group (CCG).
Blood
2007
, vol. 
109
 (pg. 
926
-
935
)
5
Smith
 
M
Arthur
 
D
Camitta
 
B
, et al. 
Uniform approach to risk classification and treatment assignment for children with acute lymphoblastic leukemia.
J Clin Oncol
1996
, vol. 
14
 (pg. 
18
-
24
)
6
Borowitz
 
MJ
Devidas
 
M
Hunger
 
SP
, et al. 
Clinical significance of minimal residual disease in childhood acute lymphoblastic leukemia and its relationship to other prognostic factors: a Children's Oncology Group study.
Blood
2008
, vol. 
111
 (pg. 
5477
-
5485
)
7
Pui
 
CH
Jeha
 
S
New therapeutic strategies for the treatment of acute lymphoblastic leukemia.
Nat Rev Drug Discov
2007
, vol. 
6
 (pg. 
149
-
165
)
8
Yeoh
 
EJ
Ross
 
ME
Shurtleff
 
SA
, et al. 
Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling.
Cancer Cell
2002
, vol. 
1
 (pg. 
133
-
143
)
9
Cheok
 
MH
Yang
 
WL
Pui
 
CH
, et al. 
Treatment-specific changes in gene expression discriminate in vivo drug response in human leukemia cells.
Nat Genet
2003
, vol. 
34
 (pg. 
85
-
90
)
10
Holleman
 
A
Cheok
 
MH
den Boer
 
ML
, et al. 
Gene-expression patterns in drug-resistant acute lymphoblastic leukemia cells and response to treatment.
N Engl J Med
2004
, vol. 
351
 (pg. 
533
-
542
)
11
Lugthart
 
S
Cheok
 
MH
den Boer
 
ML
, et al. 
Identification of genes associated with chemotherapy crossresistance and treatment response in childhood acute lymphoblastic leukemia.
Cancer Cells
2005
, vol. 
7
 (pg. 
375
-
386
)
12
Mullighan
 
CG
Goorha
 
S
Radtke
 
I
, et al. 
Genome-wide analysis of genetic alterations in acute lymphoblastic leukemia.
Nature
2007
, vol. 
446
 (pg. 
758
-
764
)
13
Flotho
 
C
Coustan-Smith
 
E
Pei
 
DQ
, et al. 
A set of genes that regulate cell proliferation predicts treatment outcome in childhood acute lymphoblastic leukemia.
Blood
2007
, vol. 
110
 (pg. 
1271
-
1277
)
14
Bhojwani
 
D
Kang
 
H
Menezes
 
RX
, et al. 
Gene expression signatures predictive of early response and outcome in high-risk childhood acute lymphoblastic leukemia: a Children's Oncology Group Study on behalf of the Dutch Childhood Oncology Group and the German Cooperative Study Group for Childhood Acute Lymphoblastic Leukemia.
J Clin Oncol
2008
, vol. 
26
 (pg. 
4376
-
4384
)
15
Sorich
 
MJ
Pottier
 
N
Pei
 
D
, et al. 
In vivo response to methotrexate forecasts outcome of acute lymphoblastic leukemia and has a distinct gene expression profile.
PLoS Med
2008
, vol. 
5
 (pg. 
646
-
656
)
16
Mullighan
 
CG
Su
 
X
Zhang
 
J
, et al. 
Deletion of IKZF1 and prognosis in acute lymphoblastic leukemia.
N Engl J Med
2009
, vol. 
360
 (pg. 
470
-
480
)
17
Mullighan
 
CG
Zhang
 
J
Harvey
 
RC
, et al. 
JAK mutations in high-risk childhood acute lymphoblastic leukemia.
Proc Natl Acad Sci U S A
2009
, vol. 
106
 (pg. 
9414
-
9418
)
18
Den Boer
 
ML
van Slegtenhorst
 
M
De Menezes
 
RX
, et al. 
A subtype of childhood acute lymphoblastic leukemia with poor treatment outcome: a genome-wide classification study.
Lancet Oncol
2009
, vol. 
10
 (pg. 
125
-
134
)
19
Nachman
 
JB
Sather
 
HN
Sensel
 
MG
, et al. 
Augmented post-induction therapy for children with high-risk acute lymphoblastic leukemia and a slow response to initial therapy.
N Engl J Med
1998
, vol. 
338
 (pg. 
1663
-
1671
)
20
Shuster
 
JJ
Camitta
 
BM
Pullen
 
J
, et al. 
Identification of newly diagnosed children with acute lymphocytic leukemia at high risk for relapse.
Can Res Ther Contr
1999
, vol. 
9
 (pg. 
101
-
107
)
21
Bair
 
E
Hastie
 
T
Paul
 
D
Tibshirani
 
R
Prediction by supervised principal components.
J Am Stat Assoc
2006
, vol. 
101
 (pg. 
119
-
137
)
22
Asgharzadeh
 
S
Pique-Regi
 
R
Sposto
 
R
, et al. 
Prognostic significance of gene expression profiles of metastatic neuroblastomas lacking MYCN gene amplification.
J Natl Cancer Inst
2006
, vol. 
98
 (pg. 
1193
-
1203
)
23
Simon
 
R
Development and evaluation of therapeutically relevant predictive classifiers using gene expression profiling.
J Natl Cancer Inst
2006
, vol. 
98
 (pg. 
1169
-
1171
)
24
Tusher
 
VG
Tibshirani
 
R
Chu
 
G
Significance analysis of microarrays applied to the ionizing radiation response.
Proc Natl Acad Sci U S A
2001
, vol. 
98
 (pg. 
5116
-
5121
)
25. Ross ME, Zhou X, Song G, et al. Classification of pediatric acute lymphoblastic leukemia by gene expression profiling. Blood 2003, vol. 102 (pg. 2951-2959)
26. Martin SB, Mosquera-Caro MP, Potter JW, et al. Gene expression overlap affects karyotype prediction in pediatric acute lymphoblastic leukemia. Leukemia 2007, vol. 21 (pg. 1341-1344)
27. Mullican SE, Zhang S, Konopleva M, et al. Abrogation of nuclear receptors Nr4a3 and Nr4a1 leads to development of acute myeloid leukemia. Nat Med 2007, vol. 13 (pg. 730-735)
28. Schwable J, Choudhary C, Thiede C, et al. RGS2 is an important target gene of Flt3-ITD mutations in AML and functions in myeloid differentiation and leukemic transformation. Blood 2005, vol. 105 (pg. 2107-2114)
29. Gottardo NG, Hoffmann K, Beesley AH, et al. Identification of novel molecular prognostic markers for pediatric T-cell acute lymphoblastic leukemia. Br J Haematol 2007, vol. 137 (pg. 319-328)
30. Agenes F, Bosco N, Mascarell L, Fritah S, Ceredig R. Differential expression of regulator of G-protein signalling transcripts and in vivo migration of CD4+ naive and regulatory T cells. Immunology 2005, vol. 115 (pg. 179-188)
31. Horke S, Witte I, Wilgenbus P, Kruger M, Strand D, Forstermann U. Paraoxonase-2 reduces oxidative stress in vascular cells and decreases endoplasmic reticulum stress-induced caspase activation. Circulation 2007, vol. 115 (pg. 2055-2064)
32. Gomis RR, Alarcon C, He W, et al. A FoxO-Smad synexpression group in human keratinocytes. Proc Natl Acad Sci U S A 2006, vol. 103 (pg. 12747-12752)
33. Chen P-S, Wang M-Y, Wu S-N, et al. CTGF enhances the motility of breast cancer cells via an integrin-αvβ3-ERK1/2-dependent S100A4-up-regulated pathway. J Cell Sci 2007, vol. 120 (pg. 2053-2065)
34. Wang L, Zhou X, Zhou T, et al. Ecto-5′-nucleotidase promotes invasion, migration and adhesion of human breast cancer cells. J Cancer Res Clin Oncol 2008, vol. 134 (pg. 365-372)
35. Kodach LL, Bleuming SA, Musler AR, et al. The bone morphogenetic protein pathway is active in human colon adenomas and inactivated in colorectal cancer. Cancer 2008, vol. 112 (pg. 300-306)
36. Rae FK, Hooper JD, Eyre HJ, Sutherland GR, Nicol DL, Clements JA. TTYH2, a human homologue of the Drosophila melanogaster gene tweety, is located on 17q24 and up-regulated in renal cell carcinoma. Genomics 2001, vol. 77 (pg. 200-207)
37. Toiyama Y, Mizoguchi A, Kimura K, et al. TTYH2, a human homologue of the Drosophila melanogaster gene tweety, is up-regulated in colon carcinoma and involved in cell proliferation and cell aggregation. World J Gastroenterol 2007, vol. 13 (pg. 2717-2721)
38. Dunne J, Cullmann C, Ritter M, et al. siRNA-mediated AML1/MTG8 depletion affects differentiation and proliferation-associated gene expression in t(8;21)-positive cell lines and primary AML blasts. Oncogene 2006, vol. 25 (pg. 6067-6078)
39. Assou S, Le Carrour T, Tondeur S, et al. A meta-analysis of human embryonic stem cells transcriptome integrated into a web-based expression atlas. Stem Cells 2007, vol. 25 (pg. 961-973)
40. Mageed AS, Pietryga DW, DeHeer DH, West RA. Isolation of large numbers of mesenchymal stem cells from the washings of bone marrow collection bags: characterization of fresh mesenchymal stem cells. Transplantation 2007, vol. 83 (pg. 1019-1026)
41. Deaglio S, Dwyer KM, Gao W, et al. Adenosine generation catalyzed by CD39 and CD73 expressed on regulatory T cells mediates immune suppression. J Exp Med 2007, vol. 204 (pg. 1257-1265)
42. Mikhailov A, Sokolovskaya A, Yegutkin GG, et al. CD73 participates in cellular multiresistance program and protects against TRAIL-induced apoptosis. J Immunol 2008, vol. 181 (pg. 464-475)
43. Sala-Torra O, Gundacker HM, Stirewalt DL, et al. Connective tissue growth factor (CTGF) expression and outcome in adult patients with acute lymphoblastic leukemia. Blood 2007, vol. 109 (pg. 3080-3083)
44. Boag JM, Beesley AH, Firth MJ, et al. High expression of connective tissue growth factor in pre-B acute lymphoblastic leukemia. Br J Haematol 2007, vol. 138 (pg. 740-748)
45. Hoffmann K, Firth MJ, Beesley AH, et al. Prediction of relapse in pediatric pre-B acute lymphoblastic leukemia using a three-gene risk index. Br J Haematol 2008, vol. 140 (pg. 656-664)
46. Baldus CD, Martus P, Burmeister T, et al. Low ERG and BAALC expression identifies a new subgroup of adult acute T-lymphoblastic leukemia with a highly favorable outcome. J Clin Oncol 2007, vol. 25 (pg. 3739-3745)
47. Langer C, Radmacher MD, Ruppert AS, et al. High BAALC expression associates with other molecular prognostic markers, poor outcome, and a distinct gene-expression signature in cytogenetically normal patients younger than 60 years with acute myeloid leukemia: a Cancer and Leukemia Group B (CALGB) study. Blood 2008, vol. 111 (pg. 5371-5379)

Supplemental data