We investigated treatment with gemtuzumab ozogamicin (GO) in 51 patients aged 65 years or older with newly diagnosed acute myeloid leukemia (AML), refractory anemia (RA) with excess of blasts in transformation, or RA with excess blasts. GO was given in doses of 9 mg/m^{2} of body-surface area on days 1 and 8 or, therapeutically equivalently, on days 1 and 15, with or without interleukin 11 (IL-11; 15 μg/kg per day on days 3 to 28), with assignment to IL-11 treatment made randomly. Complete remission (CR) rates were 2 of 26 (8%) for GO without IL-11 and 9 of 25 (36%) for GO with IL-11. Regression analyses indicated that IL-11 was independently predictive of CR but not survival. We compared GO with or without IL-11 with idarubicin plus cytosine arabinoside (IA), as previously administered, in similar patients. The CR rate with IA was 15 of 31 (48%), and survival was superior with IA compared with GO with or without IL-11 (*P* = .03). Besides accounting for possible covariate effects on outcome, we also accounted for possible trial effects (TEs) arising because IA and GO with or without IL-11 were not arms of a randomized trial. Bayesian posterior probabilities that GO with or without IL-11 produced longer survival than IA, after accounting for covariates and TEs, were less than 0.01 in patients with abnormal cytogenetic findings (AC) and less than 0.15 in patients with normal cytogenetic findings (NC). Regarding CR, the analogous probabilities were less than 0.02 for GO without IL-11 (all cytogenetic groups), and for GO with IL-11, less than 0.25 for AC groups and about 0.50 for NC groups. TEs 2 to 5 times the magnitude of those previously observed would be needed to conclude that survival with GO with or without IL-11 is likely longer than with IA. Thus, there is little evidence to suggest that GO with or without IL-11 should be used instead of IA in older patients with newly diagnosed AML or myelodysplastic syndrome.

## Introduction

On the basis of evidence that it was less toxic than, but as effective as, standard cytosine arabinoside–containing regimens in patients 60 years of age or older with acute myeloid leukemia (AML) in first relapse,1 gemtuzumab ozogamicin (GO; Mylotarg; Wyeth-Ayerst Pharmaceuticals, Madison, NJ), a combination of an anti-CD33 antibody and the cytotoxic agent calicheamicin, has been approved for use in such patients. Because there is doubt that the benefit-risk ratios associated with such standard regimens are favorable even in newly diagnosed elderly patients,2 a trial of GO in older patients with untreated AML appeared reasonable.

In this study, we compared the complete remission (CR) and survival rates with GO, adjusted for the effects of covariates such as performance status and cytogenetic findings, with the rates we previously observed in similarly aged (≥ 65 years) newly diagnosed patients given idarubicin plus cytosine arabinoside (IA). Because IA and GO were not administered in arms of a randomized trial, any difference between them may reflect not only treatment or a more favorable configuration of known prognostic covariates in 1 of the 2 trials but also differences between the trials with respect to unknown (“latent”) variables. We here refer to the difference between average patient outcomes due to such latent variables in 2 trials as a “trial effect” (TE). By definition, the TE accounts for the difference in average covariate-adjusted outcome observed in 2 trials of the same treatment. Comparing the outcome rates for 2 treatments studied in separate trials, such as IA and GO, by adjusting for known patient covariates by means of statistical regression, yields an estimated effect that is the sum of the actual treatment effect and the TE. The treatment effects and TEs are confounded, and neither can be estimated individually. To deal with this problem, we estimated the magnitude of the TEs observed in pairs of previous trials at MD Anderson of the same treatment in patients with newly diagnosed AML or myelodysplastic syndrome (MDS). We used this historical information to perform a sensitivity analysis that provides a plausible estimate of the actual covariate-adjusted GO-versus-IA (GO-IA) treatment effect. Although not a substitute for results from a randomized trial, the findings of our analyses were sufficiently striking to warrant reporting.

## Patients and methods

### GO trial

Patients had newly diagnosed AML, refractory anemia (RA) with excess of blasts in transformation (RAEB-t), or RA with excess blasts (RAEB) (French-American-British criteria); were at least 65 years of age; and had a karyotype other than inv(16), t(8;21), or t(15;17). Fifty-one patients met these criteria and received GO. Forty-nine of the 51 were CD33^{+}, with a median of 94% (range, 24%-100%) of blasts expressing CD33. The 50th patient had CD33 expression on 7% of blasts and the 51st patient did not have a CD33 assessment; however, since January 1, 2000, 94% of the 126 patients older than 64 years with AML, RAEB-t, or RAEB without inv(16), t(8;21), or t(15;17) have been CD33^{+}.

The initial 22 patients received 9 mg/m^{2} of body-surface area GO on days 1 and 15, provided the marrow specimen obtained on day 14 was more than 10% cellular. Otherwise, the marrow was examined weekly until the response was known. If CR as defined by conventional criteria was not apparent, patients received other therapies. When only 5 of the 22 patients had CR, the remaining 29 patients received 9 mg/m^{2} GO on days 1 and 8. All patients were randomly assigned to receive or not receive 15 μg/kg of body weight per day interleukin-11 (IL-11) on days 3 to 28. Supportive care measures, including use of a laminar-airflow room (protected environment [PE]), were described previously.3 In remission, patients received one course of GO, cyclosporine A, fludarabine, and cytosine arabinoside (ara-C) alternating every 5 weeks for 10 months with one course of IA. Doses were 6 mg/m^{2} GO on day 1; 6 mg/kg cyclosporine A followed by 16 mg/kg daily on days 1 and 2 (continuous infusion); 15 mg/m^{2} fludarabine twice daily on days 2 to 6; and 0.5 g/m^{2} ara-C twice daily on days 2 to 6. IA treatment consisted of 8 mg/m^{2} idarubicin daily on days 1 and 2 and 1.5 g/m^{2} ara-C daily on days 1 and 2 (continuous infusion).

### IA trial

We compared the CR and survival rates observed with GO with or without IL-11 with those observed in 31 patients previously (1991-1992) given IA (12 mg/m^{2} idarubicin daily on days 1-3 and 1.5 g/m^{2} ara-C daily on days 1-3 [continuous infusion]) for remission induction and, once in CR, ara-C (100 mg/m^{2} daily for 5 days [continuous infusion]) alternating every 5 weeks for 10 months with IA as described above for the GO program. All 31 patients met the eligibility criteria for the trial of GO with or without IL-11; CD33 was positive in all 24 of the 31 patients given IA in whom it was assessed (median, 96%; range, 36%-100%). We chose IA rather than other regimens frequently employed at our institution in the 1990s, namely fludarabine with ara-C with or without idarubicin (FA) or topotecan with ara-C (TA), as the standard for comparison with GO because of data suggesting that IA is superior to either FA or TA.3 Although 78 historical patients given IA with or without granulocyte colony-stimulating factor (G-CSF), lisofylline, or all-*trans* retinoic acid (ATRA) met the eligibility criteria for the GO trial, we limited the GO-IA comparison to the 31 patients among the 78 who received IA without these adjuvants. We did so because IA is more commonly used and produced results similar to those with IA plus G-CSF and ATRA.3 Moreover, focusing on IA alone enhanced the clarity of presentation. Approval for the trial of GO with or without IL-11 and for the IA trial was obtained from the Institutional Review Board. Informed consent was provided according to the Declaration of Helsinki.

### Statistical methods

#### Design of trial of GO with or without IL-11.

We used the method of Thall and Sung,4 which extends the method of Thall et al5 to randomized trials. Within each of the 2 prognostic groups, patients with normal cytogenetic findings (NC) and patients with abnormal cytogenetic findings (AC), the trial was designed to select the better of the 2 treatments, GO with IL-11 and GO without IL-11, in terms of the rates of both early mortality (EM; death by day 49) and CR, while also monitoring the rates of these 2 events compared with their historical rates with IA, FA, and TA. These rates were 30% for EM and 36% for CR in the AC group and 25% for EM and 53% for CR in the NC group. For each treatment arm within each prognostic group, if either the EM rate was unacceptably high compared with a targeted 0.15 drop from the historical rate, or if the CR rate was unacceptably low, that arm was terminated early. Maximum sample sizes of 90 patients with AC and 80 patients with NC were used to achieve 95% posterior credibility intervals of a width of, at most, 0.25 for each outcome in each group. Assuming the historical accrual rate of 9 patients/week, enrollment of 90 patients would ensure that a conventional group-sequential, multiple-testing procedure with a size of .05 has a power of 0.80 to detect a doubling of median survival time from 13.6 to 27.2 weeks, in the AC group. A similar computation applied for the NC group.

#### Analysis.

Unadjusted survival probabilities were estimated by using the method of Kaplan and Meier.6 Unadjusted between-group comparisons of survival were done with the log-rank test.7 Logistic regression was used to assess the ability of patient characteristics or treatments to predict the probability of CR. The Cox proportional hazards regression model8 was used to assess the ability of patient characteristics or treatments to predict survival, with goodness of fit assessed by the Grambsch-Therneau test, Schoenfeld residual plots, martingale residual plots, and likelihood ratio statistics.9 All scatterplots were smoothed by using the lowess method of Cleveland,10 with predictive variables transformed as appropriate on the basis of these plots. Associations between pairs of binary variables were evaluated with the Fisher exact test. Computations were carried out in Splus11 by using standard Splus functions and the Splus survival analysis package of Therneau.12

The primary limitation of any inferences regarding a GO-IA treatment effect is that GO and IA were not arms of a randomized trial. Previously, we used the methods described in the preceding paragraph to account for prognosis and thus permit conclusions to be drawn about the efficacy of various treatments, with these treatments given in single-arm phase II trials.3,13-17 The underlying assumption was that, despite the sequential nature of these trials, TEs were inconsequential. We have recently begun to doubt this assumption.3 In particular, because GO and IA were studied in separate trials, the GO-IA treatment effect is confounded with the GO-IA TE and neither can be estimated individually from the available data. However, since the overall, confounded GO-IA TE plus treatment effect can be estimated, for a given assumed GO-IA TE, the treatment effect can be found by subtraction. Doing this for each of several reasonable TEs in turn yields corresponding reasonable treatment effects.
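In symbols, the subtraction described above can be written as follows. The notation is introduced here only for illustration: $\beta_{\text{conf}}$ denotes the estimable, confounded effect, $\beta_{\text{TE}}$ the assumed GO-IA trial effect, and $\beta_{\text{GO-IA}}$ the treatment effect of interest. The numerical line uses the posterior means reported for survival (confounded effect 0.823, assumed TE 0.396).

```latex
\beta_{\text{conf}} = \beta_{\text{GO-IA}} + \beta_{\text{TE}}
\quad\Longrightarrow\quad
\beta_{\text{GO-IA}} = \beta_{\text{conf}} - \beta_{\text{TE}}

% Example with reported posterior means:
\beta_{\text{GO-IA}} = 0.823 - 0.396 = 0.427
```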

In both the survival and the logistic regression models, we accounted for the unknown GO-IA (trials 1 and 2) TE by assuming in turn that it was the same as (1) that estimated from 2 separate trials (trials 3 and 4) of fludarabine, IA, and G-CSF (FAIG; enrolling 36 and 24 patients at least 65 years old, respectively13) and (2) that estimated from 2 separate trials (trials 5 and 6) of FAIG plus ATRA (17 and 44 patients at least 65 years old, respectively13). As detailed in the , we estimated the FAIG and FAIG plus ATRA TEs and, with these, the confounded GO-IA trial-treatment effect by using data from trials 1 to 6, along with the covariate parameters corresponding to performance status, PE, cytogenetic findings and, within the GO trial, the effect of IL-11. The posterior mean for the FAIG TE was ± 0.396 (SD, 0.318) and the posterior mean for the FAIG plus ATRA TE was ± 0.328 (SD, 0.291). The sign of each TE's posterior mean may be either plus or minus. For example, the mean trial 3 versus trial 4 effect here was 0.396. Equivalently, the trial 4 versus trial 3 effect was −0.396. Therefore, both cases (± 0.396) must be considered. Although it is reasonable to assume that the GO-IA TE was similar to the historical FAIG and FAIG plus ATRA TEs, it is also possible that this was not the case. This motivated a sensitivity analysis in which we varied the unknown GO-IA TE over a wide range of possible values.

All statistical regression analyses to assess the sensitivity of possible GO-IA treatment effects to between-trial effects were based on Bayesian models.18 Consequently, each effect was a random quantity characterized by a probability distribution, rather than a single value. Computations of posterior distributions were carried out in BUGS 0.5.19 Dependence of patient survival on treatment and prognostic covariates was assessed by using a Weibull survival-time regression model. Uninformative priors having large variances were assumed for all parameters. Variable selection was done in a step-down fashion by computing the posterior probability of a positive effect, Prob (β > 0), for each covariate, where β is the covariate effect, and dropping any covariate for which this probability was between 0.10 and 0.90. Variables were dropped one at a time, with the variable having Prob (β > 0) closest to 0.50 dropped at each step, until all values of Prob (β > 0) were either more than 0.90 or less than 0.10. Logistic regression was used to assess the ability of patient characteristics or treatment to predict the probability of CR, assuming uninformative priors, and variable selection was carried out in the same manner as for the survival analysis.
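The step-down rule described above can be sketched in a few lines of Python. This is only an illustration of the selection logic, not the BUGS code used for the analysis: `prob_positive` is a hypothetical callable standing in for a model refit that returns Prob (β > 0) for each retained covariate, and the covariate names and probabilities in `toy_prob_positive` are invented placeholders rather than study results.

```python
from typing import Callable, Dict, List


def step_down_select(
    covariates: List[str],
    prob_positive: Callable[[List[str]], Dict[str, float]],
    lower: float = 0.10,
    upper: float = 0.90,
) -> List[str]:
    """Drop covariates one at a time until every Prob(beta > 0) is decisive."""
    kept = list(covariates)
    while kept:
        # Refit on the retained covariates (hypothetical callable).
        probs = prob_positive(kept)
        undecided = [c for c in kept if lower <= probs[c] <= upper]
        if not undecided:
            break
        # Remove the covariate whose posterior probability is closest to 0.50.
        weakest = min(undecided, key=lambda c: abs(probs[c] - 0.50))
        kept.remove(weakest)
    return kept


def toy_prob_positive(kept: List[str]) -> Dict[str, float]:
    """Invented stand-in for a model refit; these values are not study data."""
    table = {"x_harmful": 0.99, "x_protective": 0.03, "x_weak": 0.60, "x_null": 0.48}
    return {c: table[c] for c in kept}


selected = step_down_select(
    ["x_harmful", "x_protective", "x_weak", "x_null"], toy_prob_positive
)
print(selected)
```

In a real analysis the posterior probabilities change after each refit, which is why the rule removes only one covariate per step.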

## Results

### Trial of GO with or without IL-11

The median age of the 51 patients was 71 years (range, 65-89). Seven (14%) were at least partly bedridden (Zubrod performance status 3 or 4). Thirty-seven (73%) had AML, 6 (12%) had RAEB-t, and 8 (16%) had RAEB. An abnormality in blood count documented for at least 1 month before presentation at MD Anderson (antecedent hematologic disorder [AHD]) was present in 35 (69%). Twenty patients (39%) had NC and 31 had AC. Fifteen (29%) had monosomies of chromosomes 5 or 7 (or both) or deletions of the long arms of these chromosomes (or both); these are collectively referred to here as “−5/−7.” The remaining 31% of the patients had other abnormalities.

The CR rate was 11/51 (22%; exact 95% confidence interval [CI], 11%-35%) and was similar in patients given GO on days 1 and 15 and patients given GO on days 1 and 8 (5 of 22 versus 6 of 29; *P* = .86; 95% CI for the true difference in rates, −0.20 to 0.24), recalling that there was no randomization between these groups. Twenty-nine of the 51 patients are dead, with 4, 3, 4, and 8 of the deaths occurring in weeks 1, 2, 3, and 4, respectively. Currently, the longest interval between start of treatment and death is 6 months. The median follow-up time in the 22 patients remaining alive is 16 weeks (maximum, 1 year), and the median survival time for all 51 patients is 12 weeks (95% CI, 4 weeks to not applicable). There were no significant differences in hazard of mortality between patients treated on days 1 and 8 and patients treated on days 1 and 15 (*P* = .12). With either schedule, the causes of death were similar to those observed in previous trials in this population. The one noteworthy toxic effect was a clinical picture compatible with hepatic veno-occlusive disease; this occurred in 8 patients and was reported in detail previously.20 Disease recurred, at 4 and 10 weeks, in 2 of the 11 patients who had CR; one received GO on days 1 and 8 and the other on days 1 and 15. The median follow-up time in the 9 patients remaining in CR is only 18 weeks (maximum, 49 weeks).
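The exact interval quoted for the overall CR rate (11 of 51) can be approximated with a short calculation. The sketch below assumes that "exact" refers to the Clopper-Pearson interval, which the text does not state explicitly; the function names are illustrative, and the bounds are found by simple bisection on the binomial distribution function.

```python
from math import comb
from typing import Callable, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_decreasing(f: Callable[[float], float], target: float) -> float:
    """Solve f(p) = target on [0, 1] for a function that decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Clopper-Pearson interval for x responses among n patients."""
    # Lower bound: P(X >= x | p) = alpha/2, i.e. P(X <= x-1 | p) = 1 - alpha/2.
    lower = 0.0 if x == 0 else bisect_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2
    )
    # Upper bound: P(X <= x | p) = alpha/2.
    upper = 1.0 if x == n else bisect_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2
    )
    return lower, upper


cr_lower, cr_upper = exact_ci(11, 51)
print(f"CR rate {11 / 51:.0%}, exact 95% CI {cr_lower:.0%} to {cr_upper:.0%}")
```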

Given the similarity in results between the day 1 and 8 and day 1 and 15 schedules, we combined the data obtained using these schedules for the randomized comparison of GO with IL-11 and GO without IL-11. The CR rate was considerably higher in patients given GO with IL-11 than in patients given GO without IL-11 (Table 1; 9 of 25 versus 2 of 26; *P* = .02; 95% CI for the true difference in rates, 0.07 to 0.50). The higher CR rate with GO with IL-11 was due to results in the NC group and the group with “other” chromosome abnormalities (Table 1). The lower CR rate in the group given GO without IL-11 did not result from an excess of near-CRs with incomplete platelet recovery in this group, since no such events occurred in either arm. The 23 and 27 days from the start of treatment required to reach a platelet count above 100 × 10^{9}/L in the 2 patients achieving CR in the group treated with GO without IL-11 were well within the range of the analogous times in the group given GO with IL-11 (median, 31 days; range, 14-54 days). The higher CR rate with GO with IL-11 has not yet translated into a significantly longer survival either in the entire 51-patient group (Figure 1A) or in the patients with NC (Figure 1B). Any difference between the group given IL-11 and that not given IL-11 became apparent only approximately 4 to 6 weeks after the start of therapy; ie, it does not appear solely to reflect a difference in early death rates.

Table 1. Complete remission by cytogenetic finding and treatment arm

Cytogenetic finding | GO without IL-11^{*} | GO with IL-11^{†} | Total |
---|---|---|---|
Normal | 2/9 (22) | 7/11 (64) | 9/20 (45) |
−5/−7 | 0/7 | 0/8 | 0/15 |
Other abnormal | 0/10 | 2/6 (33) | 2/16 (13) |
Total | 2/26 | 9/25 | 11/51 (22) |


Values are numbers (%) of patients.

^{*}Two of 20 patients with acute myeloid leukemia (AML), neither of 2 with refractory anemia (RA) with excess of blasts in transformation (RAEB-t), and none of 4 with RA with excess blasts (RAEB) had CR.

^{†}Seven of 17 patients with AML, 1 of 4 with RAEB-t, and 1 of 4 with RAEB had CR.
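The randomized comparison of CR rates (9 of 25 with IL-11 versus 2 of 26 without) can be checked with a small, self-contained calculation. The Fisher exact test is the method named in the statistical methods for pairs of binary variables; the interval method for the difference in rates is not stated, so the Wald (normal-approximation) interval used here is an assumption, and the function names are illustrative.

```python
from math import comb, sqrt
from typing import Tuple


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1 = a + b, a + c
    n = a + b + c + d
    denom = comb(n, row1)

    def pmf(k: int) -> float:
        # Hypergeometric probability of k responders in the first row.
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    k_min = max(0, row1 - (n - col1))
    k_max = min(row1, col1)
    p_obs = pmf(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        pmf(k) for k in range(k_min, k_max + 1) if pmf(k) <= p_obs * (1 + 1e-9)
    )


def wald_diff_ci(
    x1: int, n1: int, x2: int, n2: int, z: float = 1.959964
) -> Tuple[float, float]:
    """Normal-approximation 95% CI for the difference p1 - p2 (assumed method)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Rows: GO with IL-11 (9 CR, 16 no CR); GO without IL-11 (2 CR, 24 no CR).
p_value = fisher_exact_two_sided(9, 16, 2, 24)
diff_lower, diff_upper = wald_diff_ci(9, 25, 2, 26)
print(f"Fisher exact P = {p_value:.3f}")
print(f"Difference in CR rates, 95% CI: {diff_lower:.2f} to {diff_upper:.2f}")
```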

Although treatment with GO without IL-11 and treatment with GO with IL-11 were arms of a randomized trial, thus eliminating any possible TEs in evaluating the effect of IL-11, the number of patients randomly assigned was small enough that imbalances in important prognostic covariates could have existed between the 2 arms. To address this possibility, we used logistic regression and a Weibull survival-time regression model to assess the relation between probability of CR or hazard of death and treatment (GO without IL-11 versus GO with IL-11), age, performance status (Zubrod grade 0-2 versus Zubrod grade 3-4), cytogenetic findings (normal versus −5/−7 versus other), AHD, and treatment in a PE room. Results of these analyses indicated that administration of IL-11 was an independent predictor of CR (Table 2) but not of survival (Table 3). These analyses led us to consider GO without IL-11 and GO with IL-11 as separate treatments, with each to be compared with IA as remission-induction regimens. In contrast, GO with or without IL-11 was considered as one regimen for comparison to IA with respect to survival.

Table 2. Predictors of complete remission in the trial of GO with or without IL-11

Variable | Coefficient estimate | Standard error | *P* |
---|---|---|---|
Intercept | −3.025 | 0.815 | — |
Normal (rather than abnormal) karyotype | 2.639 | 0.926 | .004 |
Received interleukin 11 | 2.116 | 0.943 | .025 |


Table 3. Predictors of survival in the trial of GO with or without IL-11

Variable | Coefficient estimate | Standard error | *P* |
---|---|---|---|
Normal karyotype | −1.268 | 0.467 | .007 |
AHD present (rather than absent) | −1.078 | 0.433 | .013 |
Increasing age (numerical) | 0.102 | 0.044 | .019 |
Received interleukin 11 | 0.007 | 0.435 | .99 |


AHD indicates antecedent hematologic disorder.
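For readers who want to relate the coefficients in Table 2 to more familiar quantities, the following sketch converts each logistic regression coefficient into an odds ratio with an approximate 95% interval and recomputes the *P* value. It assumes the reported *P* values are two-sided Wald tests (coefficient divided by standard error), which is not stated in the text; the function name `wald_summary` is illustrative.

```python
from math import erfc, exp, sqrt
from typing import Dict, Tuple

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_summary(coef: float, se: float) -> Dict[str, object]:
    """Wald z, two-sided P, odds ratio, and approximate 95% CI for the OR."""
    z = coef / se
    p_two_sided = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    odds_ratio = exp(coef)
    or_ci: Tuple[float, float] = (
        exp(coef - Z_975 * se),
        exp(coef + Z_975 * se),
    )
    return {"z": z, "p": p_two_sided, "or": odds_ratio, "or_ci": or_ci}


# Coefficient estimates and standard errors copied from Table 2.
TABLE_2 = {
    "Normal karyotype": (2.639, 0.926),
    "Received interleukin 11": (2.116, 0.943),
}

results = {name: wald_summary(c, s) for name, (c, s) in TABLE_2.items()}
for name, r in results.items():
    lo, hi = r["or_ci"]
    print(f"{name}: OR {r['or']:.1f} ({lo:.1f} to {hi:.1f}), P = {r['p']:.3f}")
```

The wide intervals around the odds ratios reflect the small number of remissions (11) on which the model is based.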

The distribution of CD33 positivity was similar in patients treated with GO without IL-11 and those given GO with IL-11 (*P* = .73), with median values of 94% and 90% of blasts expressing CD33 in the 2 groups respectively. Likewise, the distribution of CD33 positivity was similar in patients having and not having CR in the group given GO with IL-11 (*P* = .28, with respective median values of 98% and 88%). The 2 patients achieving CR in the group given GO without IL-11 had CD33 expression on 100% and 99% of blasts; the median value in the nonresponders was 93%. CR rates according to morphologic category (AML versus MDS) and percentage of CD33 positivity (≥ 80% versus < 80%) were as follows: 7 of 29 patients with AML and 80% or greater, 2 of 8 with AML and less than 80%, 2 of 8 with MDS and 80% or greater, and none of 5 with MDS and less than 80%.

In terms of the trial's design, the posterior probabilities that, given the CR rates noted above, GO without IL-11 was worse than historical treatments were 0.999 and 0.96 in patients with AC and NC, respectively. The corresponding value for GO with IL-11 in the AC group was 0.96. The probability that GO with IL-11 would reduce the early mortality rate by 0.15 in the NC group was 0.21. These probabilities led to early closure of both arms of the trial in both the AC and NC groups.

### GO with or without IL-11 compared with IA: a Bayesian sensitivity analysis

The patients given GO with or without IL-11 and those given IA were similar in age (Table 4). Poor performance status was most common in patients given GO without IL-11 and least common in those given GO with IL-11. The patients given IA were least likely to have an AHD or to be treated in PE rooms. Patients given IA had a higher CR rate (15 of 31 [48%]) than patients given GO with or without IL-11 (9 of 25 [36%] and 2 of 26 [8%], respectively). The difference in survival (*P* = .03; Figure 2) reflected deaths beginning about 4 weeks after the start of treatment and could not be accounted for by differences in survival among patients having CR (with survival dated from the date of CR) or among patients declared to have disease resistant to therapy (with survival dated from the date such resistance was recorded).

Table 4. Characteristics and outcomes of patients given GO without IL-11, GO with IL-11, or IA

Variable | GO without IL-11 (n = 26) | GO with IL-11 (n = 25) | IA^{*} (n = 31) |
---|---|---|---|
Median age, y (range) | 72 (65-89) | 71 (65-78) | 72 (65-84) |
Performance status 3-4 | 6 (23) | 1 (4) | 4 (13) |
AML (versus MDS) | 20 (77) | 17 (68) | 24 (77) |
AHD | 18 (69) | 17 (68) | 15 (48) |
Normal cytogenetic findings | 9 (35) | 11 (44) | 12 (39) |
Cytogenetic finding of −5/−7 | 7 (27) | 8 (32) | 8 (26) |
Other cytogenetic abnormalities | 10 (38) | 6 (24) | 11 (35) |
Treated in laminar-airflow room | 17 (65) | 21 (84) | 14 (45) |
Complete remission | 2 (8) | 9 (36) | 15 (48) |
Median survival time, wk (95% CI) | 8 (4-NA) | 15 (5-NA) | 47 (23-116) |


Values are numbers (%) of patients unless otherwise specified.

AML indicates acute myeloid leukemia; MDS, myelodysplastic syndrome; AHD, antecedent hematologic disorder; CI, confidence interval; and NA, not applicable.

^{*}CD33 was positive in all 24 of the 31 patients treated with IA in whom it was assessed (median, 96%; range, 36%-100%).

To begin the Bayesian sensitivity analyses, Table 5 shows the posterior means, SDs, and resultant posterior probabilities that the covariates depicted in the table, including use of GO with or without IL-11 rather than IA, were associated with shorter survival independently of the other covariates in the table. Specifically, after accounting for the prognostic covariates in the table, the posterior probability that the combined treatment effect and TE of GO with or without IL-11 was inferior to IA was 0.995. Because the effect estimated here was the confounded GO-IA trial-treatment effect, however, it is not immediately apparent how much of this effect was due to treatment and how much was a TE.

Table 5. Posterior estimates of covariate effects on survival

Covariate | Posterior mean (SD) | Posterior probability of harmful effect |
---|---|---|
Performance status 3 or 4 | 0.672 (0.241) | 0.996 |
Treatment in laminar-airflow room | −0.969 (0.206) | 0.000 |
Cytogenetic finding of −5/−7 | 1.230 (0.239) | 1.000 |
Other cytogenetic abnormalities | 0.679 (0.234) | 0.998 |
Treatment with GO with or without IL-11 | 0.823 (0.323) | 0.995 |


Table 6 illustrates the method used to estimate how much of this confounded GO effect might be attributed to an effect of treatment. The table is concerned with survival in patients with a normal karyotype. Starting with the posterior mean, 0.823, of the confounded GO effect shown in Table 5, we subtracted the posterior means of the TEs assumed from the FAIG and FAIG plus ATRA trials. Because there were 2 TEs and, as explained earlier, the mean of each could have either a plus or minus sign, there are 4 of these in all, corresponding to the 4 rows in the table. These were a positive and a negative effect based on the 2 FAIG trials (± 0.396) and a positive and a negative effect based on the 2 FAIG plus ATRA trials (± 0.328). Assuming in turn that each of these 4 possible TE distributions was the GO-IA TE, the posterior distribution of the GO-IA treatment effect was derived. From this, the 4 corresponding posterior probabilities, Prob (β_{GO-IA} > 0 given data), that GO with or without IL-11 was inferior to IA were calculated. These posterior probabilities ranged from 0.832 to 0.997 (Table 6, last column), depending on the magnitude and sign of the assumed TE. The GO-IA treatment effect in various patient prognostic subgroups may be computed similarly. For example, by assuming the posterior means noted in Table 5 for −5/−7 or for the other cytogenetic abnormalities, posterior probabilities that GO with or without IL-11 is inferior to IA in patients with these cytogenetic patterns were calculated. Table 7 illustrates that these ranged from 0.9997 to 1.0 in patients with −5/−7 and from 0.988 to 1.0 in patients with other abnormalities.

Table 6. Estimated GO-IA treatment effect on survival in patients with a normal karyotype, by assumed TE

Confounded GO/TE, posterior mean (SD) | Assumed TE, posterior mean (SD) | GO effect, posterior mean (SD) | Posterior probability that GO with or without IL-11 was inferior to IA |
---|---|---|---|
0.823 (0.323) | 0.396 (0.318) | 0.427 (0.444) | 0.832 |
0.823 (0.323) | −0.396 (0.318) | 1.219 (0.444) | 0.997 |
0.823 (0.323) | 0.328 (0.291) | 0.495 (0.444) | 0.868 |
0.823 (0.323) | −0.328 (0.291) | 1.151 (0.444) | 0.995 |
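The arithmetic behind Table 6 can be retraced with a short sketch. It subtracts each assumed TE mean from the confounded posterior mean and then converts the result into Prob (β_{GO-IA} > 0) using a normal approximation to the posterior. That approximation is an assumption made here for illustration; the reported posteriors were computed in BUGS. The SD of 0.444 is taken directly from the third column of the table rather than derived.

```python
from math import erf, sqrt
from typing import List, Tuple


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


CONFOUNDED_MEAN = 0.823  # posterior mean of the confounded GO effect (Table 5)
TREATMENT_SD = 0.444  # posterior SD of the GO effect as reported in Table 6
ASSUMED_TE_MEANS = [0.396, -0.396, 0.328, -0.328]  # FAIG and FAIG plus ATRA TEs

rows: List[Tuple[float, float, float]] = []
for te_mean in ASSUMED_TE_MEANS:
    # Treatment effect = confounded effect minus assumed trial effect.
    treatment_mean = CONFOUNDED_MEAN - te_mean
    # Normal approximation (assumption): Prob(beta > 0) = Phi(mean / SD).
    prob_inferior = normal_cdf(treatment_mean / TREATMENT_SD)
    rows.append((te_mean, treatment_mean, prob_inferior))

for te_mean, treatment_mean, prob_inferior in rows:
    print(f"TE {te_mean:+.3f}: GO effect {treatment_mean:.3f}, "
          f"Prob(inferior) {prob_inferior:.3f}")
```

Under these assumptions the sketch lands within rounding of the last column of the table, which suggests that the normal approximation is adequate for these four cases.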


Table 7. Posterior probability that GO with or without IL-11 was inferior to IA with respect to survival, by cytogenetic group and assumed trial effect

Assumed trial effect, posterior mean (SD) | Normal cytogenetic findings | Cytogenetic finding of −5/−7 | Other cytogenetic abnormalities |
---|---|---|---|
0.396 (0.318) | 0.832 | 0.9997 | 0.988 |
−0.396 (0.318) | 0.997 | 1.000 | 0.999 |
0.328 (0.291) | 0.868 | 0.998 | 0.991 |
−0.328 (0.291) | 0.995 | 1.000 | 1.000 |


The method used to separate treatment effects and TEs for analysis of CR was entirely analogous, with the only exception being that here we also had to account for the IL-11 effect (Table 2). The covariates not related to treatment were the same as those found predictive of survival, although with quantitatively different posterior means and SDs. After accounting for these covariates, the posterior probability that treatment with GO without IL-11 was associated with a lower CR rate than treatment with IA was 0.999, and the corresponding probability for GO with IL-11 was 0.551. Table 8 illustrates the posterior probabilities for GO without IL-11 and GO with IL-11 in the various cytogenetic groups, assuming the same 4 types of TEs as in the survival analysis, although again with quantitatively different posterior means and SDs. In patients with −5/−7, both GO with IL-11 and GO without IL-11 were highly likely to be inferior to IA; the same was true for GO without IL-11 in patients with other abnormalities or a normal karyotype. The only exception to the likely inferiority of GO occurred in patients with a normal karyotype who received GO with IL-11. For this prognostic subgroup, the posterior probabilities ranged from 0.446 to 0.631, indicating that it was about as likely that GO with IL-11 was superior to IA as inferior to it with regard to the probability of achieving CR.

Table 8. Posterior probability that GO without IL-11 (GO − IL-11) or GO with IL-11 (GO + IL-11) was inferior to IA with respect to CR, by cytogenetic group and assumed trial effect

Assumed trial effect, posterior mean (SD) | Normal: GO − IL-11 | Normal: GO + IL-11 | −5/−7: GO − IL-11 | −5/−7: GO + IL-11 | Other abnormal: GO − IL-11 | Other abnormal: GO + IL-11 |
---|---|---|---|---|---|---|
0.233 (0.629) | 0.996 | 0.631 | 1.0 | 0.980 | 0.999 | 0.868 |
−0.233 (0.629) | 0.987 | 0.446 | 1.0 | 0.949 | 0.997 | 0.755 |
0.156 (0.618) | 0.995 | 0.601 | 1.0 | 0.977 | 0.999 | 0.852 |
−0.156 (0.618) | 0.989 | 0.477 | 1.0 | 0.956 | 0.998 | 0.776 |


Thus far, we have assumed that the GO-IA TE was distributed identically to one of those observed with either the 2 FAIG trials or the 2 FAIG plus ATRA trials. However, it is also plausible that the magnitude of the GO-IA TE differed from these previous TEs. Because of this possibility, we performed sensitivity analyses in which the distribution of the assumed GO-IA trial effect was varied over a wider range of possibilities. Figure 3A-C shows the analyses for survival. On the horizontal axis in each figure, we plotted hypothetical values for the mean GO-IA TE. The solid vertical lines correspond to the mean TEs assumed earlier, on the basis of the FAIG and FAIG plus ATRA trials (±0.396 and ±0.328, respectively). The plotted curves are the posterior probabilities that GO with or without IL-11 is harmful relative to IA (vertical axis) as a function of the hypothesized mean GO-IA TE. For example, Figure 3A (normal-karyotype group) shows that the mean TEs 0.396 and 0.328 produced the posterior probabilities of 0.832 and 0.868 noted in Table 7. It can be seen that for GO with or without IL-11 to be superior to IA, corresponding to a posterior probability of less than 0.5, it would be necessary to postulate a mean TE greater than 0.82, which is more than twice the magnitude of that observed in the FAIG or FAIG plus ATRA trials (Figure 3A, dotted lines). Furthermore, this TE must be in the “correct” direction, ie, latent covariates must have led patients given IA to have more, rather than less, favorable prognoses than patients given GO with or without IL-11. The situation becomes even more extreme in the analysis of −5/−7 and other abnormal groups (Figure 3B and 3C, respectively). For GO to be superior in patients with −5/−7, TEs at least 5 times the magnitude of those observed in either the FAIG or FAIG plus ATRA trials must be postulated; for those with other abnormalities, 4-fold greater TEs must be postulated. Of course, these effects would again also have to be in the correct direction, as noted above.
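The shape of such a sensitivity curve can be illustrated with a simple normal approximation. The Python sketch below is not the Bayesian model used in the analysis: it assumes that the treatment effect equals a confounded trial-plus-treatment effect minus the TE, that both are normally distributed, and that their combined SD is fixed. The value 0.82 is the crossing point quoted in the text for the normal-karyotype group; the combined SD of 0.44 is a hypothetical value chosen only so that the curve passes close to the reported probabilities of 0.832 and 0.868.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_go_inferior(mean_te, confounded_mean=0.82, combined_sd=0.44):
    """Approximate P(GO is harmful relative to IA) for a hypothesized mean TE.

    Assumptions (illustrative, not taken from the fitted model):
    - confounded_mean: mean of the confounded trial-plus-treatment effect,
      set to the crossing point 0.82 quoted for the normal-karyotype group;
    - combined_sd: hypothetical SD of (confounded effect - TE), held fixed.
    """
    return normal_cdf((confounded_mean - mean_te) / combined_sd)


if __name__ == "__main__":
    for mean_te in (0.328, 0.396, 0.82):
        ratio = mean_te / 0.396
        print(f"mean TE {mean_te:.3f} ({ratio:.2f} x FAIG TE): "
              f"P(inferior) ~ {prob_go_inferior(mean_te):.3f}")
```

In this approximation the probability falls to 0.5 only when the hypothesized mean TE reaches the confounded mean, which is why a TE of roughly twice the FAIG magnitude (0.82 versus 0.396) is needed before GO and IA look equally likely to be superior.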

## Discussion

Standard 3-plus-7 regimens are not only usually unsuccessful in older patients with newly diagnosed AML but frequently produce considerable toxic effects, including those resulting in early death.2 Because of the recent approval of GO for use in elderly patients with relapsed AML, it would not be surprising if many physicians are tempted to use this agent in older patients with newly diagnosed AML. The principal point of this paper is that such use does not appear advisable. In particular, Tables 7 and 8 indicate that, under reasonable assumptions, the probability is at least 83% that GO without IL-11 is inferior to our IA regimen with regard to both survival and CR, with the 83% pertaining only to patients with a normal karyotype and only if it is assumed that latent covariates (TEs) made such GO-treated patients have a worse prognosis than otherwise apparent. Although addition of IL-11 to GO improved outcome, this combination appeared to have an effect similar to that of IA only with respect to CR in patients with a normal karyotype. Even in these patients, its effect on survival was very likely unfavorable, and indeed this arm of the trial was stopped early in these patients because the early mortality rate was unlikely to be measurably improved. GO with or without IL-11 yielded particularly poor results relative to IA in patients with AC, especially patients with abnormalities of chromosome 5, 7, or both. Although only small numbers of patients with MDS were treated, results in RAEB and RAEB-t paralleled those in AML (Table 1 and Figure 1A legend).

The sensitivity analysis indicated that TEs several times the magnitude of those observed in our previous trials and operating to favor GO would have to be assumed to make it plausible that treatment with IA and treatment with GO with or without IL-11 were equivalent (Figure 3A-C). Of course, it could be argued that there is no reliable method for estimating TEs such as those possibly arising between the trials of IA and of GO with or without IL-11. To postulate such a method is to imply that there is no need to randomize, a position with which we disagree. Indeed, in retrospect, we are not unsympathetic to the view that our randomization should have been between GO and IA rather than between GO with IL-11 and GO without IL-11.

We note that follow-up is short and, in particular, relapses have occurred in only 2 of the 11 patients treated with GO with or without IL-11 who had CR. However, 45 of the 82 patients have died (Figure 2), and this prompted us to report these results. It is noteworthy that our CR rate with GO in elderly patients with newly diagnosed disease is lower than the CR rate reported in elderly patients with relapsed AML.1 We speculate that this may reflect the need for patients with relapse to have had a remission lasting at least 3 months. As a result, they may have had better prognoses than our newly diagnosed patients. Certainly, however, the discrepancy suggests that our results may not be generalizable, as does (with respect to the comparison of GO with and without IL-11) the observation (Table 3), which contrasts with previous observations by us and others, that an AHD is associated with longer survival in the data set for GO with or without IL-11. Finally, we emphasize that assignment to the day 1 and 8 and day 1 and 15 schedules for GO was not random, hindering scientific comparisons of these schedules.

We have no explanation for the apparently poor results with GO. Although there was no relation between CD33 expression and response, our measurements of CD33, although consistent with those available to most physicians, were relatively crude. Thus, it is conceivable that responders and nonresponders to GO differed in the amount of CD33 expressed per cell, a difference that we would have been unable to detect. Nor did we examine the relation between multidrug resistance 1 (MDR1) status and response, although it has been reported that MDR1 expression unfavorably affects the response rate to GO.21 Lastly, we do not know why addition of IL-11 to GO improved the CR rate. Because this effect appeared unrelated to an effect on platelet recovery, it is possible that IL-11 has an antileukemic effect, a possibility that might be investigated in trials involving other drugs.

Despite the uncertainties noted above, we conclude that there is insufficient evidence to warrant use of GO with or without IL-11 in patients 65 years of age or older with newly diagnosed AML, RAEB-t, or RAEB.

We thank Angela Culler for expert secretarial assistance.

| Trial | Treatment combination | Model linear component |
|---|---|---|
| 1 | IDA and ara-C (IA) | μ + Zβ − (τ_{1} + β_{GO-IA})/2 |
| 2 | GO with or without IL-11 | μ + Zβ + (τ_{1} + β_{GO-IA})/2 |
| 3 | FAIG | μ + Zβ − τ_{2}/2 |
| 4 | FAIG | μ + Zβ + τ_{2}/2 |
| 5 | FAIG + ATRA | μ + Zβ − τ_{3}/2 |
| 6 | FAIG + ATRA | μ + Zβ + τ_{3}/2 |


IDA indicates idarubicin; ara-C, cytosine arabinoside; GO, gemtuzumab ozogamicin; IL-11, interleukin 11; FAIG, fludarabine, IA, and granulocyte colony-stimulating factor; ATRA, all-*trans* retinoic acid; τ_{1}, effect due to trial 2 versus trial 1, the unknown GO-IA trial effect; τ_{2}, effect due to trial 4 versus trial 3, the FAIG trial effect; τ_{3}, effect due to trial 6 versus trial 5, the FAIG + ATRA trial effect; β_{GO-IA}, effect due to GO versus IA, the unknown GO-IA treatment effect; Zβ, the overall effect due to patient prognostic covariates; μ, grand mean; and τ_{1} + β_{GO-IA}, confounded GO-IA trial-treatment effect.
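The confounding described in this legend can be made concrete with a small Python sketch of the linear components in the table. The function name `linear_component` and all numerical values are hypothetical and serve only to show that trials 1 and 2 determine the sum τ_{1} + β_{GO-IA}, not its two parts, whereas the paired FAIG trials isolate τ_{2}.

```python
def linear_component(trial, mu, z_beta, tau, beta_go_ia):
    """Linear component for trials 1-6 as laid out in the table above.

    trial      -- trial number, 1 through 6
    mu         -- grand mean
    z_beta     -- overall covariate effect (the scalar value of Z*beta)
    tau        -- dict mapping pair index 1, 2, 3 to the trial effects
    beta_go_ia -- GO versus IA treatment effect (enters trials 1 and 2 only)
    """
    pair = (trial + 1) // 2                   # trials (1,2)->1, (3,4)->2, (5,6)->3
    sign = -1.0 if trial % 2 == 1 else 1.0    # odd trials subtract, even trials add
    effect = tau[pair] + (beta_go_ia if pair == 1 else 0.0)
    return mu + z_beta + sign * effect / 2.0


if __name__ == "__main__":
    # Two hypothetical parameter settings with the same sum tau_1 + beta = 0.8.
    setting_a = dict(mu=1.0, z_beta=0.3, tau={1: 0.2, 2: 0.4, 3: 0.3}, beta_go_ia=0.6)
    setting_b = dict(mu=1.0, z_beta=0.3, tau={1: 0.7, 2: 0.4, 3: 0.3}, beta_go_ia=0.1)
    for trial in (1, 2):
        print(trial,
              linear_component(trial, **setting_a),
              linear_component(trial, **setting_b))
```

Both settings give identical components for trials 1 and 2, so the data from those two trials alone cannot separate the trial effect from the treatment effect; an assumed distribution for τ_{1}, borrowed from τ_{2} or τ_{3}, is what allows the split.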

Supported in part by a grant from Wyeth-Ayerst Pharmaceuticals.

The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 U.S.C. section 1734.

## References

## Author notes

Elihu H. Estey, Department of Leukemia, Box 428, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, TX 77030; e-mail: ehestey@mdanderson.org.
