Interaction-based analyses generated the RICE score, which is the sum of 3 factors including age, comorbidities, and umbilical cord blood.
The RICE score successfully extracts the population in whom reduced-intensity conditioning reduces the risk of NRM.
Reduced-intensity conditioning (RIC) regimens have long-term outcomes that are generally comparable with those of myeloablative conditioning (MAC) because of a lower risk of nonrelapse mortality (NRM) but a higher risk of relapse. However, it is unclear how we should select the conditioning intensity in individual cases. We propose the risk assessment for the intensity of conditioning regimen in elderly patients (RICE) score. We retrospectively analyzed 6147 recipients aged 50 to 69 years using a Japanese registry database. Based on the interaction analyses, advanced age (≥60 years), hematopoietic cell transplantation–specific comorbidity index (≥2), and umbilical cord blood were used to design a scoring system to predict the difference in an individual patient's risk of NRM between MAC and RIC: the RICE score, which is the sum of the 3 factors. Zero or 1 implies low RICE score and 2 or 3, high RICE score. In multivariate analyses, RIC was significantly associated with a decreased risk of NRM in patients with a high RICE score (training cohort: hazard ratio [HR], 0.73; 95% confidence interval [CI], 0.60-0.90; P = .003; validation cohort: HR, 0.57; 95% CI, 0.43-0.77; P < .001). In contrast, we found no significant differences in NRM between MAC and RIC in patients with a low RICE score (training cohort: HR, 0.99; 95% CI, 0.85-1.15; P = .860; validation cohort: HR, 0.81; 95% CI, 0.66-1.01; P = .061). In summary, a new and simple scoring system, the RICE score, appears to be useful for personalizing the conditioning intensity and could improve transplant outcomes in older patients.
Allogeneic hematopoietic cell transplantation (HCT) is a potentially curative option for various hematologic malignancies. Although traditional myeloablative conditioning (MAC) regimens have high antitumor effects, they limit the application of HCT to patients with advanced age and comorbid conditions because of their considerable toxicity and high nonrelapse mortality (NRM). The development of reduced-intensity conditioning (RIC) has made it possible to offer HCT to patients who cannot undergo a MAC regimen.1-11
In general, previous prospective and retrospective studies suggested that RIC regimens were associated with an increased risk of relapse and a decreased risk of NRM, which resulted in equivalent survival outcomes.12-14 These findings represent the overall average effects of conditioning intensity in all patients, whereas they might vary among patients with different baseline characteristics. For example, the benefit of RIC regimens could be enhanced by multiple factors, such as age or comorbidities, which are associated with NRM.15,16 The randomized controlled trial by Rambaldi et al17 that compared conditioning intensity in patients aged 40 to 65 years with acute myeloid leukemia (AML) demonstrated that RIC regimens reduced NRM, especially for patients with higher HCT-specific comorbidity index (HCT-CI) scores. Our previous study on older patients with Philadelphia chromosome–positive acute lymphoblastic leukemia (ALL) who achieved negative minimal residual disease suggested that RIC regimens are associated with superior overall survival (OS) because of a lower risk of NRM when patients had high HCT-CI scores.18 However, because it is unclear how the conditioning intensity should be selected for each individual patient, this clinical decision currently depends on physician or institutional preference. In this Japanese nationwide retrospective study, which included patients aged 50 to 69 years, for whom physicians often face a dilemma in selecting a conditioning regimen, we propose a novel scoring system to highlight the population in whom RIC reduces the risk of NRM more than MAC does.
Data source and patient selection
Clinical data were obtained from the Transplant Registry Unified Management Program, which is the registry database of the Japan Society for Transplantation and Cellular Therapy.19 All patients provided their signed informed consent locally at the time of HCT. This retrospective cohort study included patients aged from 50 to 69 years with AML or ALL in the first or second complete remissions, or with myelodysplastic syndrome (MDS), who underwent their first allogeneic HCT from HLA-matched related donors (MRD), HLA-matched unrelated donors (MUD), HLA-mismatched unrelated donors (MMUD), or umbilical cord blood (UCB) between 2008 and 2019. Cases with a haploidentical donor HCT using in vivo T-cell depletion or posttransplant cyclophosphamide were excluded because of the limited sample size (n = 469) during the study period. Ultimately, we identified 6147 patients who fulfilled the eligibility criteria.
This study was performed in accordance with the Declaration of Helsinki and was approved by the data management committee of the Japan Society for Transplantation and Cellular Therapy and the institutional review board of Jichi Medical University Saitama Medical Center.
The intensity of the conditioning regimen was determined as either MAC or RIC per consensus of the Center for International Blood and Marrow Transplant Research criteria: MAC regimens were defined by total body irradiation (TBI) of ≥5 Gy (single dose) or ≥8 Gy (fractionated), or a busulfan (Bu) dose of ≥9 mg/kg oral or of ≥7.2 mg/kg intravenously (IV), or melphalan >140 mg/m2 IV, whereas conditioning regimens that did not meet the MAC criteria were classified as RIC.20 Related donors with a 6/6 antigen match of HLA-A, -B, and -DR were considered to be MRD. MUD and MMUD were defined using donor-recipient pairs matched at the allele level at HLA-A, -B, -C, and -DRB1. Disease risk index (DRI) and HCT-CI scores were calculated as previously described.21,22 The use of antithymocyte globulin or alemtuzumab as graft-versus-host disease (GVHD) prophylaxis was considered to be in vivo T-cell depletion.
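The dose thresholds above amount to a simple decision rule, which can be sketched as follows. This is only an illustration and not the registry's actual classification code: the `Conditioning` record, its field names, and `classify_intensity` are hypothetical, and the melphalan threshold is assumed to be expressed in mg/m2.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conditioning:
    """Doses of one regimen (hypothetical layout); None means the agent was not given."""
    tbi_gy: Optional[float] = None               # total body irradiation dose, Gy
    tbi_fractionated: bool = False               # True if TBI was given in fractions
    bu_oral_mg_per_kg: Optional[float] = None    # oral busulfan, mg/kg
    bu_iv_mg_per_kg: Optional[float] = None      # intravenous busulfan, mg/kg
    melphalan_mg_per_m2: Optional[float] = None  # melphalan; unit is an assumption


def classify_intensity(c: Conditioning) -> str:
    """Return "MAC" if any myeloablative threshold is met, otherwise "RIC"."""
    if c.tbi_gy is not None:
        # >=5 Gy as a single dose or >=8 Gy when fractionated
        threshold = 8.0 if c.tbi_fractionated else 5.0
        if c.tbi_gy >= threshold:
            return "MAC"
    if c.bu_oral_mg_per_kg is not None and c.bu_oral_mg_per_kg >= 9.0:
        return "MAC"
    if c.bu_iv_mg_per_kg is not None and c.bu_iv_mg_per_kg >= 7.2:
        return "MAC"
    # Strict inequality: exactly 140 does not meet the MAC threshold
    if c.melphalan_mg_per_m2 is not None and c.melphalan_mg_per_m2 > 140.0:
        return "MAC"
    # Anything that meets no MAC criterion is classified as RIC
    return "RIC"
```

Under this sketch, an IV Bu dose of 12.8 mg/kg is classified as MAC and 6.4 mg/kg as RIC.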
The primary end point was NRM and the secondary end points were OS and relapse. NRM was defined as death without relapse. Gray's method was used to estimate the probabilities of NRM and relapse, and the Fine and Gray method was used to evaluate the impact of the conditioning intensity on NRM and relapse. The competing event was relapse for NRM and death without relapse for relapse. The Kaplan-Meier method and Cox proportional hazards regression models were used to evaluate the impact of the conditioning intensity on OS.
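To illustrate how a cumulative incidence is obtained when a competing event is present, the following is a minimal sketch of the standard nonparametric estimator. It is not the EZR/R implementation used in this study, and it performs neither Gray's test nor Fine and Gray regression. The function name and the event coding (0 = censored, 1 = NRM, 2 = relapse) are assumptions made for the example.

```python
from typing import Sequence


def cumulative_incidence(times: Sequence[float], events: Sequence[int],
                         cause: int, horizon: float) -> float:
    """Cumulative incidence of `cause` by `horizon` with competing events.

    `events` uses hypothetical codes: 0 = censored, 1 = NRM, 2 = relapse.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    event_free = 1.0   # probability of being free of any event just before t
    cif = 0.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        j = i
        d_any = 0      # events of any cause at time t
        d_cause = 0    # events of the cause of interest at time t
        while j < len(data) and data[j][0] == t:
            if data[j][1] != 0:
                d_any += 1
            if data[j][1] == cause:
                d_cause += 1
            j += 1
        # Increment: P(event-free until t) x cause-specific hazard at t
        cif += event_free * d_cause / n_at_risk
        event_free *= 1.0 - d_any / n_at_risk
        n_at_risk -= j - i   # remove events and censored patients at t
        i = j
    return cif
```

Unlike one minus the Kaplan-Meier estimate with competing events censored, this estimator does not treat patients who relapsed as still at risk of NRM, so the cumulative incidences of NRM and relapse and the event-free probability sum to 1.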
The risk assessment for the intensity of the conditioning regimen in elderly patients (RICE) score was developed as follows: first, to generate and validate the RICE score, we randomly divided the cohort with complete information of all covariates into training and validation cohorts in a 2:1 ratio23; second, we selected the potential covariates that interacted with the conditioning intensity (MAC vs RIC) on NRM in the training cohort and established the RICE score. In this process, we included each interaction term 1 at a time in the multivariate models and identified significant interaction terms based on the P value (≤ .10). To assign the weights for the RICE score, in the final model, we included all selected interaction terms in addition to recipient’s age at HCT and HCT-CI, which are theoretically important interaction terms.17,18 Weights for the RICE score were determined by the hazard ratio (HR) of each interaction term. Finally, we compared the MAC and RIC regimens stratified according to low and high RICE scores in the validation cohort.
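The interaction screening described above can be written out in generic notation. This is a sketch and not the exact specification fitted in this study; the symbols are introduced here for illustration only. Let C = 1 for RIC and C = 0 for MAC, let X be a binary covariate (for example, age ≥60 years), let W denote the remaining covariates, and let λ be the subdistribution hazard of NRM:

```latex
\lambda(t \mid C, X, \mathbf{W})
  = \lambda_0(t)\,
    \exp\!\bigl(\beta_C C + \beta_X X + \beta_{CX}\, C X
                + \boldsymbol{\gamma}^{\top}\mathbf{W}\bigr)

\mathrm{HR}_{\mathrm{RIC\ vs\ MAC}}(X = 0) = e^{\beta_C},
\qquad
\mathrm{HR}_{\mathrm{RIC\ vs\ MAC}}(X = 1) = e^{\beta_C + \beta_{CX}}
```

Under this formulation, the HR of the interaction term is exp(β_CX). A value below 1 means that the relative reduction in NRM with RIC is larger when the factor X is present than when it is absent.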
In addition to the conditioning intensity (MAC vs RIC), the following covariates were included in the multivariate analyses: recipient’s age at HCT (<60 vs ≥60 years), sex mismatch (female to male vs others), disease (AML vs ALL vs MDS), DRI (low or intermediate risk vs high risk), Karnofsky performance status (KPS) (≤80% vs >80%), HCT-CI (<2 vs ≥2), donor source (MRD vs MUD vs MMUD vs UCB), GVHD prophylaxis (cyclosporine vs tacrolimus based), in vivo T-cell depletion (no vs yes), and the year of receiving HCT. When we determined the interaction between donor source (bone marrow vs peripheral blood) and conditioning intensity, UCB was excluded from the model. Although other thresholds for recipient’s age at HCT (55 and 65 years) were examined, a cutoff of 60 years was selected because it yielded the smallest interaction P value.
Two-sided P values ≤ .05 were considered to reflect statistical significance. All statistical analyses were performed with EZR version 1.53 (Saitama Medical Center, Jichi Medical University), which is a graphical user interface for R (The R Foundation for Statistical Computing, version 3.2.2, Vienna, Austria).24
Establishment of the RICE score using the training cohort
In the training cohort, 2223 (54.3%) and 1873 (45.7%) patients received MAC and RIC regimens, respectively (Table 1). The median age at HCT was 57 years (range, 50-69 years) and 61 years (range, 50-69 years) in MAC and RIC, respectively (P < .001). Of 2223 and 1873 patients who received MAC and RIC, respectively, 298 (13.4%) and 206 (11.0%) were relatively younger and older (aged 50-54 and 65-69), respectively. Disease, HCT-CI, donor type, donor source, GVHD prophylaxis, and the year of receiving HCT were significantly different between the MAC and RIC groups. A fludarabine/Bu-based regimen was the most common in the MAC group, followed by cyclophosphamide/TBI-based and cyclophosphamide/Bu-based regimens. The most common RIC regimen was fludarabine/Bu based or fludarabine/melphalan based. The median observation period of the survivors was 3.3 and 3.6 years in the MAC and RIC groups, respectively.
The cumulative incidence of NRM at 4 years was 28.9% (95% confidence interval [CI], 26.9-31.0) and 28.5% (95% CI, 26.3-30.7) in the MAC and RIC groups, respectively (P = .654). In a multivariate analysis, RIC had marginally lower NRM compared with MAC, but this difference was not statistically significant (HR, 0.90; 95% CI, 0.79-1.01; P = .080) (Table 2). Age, sex mismatch, disease, KPS, HCT-CI, donor type, GVHD prophylaxis, and the year of receiving HCT significantly affected the risk of NRM.
Next, we determined the interaction between conditioning intensity and each covariate. An interaction term was included 1 at a time in a multivariate model and we identified UCB (P = .087) as a significant interaction term (Table 2). Because adding UCB to the multivariate model, which consisted of the 2 predetermined theoretically important interaction terms including recipient’s age at HCT and HCT-CI, decreased the Akaike information criterion from 18 101.23 to 18 101.1, these 3 factors were used to design a final model, the RICE score, to predict the difference in an individual patient's risk of NRM between MAC and RIC. We assigned the weights for the RICE score based on the final model. The HRs of the interaction terms were 0.83 (95% CI, 0.64-1.07; P = .150) for UCB, 0.85 (95% CI, 0.67-1.10; P = .210) for HCT-CI, and 0.88 (95% CI, 0.69-1.12; P = .300) for age. Because these HRs were similar, each factor was given an equal weight of 1, and the RICE score is calculated by summing the number of factors present at HCT (age [≥60 years], HCT-CI [≥2], and UCB). Based on their number of factors, patients were then assigned to either of 2 groups: 0 or 1 implied low RICE score and 2 or 3, high RICE score.
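The scoring rule above can be expressed in a few lines. This is an illustrative sketch only and not a clinical decision tool; the function and parameter names are hypothetical.

```python
def rice_score(age_years: int, hct_ci: int, donor_is_ucb: bool) -> int:
    """Sum of the 3 RICE factors, each counted as 1 point when present."""
    return (
        int(age_years >= 60)   # advanced age at HCT
        + int(hct_ci >= 2)     # HCT-specific comorbidity index
        + int(donor_is_ucb)    # umbilical cord blood as donor source
    )


def rice_group(score: int) -> str:
    """Map a RICE score to its group: 0 or 1 is low, 2 or 3 is high."""
    if not 0 <= score <= 3:
        raise ValueError("RICE score must be between 0 and 3")
    return "high" if score >= 2 else "low"
```

For example, a 62-year-old patient with an HCT-CI of 3 who receives an HLA-matched related donor graft has a score of 2 (high), whereas a 59-year-old patient with an HCT-CI of 1 who receives UCB has a score of 1 (low).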
Testing the RICE score
In the validation cohort, 1098 (53.5%) and 953 (46.5%) patients received MAC and RIC regimens, respectively (Table 1). The median age at HCT was 58 years (range, 50-69 years) and 60 years (range, 50-69 years) in MAC and RIC (P < .001), respectively. Disease, KPS, donor type, donor source, and the year of receiving HCT were significantly different between the MAC and RIC groups. The median observation period of the survivors was 3.2 and 3.4 years in the MAC and RIC groups, respectively. There were no differences in patient characteristics between the training and validation groups (supplemental Table 1).
In total, 1307 (31.9%) and 642 (31.3%) patients were considered to have a high RICE score in the training and validation cohorts, respectively (supplemental Table 2). In the training cohort, the cumulative incidence of NRM at 4 years in patients with a low RICE score was 26.5% (95% CI, 24.3-28.8) vs 27.9% (95% CI, 25.1-30.8) (P = .442), and NRM in patients with a high RICE score was 36.5% (95% CI, 32.1-40.9) vs 29.4% (95% CI, 25.9-33.0) (P = .005) between the MAC and RIC groups (Figure 1A-B). Similarly, in the validation cohort, the cumulative incidence of NRM at 4 years in patients with a low RICE score was 28.8% (95% CI, 25.4-32.2) vs 25.6% (95% CI, 21.9-29.5) (P = .194), and NRM in those with a high RICE score was 39.5% (95% CI, 33.1-45.8) vs 26.7% (95% CI, 22.0-31.5) (P = .001) between the MAC and RIC groups (Figure 1C-D). In analyses limited to patients with a RICE score of 1, RIC did not reduce the risk of NRM compared with MAC in either the training or the validation cohort (supplemental Figure 1).
There were no significant differences in relapse between the MAC and RIC groups regardless of the RICE score (supplemental Figure 2). In the training cohort, the 4-year OS in patients with a low RICE score was 54.6% (95% CI, 51.9-57.2) vs 53.3% (95% CI, 50.0-56.4) (P = .580), and OS in those with a high RICE score was 41.1% (95% CI, 36.5-45.7) vs 48.9% (95% CI, 44.9-52.8) (P < .001) between the MAC and RIC groups (Figure 2A-B). In the validation cohort, the 4-year OS in patients with a low RICE score was 56.3% (95% CI, 52.5-55.8) vs 55.8% (95% CI, 51.3-60.1) (P = .974), and OS in those with a high RICE score was 42.0% (95% CI, 35.3-48.4) vs 50.4% (95% CI, 44.6-55.8) (P = .031) between the MAC and RIC groups (Figure 2C-D).
In multivariate analyses, RIC was significantly associated with a decreased risk of NRM in patients with a high RICE score (training cohort: HR, 0.73; 95% CI, 0.60-0.90; P = .003; validation cohort: HR, 0.57; 95% CI, 0.43-0.77; P < .001), but no significant differences were observed in patients with a low RICE score (training cohort: HR, 0.99; 95% CI, 0.85-1.15; P = .860; validation cohort: HR, 0.81; 95% CI, 0.66-1.01; P = .061) (Table 3). In contrast, RIC was significantly associated with an increased risk of relapse in patients with a high RICE score only in the validation cohort (HR, 1.47; 95% CI, 1.02-2.12; P = .037) (Table 3). Similar to NRM, RIC was significantly associated with a superior OS in patients with a high RICE score (training cohort: HR, 0.76; 95% CI, 0.65-0.90; P = .001; validation cohort: HR, 0.79; 95% CI, 0.63-0.99; P = .048), but no significant differences were seen in patients with a low RICE score (training cohort: HR, 0.96; 95% CI, 0.85-1.08; P = .457; validation cohort: HR, 0.93; 95% CI, 0.79-1.11; P = .422) (Table 3).
Next, the training cohort and validation cohort were combined to validate the robustness of the RICE score in alternative cohorts. In the multivariate analysis limited to either AML (n = 2818), ALL (n = 1140), or MDS (n = 2189), RIC was significantly associated with a lower risk of NRM compared with MAC in patients with a high RICE score (AML: HR, 0.73; 95% CI, 0.56-0.94; P = .016; ALL: HR, 0.59; 95% CI, 0.38-0.90; P = .015; MDS: HR, 0.67; 95% CI, 0.52-0.87; P = .002), but NRM in patients with a low RICE score was not different between MAC and RIC (AML: HR, 0.99; 95% CI, 0.82-1.20; P = .910; ALL: HR, 0.84; 95% CI, 0.62-1.12; P = .230; MDS: HR, 0.93; 95% CI, 0.76-1.13; P = .470) (supplemental Figures 3-5). In the multivariate analysis limited to a fludarabine/Bu-based regimen, which is a common conditioning in the MAC and RIC groups (n = 3183, median Bu dose in the MAC and RIC was 12.8 and 6.4 mg/kg, respectively), RIC was also related to a lower NRM in patients with a high RICE score (HR, 0.62; 95% CI, 0.49-0.80; P < .001) but not in patients with a low RICE score (HR, 0.85; 95% CI, 0.71-1.02; P = .080) (supplemental Figure 6).
In addition, we evaluated the cause of nonrelapse death stratified by the RICE score. Regardless of the RICE score, the profiles of the cause of nonrelapse death were not significantly different between MAC and RIC (low RICE score: P = .444; high RICE score: P = .226) (supplemental Table 3).
Although the use of RIC has increased dramatically over the past few decades in patients who are older, frail, or have comorbidities, and even in those who are considered able to tolerate MAC, optimization of the conditioning intensity remains challenging.12-14 Several prospective randomized trials have compared MAC with RIC in patients with AML or MDS.6,17,25-28 However, because of limited sample sizes, the subgroup analyses in these trials did not have enough power to detect effect modification based on patient characteristics such as age, performance status, or comorbidities. In contrast, retrospective observational studies comparing MAC with RIC are subject to potential selection bias. Therefore, to minimize selection bias, we included only patients aged 50 to 69 years, who are possible candidates for both MAC and RIC regimens in Japan. As a result, this study generated the RICE score, which consists of 3 factors: advanced age (≥60 years), HCT-CI (≥2), and UCB. Finally, we found that RIC was associated with a decreased risk of NRM compared with MAC only when patients had a high RICE score (≥2), which was confirmed in the validation cohort.
No significant differences in NRM were seen among patients with a RICE score of 1. In other words, RIC is considered to be a preferable regimen only if 2 or 3 of the factors are present together, indicating that a clinical decision should not be made solely based on a single factor. This is a reasonable conclusion because experienced physicians usually select the conditioning intensity by taking into consideration multiple factors. Indeed, several studies have recommended that clinical decisions should not be made solely based on the recipient’s age.1,3,29 The important point of this study is that objective statistical analyses yielded an alternative to selecting the conditioning intensity entirely on the basis of the transplantation physician's insight, and that this alternative is simple and easy for all clinicians to use.
Our scoring system was made to detect the difference in NRM between MAC and RIC but was not designed to predict a difference in relapse. The prospective Blood and Marrow Transplant Clinical Trials Network (BMT-CTN) study showed that MAC was associated with better survival in patients with a high risk of relapse and without comorbidities,27 and the Center for International Blood and Marrow Transplant Research study also suggested that the conditioning intensity had different effects on transplant survival stratified by DRI.30 Because many studies have suggested that MAC is associated with less relapse,7,8,31-40 selection of the conditioning intensity should be balanced between the risk of NRM and relapse. For example, widespread adoption of minimal residual disease testing including polymerase chain reaction, flow cytometry, and next-generation sequencing in various hematologic disorders has dramatically changed clinical practice regarding predicting relapse and monitoring the therapeutic efficacy.41-44 Hourigan et al45 previously demonstrated that MAC rather than RIC in patients with positive results of minimal residual disease before HCT results in lower relapse and improved survival. However, given the absence of standardization and different sensitivity for each minimal residual disease detection technique, it is currently difficult to include and analyze minimal residual disease data in multicenter registry studies. Although future studies need to make a scoring system that takes into consideration the risk of relapse such as minimal residual disease and/or molecular profiles of tumor cells, we believe that the current scoring system should contribute to clinical decisions regarding the selection of conditioning regimens.
This study has other limitations. First, although we roughly divided conditioning regimens into MAC and RIC based on the conventional criteria,20 there should be some difference in regimen toxicity even within the same category of MAC or RIC. In addition, different toxicity profiles of each conditioning agent or TBI might require additional assessments per individual conditioning regimen to personalize clinical decisions.46 Second, the interaction terms between the conditioning intensity and the 3 covariates were not statistically significant in the final model because interaction analyses often lack sensitivity. However, despite the lack of enough power to detect interaction, the RICE score was able to extract the population in whom RIC reduces the risk of NRM. Third, we could not include patients who underwent haploidentical HCT using in vivo T-cell depletion or posttransplant cyclophosphamide because of the limited sample size. Instead, HCT from UCB, which is a less common donor source in Western countries, is increasing in Japan and accounted for ∼30% of the cases in this cohort.9 In addition to donor selection, the Japanese population differs in several patient baseline characteristics, such as a lower rate of comorbidities and greater genetic homogeneity.47-50 Therefore, the RICE score should be validated in independent cohorts, or similar scoring tools need to be made using the same methodology for different regions. Nevertheless, we believe the current interaction analysis–based scoring system can promote the strategy used to select conditioning regimens.
In summary, we have developed the RICE score, which could identify patients who could be expected to receive significant benefits regarding NRM if they underwent HCT with RIC. This simple and validated scoring system may help us to choose appropriate conditioning regimens and improve transplant outcomes.
The authors greatly appreciate the contributions of many physicians and data managers throughout the Japan Society for Transplantation and Cellular Therapy (JSTCT), the Japan Marrow Donor Program (JMDP), and the Japan Cord Blood Bank Network (JCBBN), who made this analysis possible. The authors thank the members of the Transplant Registry Unified Management committees at JSTCT, JMDP, and JCBBN for their dedicated management of data. Y. Akahoshi is a recipient of the Japan Society for the Promotion of Science Postdoctoral Fellowship for Research Abroad.
Contribution: Y. Akahoshi designed the study, analyzed the data, and wrote the manuscript; Y.T., E.S., M.K., and H.N. reviewed and revised the manuscript; N.D., N.U., M.T., M.S., Y. Katayama, K.-i.M., and Y.O. provided important clinical data; T.F., M.O., J.K., and Y. Atsuta collected the patient data; Y. Kanda provided important clinical data, advised statistical methods, and reviewed and revised the manuscript; and all authors contributed to the writing of the report and approved the final version of the article.
Conflict-of-interest disclosure: The authors declare no competing financial interests.
Correspondence: Yu Akahoshi, Division of Hematology, Jichi Medical University Saitama Medical Center, 1-847 Amanuma, Omiya-ku, Saitama-city, Saitama 330-8503, Japan; e-mail: email@example.com.
Data are available on request from the corresponding author, Yu Akahoshi (firstname.lastname@example.org).
The full-text version of this article contains a data supplement.