Characterizing “fitness” in the context of therapeutic decisions for older adults with acute myeloid leukemia (AML) is challenging. Available evidence is strongest in identifying those older adults who are frail at the time of diagnosis by characterizing performance status and comorbidity burden. However, many older adults with adequate performance status and absence of major comorbidity are “vulnerable” and may experience clinical and functional decline when stressed with intensive therapies. More refined assessments are needed to differentiate between fit and vulnerable older adults regardless of chronologic age. Geriatric assessment has been shown to add information to routine oncology assessment and improve risk stratification for older adults with AML. This review highlights available evidence for assessment of “fitness” among older adults diagnosed with AML and discusses future treatment and research implications.
To recognize patient characteristics that influence treatment tolerance for older adults receiving chemotherapy for AML
The majority of patients with acute myeloid leukemia (AML) are age 65 or older, with approximately one-third of patients ≥75 years of age at diagnosis.1 Despite this, the optimal treatment for older patients remains unclear. Compared with middle-aged patients, older adults (typically defined as ≥60 or 65 years) experience shortened survival and increased treatment-associated morbidity.1-3 For example, population-based data from the United States (Surveillance, Epidemiology, and End Results, SEER) highlight the age-related survival disparity. Rates of 5-year survival from time of diagnosis decline from 39% to 8.5% to <2% for people <65, 65–74, and ≥75 years of age, respectively.1 Older adults are more likely to experience treatment-related death than younger patients, ranging from 10% to 30% in many clinical trials.2-4 Chronologic age, however, is a surrogate measure for both changes in tumor biology (conferring treatment resistance) and patient characteristics (affecting treatment tolerance). Both tumor biology and physiologic reserve vary widely among older adults of similar chronologic age, necessitating individualized assessment strategies.
Controversy of intensive therapy for older adults with AML
Concerns regarding efficacy and toxicity of standard treatments have resulted in <40% of older adults receiving chemotherapy for AML in the United States.5 Despite poor outcomes for older adults in aggregate, clinical trial and observational data show that chemotherapy can improve survival for selected patients, even those >80 years of age.4-7 A landmark study comparing intensive induction in a randomized fashion to supportive care demonstrated a small but measurable survival advantage for patients 65 years of age and older.4 Survival has improved over time in both observational data and clinical trials, although the magnitude of improvement declines with increased age.1,5,8-11 Observational data also suggest that the effect of induction therapy on quality of life (QOL) and functional status may be similar among fit older and younger adults.12,13 Less intense therapies are increasingly being used5 and hold promise for the treatment of older adults with AML,14-16 but none have yet been shown to be superior to intensive therapy from the standpoint of efficacy or QOL. It is difficult to compare outcomes between intensive and less intensive strategies directly across clinical trials in part due to inconsistent eligibility characterizations of fit versus unfit older adults.
Ideally, at the pretreatment evaluation, we want to be able to identify which older adults are fit (ie, will tolerate and benefit from treatment in a similar fashion to a middle-aged person), versus vulnerable (ie, at risk for clinical or functional decline during or after treatment that may mitigate some of the treatment benefit) versus frail (ie, will have significant increased complications related to therapy). This would improve patient-centered treatment decision making, provide specific targets for supportive care interventions, and facilitate uniform risk stratification for optimal clinical trial design.
Individualizing patient assessment
Prognostic models have been developed from clinical trial data to improve outcome prediction for older adults (Table 1).17-21 Using algorithms derived from these risk stratification models, estimates of early mortality (16%–71%17), complete remission (CR) (12%–91%18), and 3-year survival (3%–40%19) range widely among older adults who receive intensive induction therapy. A model predicting 8-week induction mortality for patients ≥70 years of age includes age >80 years, complex cytogenetics, Eastern Cooperative Oncology Group (ECOG) performance status >1, and creatinine >1.3 mg/dL.17 Patients with risk factors ranging from none (28%) to 1 (40%), 2 (23%), and ≥3 (9%) had 8-week mortality rates of 16%, 31%, 55%, and 71%, respectively. A model to predict overall survival (OS) identified age, karyotype, NPM1 mutational status, white blood cell count, lactate dehydrogenase levels, and CD34 expression as risk factors and categorized patients into 4 groups, with 3-year OS ranging from 3% to 40%.19 A model predicting remission rates and induction mortality used clinical and laboratory variables (body temperature, age, secondary leukemia or antecedent hematological disease, hemoglobin, platelet count, fibrinogen, and lactate dehydrogenase) and predicted CR rates ranging from 12% to 91%.18 This algorithm has been developed into a web-based application for ease of use. A fourth model derived from >1000 intensively treated patients identified cytogenetic risk group, white blood cell count, secondary AML, performance status, and age as predictors of OS.20
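To make the structure of the 8-week induction mortality model concrete, the following is a minimal Python sketch that counts the four risk factors named above and looks up the mortality rate reported for that count. It is an illustration of the arithmetic only, not a validated clinical tool; the `Patient` container, the field names, and the function names are hypothetical and do not come from the cited study.

```python
from dataclasses import dataclass

# 8-week mortality reported in the text for 0, 1, 2, and >=3 risk factors.
# The key 3 stands for "3 or more".
REPORTED_8_WEEK_MORTALITY = {0: 0.16, 1: 0.31, 2: 0.55, 3: 0.71}


@dataclass
class Patient:
    """Hypothetical container for the four model inputs."""
    age_years: float
    complex_cytogenetics: bool
    ecog_ps: int
    creatinine_mg_dl: float


def count_risk_factors(p: Patient) -> int:
    """Count the risk factors listed in the text (one point each)."""
    if p.age_years < 70:
        # The model was described for patients >=70 years of age.
        raise ValueError("model described only for patients >=70 years")
    return sum([
        p.age_years > 80,
        p.complex_cytogenetics,
        p.ecog_ps > 1,
        p.creatinine_mg_dl > 1.3,
    ])


def reported_8_week_mortality(p: Patient) -> float:
    """Return the mortality rate reported for this patient's risk count."""
    return REPORTED_8_WEEK_MORTALITY[min(count_risk_factors(p), 3)]
```

For example, a hypothetical 82-year-old with ECOG PS 2, normal creatinine, and no complex karyotype has 2 risk factors, which corresponds to the reported 55% 8-week mortality.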
Each of these algorithms provides useful information for improving risk stratification at the time of treatment decision making. Each model, however, primarily explores the heterogeneity of tumor biology and relies on chronologic age as a surrogate for measurable patient-specific factors that also vary among similarly aged individuals (ie, comorbidity, physical function, cognition, psychological state, and nutritional status). Systematic measurement of patient-specific factors can help better discriminate among fit, vulnerable, and frail patients for a given treatment. Identifying those patient-specific factors that most directly influence treatment tolerance in the setting of AML therapy is an active area of research; current evidence is reviewed below.
Performance status strengths and limitations
Oncology performance status (PS) scales such as the ECOG or the Karnofsky Performance Score (KPS) are widely used and are useful in identifying those older adults at highest risk for complications (ie, frail patients) in the context of intensive therapy for AML. Older adults with poor oncology PS at the time of treatment (ie, ECOG PS 3 or 4 regardless of the underlying cause) have a high probability of treatment toxicity and a lower likelihood of benefit. The relationship between ECOG PS at diagnosis, age, and 30-day mortality during intensive induction is dramatic. Trial data from the Southwest Oncology Group (N = 968) show similar 30-day mortality (11%–15%) for patients aged 56-65, 66-75, or >75 with ECOG 0 compared with rates of 29%, 47%, and 82%, respectively, for those with ECOG 3 at the time of treatment.2 Although poor PS at diagnosis is a risk factor for treatment-related complications regardless of age, the magnitude of the negative impact of poor PS increases with age. Not surprisingly, OS also declines with worse PS at the time of diagnosis and treatment. In a study of 998 patients ≥65 years, 1-year survival rates were 35%, 25%, and 7% for adults with ECOG PS score of 1, 2, and ≥3, respectively.3
A key limitation of oncology PS scales is that they are not sensitive enough to differentiate patients with subclinical vulnerability from those who are fit. Physiologic reserve capacity varies widely even among older adults with ECOG 0-1 due in part to subjectivity of the scale. Further refinement is needed to identify vulnerable adults. A single-institution study of older adults treated with intensive therapy identified significant physical impairments among patients with ECOG ≤1: 48% were impaired in activities of daily living (ADLs) and 54% in objectively tested physical performance.22 In another study, patients who reported needing assistance with instrumental activities of daily living at diagnosis had decreased survival, independent of age and KPS.23 These data suggest that simple standardized measures to assess functional status can enhance the pretreatment evaluation for older adults. Further details on functional assessment will be described in the section on geriatric assessment (GA).
Comorbidity is common among older adults with AML and influences treatment administration and tolerance.5 A study using population data (SEER) including >5000 adults diagnosed with AML (median age 78) showed that half had at least one major comorbidity based on claims data.5 Despite this, multisite AML treatment trials do not consistently capture or report comorbidity in a standardized fashion, limiting the evidence base to smaller studies and population-based data. Comorbidity is typically measured using standardized indices to assess burden and severity of diseases. The most commonly used are the Charlson Comorbidity Index (CCI) and the Hematopoietic Cell Transplantation Comorbidity Index (HCT-CI).24,25 Of available studies, most show a relationship between higher comorbidity burden and worse clinical outcomes.
In a retrospective study of 133 patients aged ≥70 years given induction chemotherapy, a CCI score >1 (major comorbidity, 32%) was an independent adverse prognostic factor for CR (35% vs 63%, p = .05).26 The HCT-CI, developed to improve the sensitivity of the CCI in the transplantation setting, has been used in AML studies. Among 177 patients ≥60 years of age who received induction chemotherapy, the HCT-CI score was 0 in 22%, 1–2 in 30%, and ≥3 in 48%, corresponding to early death rates (3%, 11%, and 29%, respectively) and OS (45, 31, and 19 weeks, respectively).27 Two other retrospective studies have shown that higher comorbidity burden (using cutoffs of >1 or >3 on the HCT-CI scale) is independently associated with higher mortality among older adults.28,29 In contrast, a study (N = 92) investigating the predictive utility of either comorbidity index (CCI or HCT-CI) among octogenarians found no association between comorbidity burden and survival.30 Although the prevalence of comorbidity is lower among patients enrolled in intensive induction trials, an HCT-CI score ≥3 was associated with shorter survival among 416 older adults enrolled on ALFA-9803.21 Population-based data (SEER) also show an association between higher comorbidity burden at diagnosis (claims-based CCI) and higher 8-week mortality and lower OS among adults ≥65 years of age treated for AML.5
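The three HCT-CI strata used in the 177-patient induction cohort above (scores of 0, 1–2, and ≥3) can be expressed as a simple lookup. The Python sketch below is illustrative only: it groups a precomputed HCT-CI score and returns the figures quoted in the text for that group. It does not compute the HCT-CI itself, and the function and dictionary names are hypothetical.

```python
# Outcomes quoted in the text for the 177-patient cohort (ref 27),
# keyed by HCT-CI stratum.
REPORTED_OUTCOMES = {
    "0":   {"share": 0.22, "early_death": 0.03, "os_weeks": 45},
    "1-2": {"share": 0.30, "early_death": 0.11, "os_weeks": 31},
    ">=3": {"share": 0.48, "early_death": 0.29, "os_weeks": 19},
}


def hct_ci_stratum(score: int) -> str:
    """Map a precomputed HCT-CI score to the stratum used in the text."""
    if score < 0:
        raise ValueError("HCT-CI score cannot be negative")
    if score == 0:
        return "0"
    if score <= 2:
        return "1-2"
    return ">=3"


def reported_outcomes(score: int) -> dict:
    """Return the cohort-level figures quoted for this score's stratum."""
    return REPORTED_OUTCOMES[hct_ci_stratum(score)]
```

These are cohort-level figures from a single retrospective study, so the lookup describes how the strata were reported rather than predicting the outcome of an individual patient.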
Based on available evidence, screening for major comorbidity as a method for identifying frail older adults should be considered in routine practice. Either the CCI or HCT-CI is a reasonable option in this regard. Many questions related to comorbidity remain unanswered, including how to adjust treatment plans based on comorbidity burden. The prognostic implications of many individual comorbid conditions are still unknown. Consistent inclusion of standardized comorbidity assessment in randomized treatment trials will enhance our understanding of how to tailor therapies to individual older adults.
Accounting for complexity: the case for GA
To adequately assess fitness, we need more sensitive tools and more comprehensive assessment. In considering only performance status and comorbidity, we are missing multiple other measurable characteristics that may influence treatment tolerance directly or indirectly (Figure 1). Equally importantly, these tools need to be simple and time efficient if they are to be used in real-time clinical practice. GA is an approach to the evaluation of multiple patient characteristics (ie, physical function, comorbid disease, cognitive function, psychological state, social support, polypharmacy, nutritional status) to help characterize individual complexity and discriminate among fit, vulnerable, and frail patients. At the most basic level, this is a way of recording and interpreting information collected from a very thorough history and physical in a standardized fashion. In the context of other cancers, GA has been shown to predict chemotherapy toxicity and survival.31-33
GA is feasible to administer among older adults with newly diagnosed AML and detects significant variability in patient characteristics that are not routinely captured by standardized assessments.22 In a prospective single-institution study of adults ≥60 years of age with newly diagnosed AML treated intensively, pretreatment GA detected significant impairments even among those with ECOG 0-1: cognitive impairment, 24%; depression, 26%; distress, 50%; ADL impairment, 34%; impaired physical performance, 31%; and comorbidity using the HCT-CI, 40%.22 Importantly, most patients were impaired in one (92.6%) or more (63%) measured characteristics. The additive effects of multiple impairments may be more important than individual conditions and the implications may differ by treatment intensity.
There is evidence that GA can inform prediction of outcomes for older adults with AML.29,34,35 In fact, current evidence would suggest that chronologic age (at least among those 60-80 years) may not be a robust predictor of outcome after accounting for individual patient characteristics (function, comorbidity, symptoms) measured by GA (Table 2). In a single-institution prospective study of adults ≥60 years of age treated with intensive induction therapy, GA performed at diagnosis was associated with OS.35 In this study (N = 74, median age 68 years), the following characteristics were evaluated using standardized measures: physical function (self-reported and objectively measured), cognitive function, comorbidity, distress, and depressive symptoms. Most participants had a good ECOG PS (78% ECOG ≤1) at study entry. Objectively measured physical function was evaluated using a validated testing battery (Short Physical Performance Battery, SPPB) that includes a timed 4 m walk, chair stands, and balance testing scored from 0 (worst) to 12 (best).36-38 Using established cutoffs for impairment in each measure, impaired physical performance (SPPB <9) and impaired cognition (Modified Mini-Mental State Examination, 3MS, score <77) were independently associated with worse OS after accounting for tumor and clinical characteristics. Patients without impairment in their objectively measured physical function (SPPB ≥9) survived a median 10 months longer than those who were impaired. A similar magnitude of effect was seen for cognitive impairment. Age and ECOG PS score were not independently associated with survival in this study. These data suggest that, among patients considered fit for intensive therapy using standard clinical criteria, measurement of physical performance and cognition may help identify meaningful vulnerability. Efforts to validate this type of assessment in the multisite cooperative group setting are ongoing and will further inform generalizability of GA administration in practice.
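The two cutoffs reported in this study (SPPB <9 and 3MS <77) amount to a pair of threshold checks. The Python sketch below shows that logic under stated assumptions: the SPPB range of 0 to 12 is taken from the text, the 3MS range of 0 to 100 is assumed from the instrument's usual scoring, and the function name and returned keys are hypothetical. It is an illustration, not a validated screening tool.

```python
SPPB_IMPAIRED_BELOW = 9       # cutoff reported in the text (SPPB <9)
THREE_MS_IMPAIRED_BELOW = 77  # cutoff reported in the text (3MS <77)


def ga_impairment_flags(sppb: int, three_ms: int) -> dict:
    """Flag impairment in physical performance and cognition.

    sppb:     Short Physical Performance Battery total, 0 (worst) to 12 (best).
    three_ms: 3MS score, assumed here to range from 0 to 100.
    """
    if not 0 <= sppb <= 12:
        raise ValueError("SPPB must be between 0 and 12")
    if not 0 <= three_ms <= 100:
        raise ValueError("3MS assumed to be between 0 and 100")
    return {
        "impaired_physical_performance": sppb < SPPB_IMPAIRED_BELOW,
        "impaired_cognition": three_ms < THREE_MS_IMPAIRED_BELOW,
    }
```

A hypothetical patient with SPPB 8 and 3MS 90 would be flagged for impaired physical performance only, which is the kind of vulnerability that ECOG PS alone did not capture in this cohort.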
GA has also been investigated among older adults with AML receiving nonintensive therapy. A multisite observational study investigated the predictive value of GA among patients with myelodysplastic syndrome (N = 63) and AML (N = 132) in a mixed treatment population including best supportive care (N = 47), hypomethylating agents (N = 73), and intensive induction (N = 75).34 The GA battery measured physical function by self-report (ADL and instrumental ADL) and objective assessment (timed up and go test), cognition, mood, and QOL (EORTC Quality of Life Questionnaire C30). This study again highlighted significant heterogeneity among measured patient characteristics, with many patients screening positive for impairments that may be underrecognized in clinical practice. As expected in this observational study, patients in the nonintensively treated group were more impaired on the GA measures than those in the intensively treated group. In addition to KPS <80, 2 measures from the GA were independently associated with OS among nonintensively treated patients: requiring assistance with ADLs and high fatigue score from the QOL questionnaire. These 3 variables were used to create a fitness score: 0 (no impairments, low risk), 1-2 (intermediate), and 3 (high risk). OS differed for patients in the low-risk, intermediate-risk, and high-risk groups (median survivals 774, 231, and 51 days, respectively, p < .01). The fitness score was not able to adequately predict survival among those treated intensively (n = 75), suggesting that the characteristics most useful in defining fitness and vulnerability may differ by treatment setting and population studied.
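The 3-variable fitness score described above is a simple count, sketched here in Python for clarity. The text does not give the threshold that defines a high fatigue score, so this sketch takes fatigue as a yes/no input rather than guessing a cutoff; the function names are hypothetical, and the code is an illustration rather than a clinical instrument.

```python
# Median survival (days) reported in the text for nonintensively
# treated patients, by fitness-score risk group.
REPORTED_MEDIAN_SURVIVAL_DAYS = {"low": 774, "intermediate": 231, "high": 51}


def fitness_score(kps: int, needs_adl_assistance: bool, high_fatigue: bool) -> int:
    """Count impairments: KPS <80, ADL assistance, high fatigue score.

    high_fatigue is supplied by the caller because the questionnaire
    cutoff is not stated in the text.
    """
    if not 0 <= kps <= 100:
        raise ValueError("KPS must be between 0 and 100")
    return int(kps < 80) + int(needs_adl_assistance) + int(high_fatigue)


def risk_group(score: int) -> str:
    """Map the score to the groups in the text: 0, 1-2, and 3."""
    if score == 0:
        return "low"
    if score in (1, 2):
        return "intermediate"
    if score == 3:
        return "high"
    raise ValueError("score must be between 0 and 3")
```

Because the score did not predict survival among intensively treated patients in this study, any use of such a lookup would be limited to the nonintensive setting in which it was derived.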
Finally, another single-institution study (N = 101) used registry data to retrospectively reconstruct a GA using information collected from a QOL questionnaire and comorbidity assessment.29 The study population was ≥65 years of age, with only 35% receiving intensive therapy. The investigators used specific questions from the QOL survey that addressed the domains of physical, social, cognitive, psychological function, nutritional status, and pain. In multivariate analysis, higher comorbidity (HCT-CI >1), reported difficulty with strenuous activity, and pain were associated with mortality after controlling for adverse cytogenetics, ECOG PS, and secondary AML. Although this study design lacked sensitivity in the assessment measures (ie, the QOL questionnaire was not designed to screen for cognitive dysfunction), it does suggest that simple targeted questions regarding specific symptoms or physical functioning may help to identify vulnerability.
Overall, the available evidence suggests that we can learn clinically meaningful information to assist in treatment planning by performing additional assessment of multiple patient characteristics. The most promising predictors are measures of physical function (task specific or objectively measured), cognition, and symptoms. Current evidence is limited to relatively small sample sizes, with few patients >80 years of age represented. Although validation is needed, available data can be used to begin to differentiate among fit, vulnerable, and frail patients in the context of AML therapy. Table 3 proposes the use of best available data to suggest risk group stratification based on measured patient characteristics.
Until further evidence is available, it would be reasonable to perform focused assessment of physical function, comorbidity, cognition, and symptoms, particularly among patients with ECOG 0-2 to assess fitness for intensive therapy. Assessment of SPPB, 3MS, and comorbidity by a nurse can be done in 15 minutes. Screening questions addressing fatigue and pain burden can be readily incorporated into usual care. Although shorter screening tools have yet to be validated in AML,39 it would be reasonable to substitute measurement of gait speed alone,40,41 mobility questions (ie, difficulty with strenuous activity, difficulty walking one block32,42), and a shorter cognition screen such as the Blessed Orientation-Memory-Concentration Test (see supplemental materials).32,42,43 This could inform clinical decision making and would require <10 minutes to perform in a clinical setting.
Next steps to move research to practice
Next steps will require validation of practical GA measures in larger multisite trials with uniform treatment approaches. The feasibility of performing GA in cooperative group multisite treatment trials has been tested. Preliminary results show that a primarily self-administered GA that includes a brief cognition screen and a physical performance test administered by a nurse can be done before initiation of induction chemotherapy.39 The median time to complete the entire assessment was 30 minutes, of which 10 minutes required a nurse's time to administer. The study nurses reported no difficulties in administration of the assessments and patient satisfaction with the length was high (82%). Although GA appears feasible in the multisite setting, ongoing studies are needed to validate the most predictive and efficient measures to be used in practice.
Once individual assessment strategies are validated, it is critical that core batteries of uniform measures are used in all elderly-specific treatment trials to allow for cross-study comparisons and to enhance dissemination into clinical practice. Emerging evidence suggests that risk factors predictive of induction outcomes are also predictive of outcomes after bone marrow transplantation,44 further highlighting the need for systematic approaches to patient assessment in trials and practice. Ultimately, understanding specific patient vulnerabilities will help to: (1) predict treatment tolerance and benefit for available therapies, (2) inform novel clinical trial design to target specific patient subgroups and explore the relationship between tumor and patient biology, and (3) identify targets for intervention to improve supportive care during therapy (eg, exercise for physical impairment45).
Conflict-of-interest disclosure: The author declares no competing financial interests. Off-label drug use: None disclosed.
Heidi D. Klepin, MD, MS, Department of Internal Medicine, Section on Hematology and Oncology, Wake Forest School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157; Phone: (336)716-4392; Fax: (336)716-5687; e-mail: firstname.lastname@example.org.