2021 Update Measurable Residual Disease in Acute Myeloid Leukemia: European LeukemiaNet Working Party Consensus Document

Abstract
Measurable residual disease (MRD) is an important biomarker in acute myeloid leukemia (AML) that is used for prognostic, predictive, monitoring, and efficacy-response assessments. The European LeukemiaNet (ELN) MRD working party evaluates standardization and harmonization of MRD in an ongoing manner and has updated the 2018 ELN MRD recommendations based on significant developments in the field. New and revised recommendations were established during in-person and online meetings, and a two-stage Delphi poll was conducted to optimize consensus. All recommendations are graded by levels of evidence and agreement. Major changes include technical specifications for next generation sequencing (NGS)-based MRD testing and integrative assessments of MRD irrespective of technology. Other topics include use of MRD as a prognostic and surrogate endpoint for drug testing; selection of the technique, material, and appropriate time points for MRD assessment; and clinical implications of MRD assessment. In addition to technical recommendations for flow- and molecular-MRD analysis, we provide MRD thresholds and define MRD response, and detail how MRD results should be reported and combined if several techniques are used. MRD assessment in AML is complex and clinically relevant, and standardized approaches to application, interpretation, technical conduct, and reporting are of critical importance.


Introduction
Assessment of measurable residual disease (MRD) in acute myeloid leukemia (AML) is challenging. Several technologies are available for MRD quantification, but the assays and reporting lack standardization and comparability. Still, detection of MRD by any methodology during morphological remission after standard chemotherapy is a strong prognostic factor for subsequent relapse and shorter survival in AML patients. 1 MRD monitoring may have value in guiding post-remission therapy and identifying early relapse and as a surrogate endpoint in clinical trials to accelerate development of novel regimens. MRD assessment in AML has elicited considerable interest from clinicians, patients, regulatory authorities, industry, and researchers, and guidance in harmonization, refinement, and validation of MRD testing is needed.
The goal of the ELN AML MRD expert panel was to update our previous consensus article and provide our latest insights and expert recommendations on different technologies and current clinical uses of MRD. 2 The updated guidelines were written according to consensus achieved using a Delphi poll (methods in Supplemental information and Table S1) and the overall results are summarized in Tables 1a-d. 3

Terminology
Since the 2018 guidelines, 2 we have replaced the term "minimal residual disease" with "measurable residual disease". A "positive" or "negative" MRD test result refers to the detection, or not, of measurable disease above specific thresholds that may vary by assay and by laboratory. Clinicians are advised to clarify the interpretation of individual MRD results with their MRD laboratory colleagues. It is important to recognize that a negative MRD result does NOT necessarily indicate disease eradication but, rather, represents disease below the assay's threshold in the tested sample, and patients may still experience relapse. Also, an MRD assay with a non-zero result may still be called "negative" by a laboratory if the level detected is below the threshold linked to prognosis.

Multiparametric Flow Cytometry (MFC) MRD Testing
Immunophenotyping is an essential, readily available tool for diagnosing AML and is currently the most commonly used MRD detection methodology. Supplemental Table S2 summarizes recent clinical studies incorporating MFC-MRD assessment in AML, including for randomized treatment comparisons 4,5 and MRD-directed therapy. 6,7 Here, we update current best practices (Table 1a). Our consensus recommendations for optimized technical requirements for MFC MRD are described in a separate manuscript (Tettero et al., submitted).

a) "Leukemia-Associated ImmunoPhenotype" (LAIP) and "Different from Normal" (DfN)
The flow cytometry expert panel continues to recommend integration of diagnostic LAIP and DfN aberrant immunophenotype approaches to allow tracking of diagnostic and emergent leukemic clones. Both approaches require expertise in the recognition of aberrant populations and exclusion of potential background as part of assay validation. Ideally, a diagnostic sample is preferred to determine if a patient has diagnostic flow cytometric MRD targets that can be tracked (recommendation A1). Implementation of a common, minimum required set of tubes/fluorochromes is a prerequisite for harmonized MRD detection, analysis, and reporting (recommendation A2). We recommend harmonized use of the integrated diagnostic-LAIP and DfN strategy for MRD detection that incorporates core MRD markers CD34, CD117, CD45, CD33, CD13, CD56, CD7, HLA-DR to assess all samples (recommendation A3). Some investigators favor addition of CD38 whenever possible, as CD38 adds specificity to certain aberrant leukemic immunophenotypes, particularly for the CD34+CD38low/- compartment, when markers such as CD56, CD7 and others like CD45RA designated as leukemic stem cell markers are aberrantly expressed. In cases with a monocytic component, additional markers (e.g., CD64, CD11b, CD4) may also be relevant. 8 The DfN approach detects aberrant clones regardless of immunophenotypic shifts, since it does not rely on the stability of a diagnostic LAIP during treatment, but defines "empty spaces" not occupied by cells within the normal differentiation profiles of bone marrow (BM) or peripheral blood (PB). 9 The panel advises the combined LAIP/DfN approach, but notes that some abnormal immunophenotypes may appear and/or disappear during monitoring, potentially due to transient expression on regenerating non-leukemic progenitors. 10-12 This phenomenon may affect the respective specificities of both LAIP and DfN MRD detection, in particular when the percentages of LAIPs at lower thresholds (e.g. <0.1%) are investigated.
Particular attention should be devoted to evaluating expression of the identified aberrant immunophenotypes in control samples that include regenerating BM (recommendation A4).
When immunophenotypic abnormalities in specific samples may reflect transient features of

The panel strongly recommends submitting the first pull of BM aspirate for MRD analysis, as sample quality is critical for accurate results. 13 The sample should be processed undiluted within 3 days of storage at ambient conditions (recommendation A6). For samples stored at ambient temperature >3 days, the MRD report should make specific note of sample quality and potentially compromised cell viability (recommendation A7).
Sample preparation can be performed using two accepted techniques: 1) bulk lysis, followed by wash/stain/wash; or 2) stain/lyse/wash or no-wash. 9,14 Whichever technique is selected should reliably produce high quality MFC measurements (i.e. optimal cell concentration and no loss of forward or sideward scatter properties) and should be applied consistently across samples.
Basic principles for instrument settings are described elsewhere and we suggest using standard operating procedures developed by international flow cytometry consortia. 15,16 Also, efforts should be made to evaluate sample quality with respect to PB contamination. 17,18 In general, our recommendation is for each laboratory to explore strategies to assess hemodilution that can be incorporated and reported as part of the MRD assay (recommendation A8).

c) Gating strategies and calculations for MFC MRD
MFC-MRD assessment used for clinical decision making should be performed with a qualified assay based on the guidelines for rare events in MFC (recommendation A9). 19-22 Acquisition should collect the highest possible number of relevant events. To ensure the quality of the acquired events, the gating syntax should include Forward Scatter (FSC) versus time and doublet exclusion plots [e.g. FSC-Area vs. FSC-Height] (recommendation A10). Viability can be assessed by the addition of a viability dye or simply by accurate gating based on physical parameters (low FSC vs. low Side Scatter (SSC)). As with the previous guidelines, the recommendation remains that the standard for determining MFC MRD negativity is to acquire >500,000 CD45 expressing cells and at least 100 viable cells in the blast compartment assessed for the best aberrancy(s) available (recommendation A11).
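The acquisition criteria in recommendation A11 can be expressed as a simple check. The following is a minimal sketch, not a validated laboratory tool: the `Acquisition` record and the function name are hypothetical, and it assumes the event counts have already passed the time, doublet, and viability gates described above.

```python
from dataclasses import dataclass

# Thresholds quoted from recommendation A11 in the text.
MIN_CD45_EVENTS = 500_000   # ">500,000 CD45 expressing cells" (strictly greater)
MIN_BLAST_EVENTS = 100      # "at least 100 viable cells in the blast compartment"


@dataclass
class Acquisition:
    """Event counts from one MFC-MRD measurement (hypothetical record layout)."""
    cd45_events: int          # CD45-expressing events after quality gating
    viable_blast_events: int  # viable events in the blast compartment


def supports_mrd_negative_call(acq: Acquisition) -> bool:
    """Return True if enough events were acquired to report MFC-MRD negativity."""
    return (acq.cd45_events > MIN_CD45_EVENTS
            and acq.viable_blast_events >= MIN_BLAST_EVENTS)


if __name__ == "__main__":
    sample = Acquisition(cd45_events=750_000, viable_blast_events=240)
    print(supports_mrd_negative_call(sample))
```

A measurement that fails this check is not evidence of MRD positivity; it only indicates that too few events were collected to support a negative call under A11.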
In order to reliably use flow MRD for clinical decision making, studies of the lower limit of detection (LLOD) and lower limit of quantification (LLOQ) are essential. Thus, the panel recommends that LLOD and LLOQ should be calculated to assess MFC-MRD assay performance (recommendation A12) for each panel combination used. This statement aligns with the advice of regulatory agencies, which emphasizes that reporting MRD negative results without LLOD information is not meaningful. 23

a) Approaches and technical requirements for molecular MRD assessment
There are two approaches to molecular MRD assessment: PCR and next generation sequencing (NGS). 25 The recommendations are summarized in Table 1b. Techniques for molecular MRD assessment should reach a limit of detection (LOD) of 10⁻³ or lower with technically validated assays 26 using quantitative polymerase chain reaction (qPCR), digital PCR (dPCR), or error corrected NGS with unique molecular identifiers (UMIs) (recommendation B1).
The recommended PCR approaches include classical quantitative real time PCR (qPCR) using fluorescent probes and digital PCR (dPCR). The applicability of PCR is limited to the approximately 40-60% of AML cases with one or more targetable abnormalities, including mutated NPM1, RUNX1-RUNX1T1, CBFB-MYH11, PML-RARA, KMT2A-MLLT3, DEK-NUP214, BCR-ABL and WT1. 27 Molecular MRD analysis for NPM1 or fusion genes is usually performed from RNA/cDNA due to the high expression of these genes and thus better sensitivity. 28 Both PB and BM can be used for molecular MRD assessment, though sensitivity may be 5-10 fold lower in PB compared to BM. 29 Either EDTA or heparin can be used as the anticoagulant on samples for molecular MRD analysis (recommendation B2). A potential inhibitory activity of heparin on PCR reactions has been noted and the anticoagulant effect should be assessed during assay validation. 30 To avoid hemodilution, only 5 mL of bone marrow aspirate should be used for molecular MRD assessment, taken from the first pull (or the first pull after repositioning, if the initial pull is used for MFC-based MRD testing) (recommendation B3). BM smears for morphologic assessment (0.2-0.5 mL) should be prepared immediately from a few drops of aspirate from the first pull syringe. If PB is used for molecular MRD, at least 10 mL is required, depending on the white blood cell count and assay characteristics.
The method of cell isolation should be kept consistent as it may alter the leukemic cell percentage (e.g., Ficoll separation of PB to reduce dilution of leukemic cells with normal granulocytes or lysis of whole blood; recommendation B4).

b) qPCR-based molecular MRD assessment
Technical requirements for qPCR are largely unchanged from the 2018 guidelines (see Supplemental Information). 2

c) NGS-based molecular MRD assessment
Targeted NGS-based MRD testing of specific mutations identified at diagnosis and agnostic panel approaches have different strengths and limitations, but both approaches can be considered, depending on sensitivity, turnaround time, resource use, setting (research, clinical trial, clinical routine), and ability to standardize methodology and reporting (recommendation B6). 32 DNA is the standard nucleic acid used for NGS-MRD testing. Prognostic impact has been shown for selected mutations present at diagnosis and/or in CR samples. 33,34 If a panel approach is used, emerging variants not found at diagnosis should be reported only if confidently detected above background noise (recommendation B7).
For the NGS-MRD assessment, the goal should be a read depth that allows clear discrimination of the target from noise (see Supplemental Information). Nucleic acid contamination may be reduced by changing the combinations of multiplex identifiers with target sequences from run to run, and by thorough washing of the sequencer between runs.
Diagnostic samples should not be combined with MRD samples in the same run, as highly abundant mutations increase the risk of contamination. Technical requirements for NGS-MRD testing are further detailed in the Supplement.

d) Selection of MRD markers for NGS-MRD
Diagnostic AML samples are generally screened for mutations using a multi-gene panel. For NGS-MRD, we recommend considering all detected mutations as potential MRD markers, with the limitations detailed below 35 (recommendation B8). This may also apply to NPM1 mutated patients, as NPM1 mutation negative relapse was reported in patients who previously were NPM1 mutation positive 36-38. This might be especially relevant in patients with morphological or clinical signs of recurrent disease, since AML and MDS developing from clonal hematopoiesis has been documented in NPM1-negative patients during follow-up 38,39. In addition, of 150 NPM1 mutated patients in complete molecular remission, 15% had at least one non-DTA (DNMT3A, TET2, ASXL1) mutation that persisted or was acquired at the time of CR assessment and predicted significantly shorter overall survival 40.
Germline mutations (VAF of ~ 50% in genes ANKRD26, CEBPA, DDX41, ETV6, GATA2, RUNX1, TP53) should be excluded as NGS-MRD markers, as they are non-informative for MRD 41 (recommendation B9). DTA mutations can be found in age-related clonal hematopoiesis and should be excluded from MRD analysis (recommendation B10), as mutations associated with clonal hematopoiesis often persist during remission and thus may not represent the leukemic clone. [42][43][44][45][46] If the only detectable mutations are in DTA genes, we recommend using MFC and/or PCR for MRD assessment. Mutations in signaling pathway genes (e.g., FLT3-ITD, FLT3-TKD, KIT, KRAS, NRAS etc.) likely represent residual AML when detected, but are often subclonal and have a low negative predictive value. These mutations are best used in combination with additional MRD markers (recommendation B11). NGS-MRD analysis in patients treated with targeted agents (FLT3 inhibitors, IDH1/IDH2 inhibitors) should include the molecular marker that is targeted, but also others that are present in the sample (recommendation B12). 47,48 A basic set of genes that covers a large proportion of AML patients and therefore may be useful in a panel approach is shown in Supplemental Table S3. Potential cross-sample sequence contamination as a result of pooling samples in NGS-MRD should be bioinformatically evaluated (recommendation B14).
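The marker selection rules in recommendations B8 to B11 can be illustrated as a small triage routine. This is a sketch under stated assumptions rather than a validated pipeline: the `Variant` record, the function names, and the category labels are hypothetical; reading "VAF of ~50%" as a 40-60% window is an assumption made here for illustration only; and the signaling gene set is limited to the examples named in the text, which ends its list with "etc.".

```python
from dataclasses import dataclass

# Gene lists quoted from the text (B9, B10, B11).
GERMLINE_GENES = {"ANKRD26", "CEBPA", "DDX41", "ETV6", "GATA2", "RUNX1", "TP53"}
DTA_GENES = {"DNMT3A", "TET2", "ASXL1"}
SIGNALING_GENES = {"FLT3", "KIT", "KRAS", "NRAS"}  # non-exhaustive examples

# ASSUMPTION: "~50%" is interpreted as 40-60% VAF; the text gives no window.
GERMLINE_VAF_WINDOW = (0.40, 0.60)


@dataclass(frozen=True)
class Variant:
    """One mutation found at diagnosis (hypothetical record layout)."""
    gene: str
    vaf: float  # variant allele frequency as a fraction between 0 and 1


def triage_marker(v: Variant) -> str:
    """Assign a diagnostic mutation to an NGS-MRD marker category."""
    low, high = GERMLINE_VAF_WINDOW
    if v.gene in GERMLINE_GENES and low <= v.vaf <= high:
        return "exclude_germline"      # B9: non-informative for MRD
    if v.gene in DTA_GENES:
        return "exclude_dta"           # B10: clonal hematopoiesis
    if v.gene in SIGNALING_GENES:
        return "combine"               # B11: use with additional markers
    return "candidate"                 # B8: potential MRD marker


def only_dta_markers(variants) -> bool:
    """True if every detected mutation is in a DTA gene (MFC and/or PCR advised)."""
    variants = list(variants)
    return bool(variants) and all(v.gene in DTA_GENES for v in variants)


if __name__ == "__main__":
    found = [Variant("DDX41", 0.49), Variant("TET2", 0.30), Variant("NRAS", 0.05)]
    for v in found:
        print(v.gene, triage_marker(v))
```

A VAF near 50% only suggests germline origin; in this sketch it is used as a flag, and confirmation would still depend on the laboratory's own germline assessment.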

Future Goals
In general, MRD assays, analytical tools, and reporting standards all require standardization and harmonization. Qualification of each MRD approach is essential for clinical decision making, in particular in light of the planned in-vitro diagnostics regulation (IVDR) of the European Union. 49 Inter-laboratory tests are being performed within the ELN for MFC, qPCR-based NPM1 analysis and NGS-MRD, and multicenter initiatives are encouraged. 50 Turnaround time, cost, sensitivity and effects of clonal evolution should be compared between these approaches. The recommended MRD cutoffs of the major MRD technologies should be validated in the ELN risk groups, and the value of alternative cut-points should be evaluated.
In addition, clinical studies should investigate whether MFC and molecular MRD have distinct applications or should be used in combination for optimal impact. High quality flow cytometry data (standardized instrument settings, pre-analytics and measurements) are required for future automated analyses (recommendation C4).

II CLINICAL IMPLEMENTATION
MRD assessment in AML can be used as a (1) prognostic/predictive biomarker to refine risk assessment and inform treatment decision-making; (2) monitoring tool to identify impending relapse; and (3) potential surrogate endpoint for overall survival in clinical trials to accelerate the development of novel treatment strategies (Table 1d).

MRD as prognostic risk factor
MRD should be assessed to refine relapse risk in patients who achieve morphologic remission, with full or partial hematologic recovery (CR/CRi/CRp/CRh) 1 (recommendation D1). MRD positivity in AML patients treated with intensive chemotherapy is associated with inferior outcomes. 1 Preliminary data suggest that MRD positivity after non-intensive induction is also associated with poor outcomes. 60-63

Selecting the technique, material, and appropriate time points for MRD assessment
MFC MRD has been established as a prognostic factor after induction chemotherapy on BM. 64-67 Particularly for longer-term follow-up, MRD monitoring using PB would be beneficial and, according to recent evidence, may be informative; however, further research is needed with regard to its sensitivity and specificity. 56,68-70 Ideally, potential MRD markers should be identified at diagnosis using MFC and molecular

persistence with low copy numbers MP-LCN) is provisionally defined as <2% but above the detection limit of the assay (ratio of the target and housekeeping genes). 80 MRD-LL is associated with a very low relapse risk in NPM1-mutated patients when measured at the end of consolidation chemotherapy (recommendation D14). The optimal dPCR threshold level has not yet been evaluated in sufficiently large patient cohorts. dPCR test positivity (measured on genomic DNA) is provisionally defined as ≥0.2% VAF. The discriminating threshold for dPCR when using cDNA needs further validation.
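The provisional thresholds quoted above can be sketched as follows. This is an illustration only: the function names and category labels are hypothetical, the band "<2% but above the detection limit" is assumed here to correspond to the MRD-LL category named in the text, and the detection limit is assay-specific and must be supplied by the laboratory.

```python
# Provisional thresholds quoted from the text; both are percentages.
QPCR_HIGH_THRESHOLD_PERCENT = 2.0     # target / housekeeping gene copies
DPCR_DNA_POSITIVE_VAF_PERCENT = 0.2   # VAF measured on genomic DNA


def classify_qpcr(ratio_percent: float, detection_limit_percent: float) -> str:
    """Place a qPCR result (target/housekeeping ratio, in %) into a band."""
    if ratio_percent >= QPCR_HIGH_THRESHOLD_PERCENT:
        return "at or above 2%"
    if ratio_percent > detection_limit_percent:
        # ASSUMPTION: this band is the MRD-LL category mentioned in the text.
        return "MRD-LL"
    return "not above detection limit"


def dpcr_dna_positive(vaf_percent: float) -> bool:
    """Provisional dPCR positivity on genomic DNA (>= 0.2% VAF)."""
    return vaf_percent >= DPCR_DNA_POSITIVE_VAF_PERCENT


if __name__ == "__main__":
    # Hypothetical values; 0.01% is an illustrative detection limit.
    print(classify_qpcr(0.5, detection_limit_percent=0.01))
    print(dpcr_dna_positive(0.35))
```

The dPCR threshold applies to genomic DNA only; the text states that the cDNA threshold still needs validation, so no cDNA rule is encoded here.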
The optimal NGS-MRD threshold level that best discriminates subsequent relapse risk has

Integration of multi-modality MRD results
MRD positivity by any methodology is sufficient to suspect poor clinical risk. Available data suggest that patients with one positive and one negative MRD result from two different techniques have a higher relapse risk than patients with two negative MRD results, but a lower relapse risk than patients with two positive MRD results 42,43 (recommendation D18).
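The ordering described in recommendation D18 can be written down as an ordinal rank for two techniques. This is a minimal sketch with hypothetical function names; the rank expresses only the relative ordering reported in the text and is not a validated prognostic score.

```python
def suspect_poor_risk(*results: bool) -> bool:
    """MRD positivity by any methodology is sufficient to suspect poor risk."""
    return any(results)


def relapse_risk_rank(first_positive: bool, second_positive: bool) -> int:
    """Ordinal rank for two MRD techniques.

    0 = both negative (lowest relapse risk of the three groups)
    1 = discordant (one positive, one negative)
    2 = both positive (highest relapse risk of the three groups)
    """
    return int(first_positive) + int(second_positive)


if __name__ == "__main__":
    # Hypothetical patient: MFC negative, molecular MRD positive.
    mfc_positive, molecular_positive = False, True
    print(suspect_poor_risk(mfc_positive, molecular_positive))
    print(relapse_risk_rank(mfc_positive, molecular_positive))
```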
Future studies are needed to integrate the results of multiple MRD assays into one prognostic score.

trials, should report data using the thresholds and response definitions in this manuscript (recommendation D20).

Clinical consequences of MRD assessment
Failure to achieve MRD-negative remission by MFC, molecular MRD-positivity after completion of consolidation chemotherapy, and/or MRD relapse (either molecular or MFC, as

Suggestion for further improvements in clinical implementation
Future studies should evaluate whether MRD assessment is feasible and has prognostic value in patients who achieve a morphologic leukemia free state (MLFS). The prognostic relevance of MRD in non-intensive AML treatment regimens 66

In Figure 2, the time points and MRD cutoffs are indicated at which an MRD result may impact the therapeutic decision for a given patient. For example, in an NPM1 mutated AML patient who is monitored by qPCR, MRD persistence at ≥2% NPM1 mutant copies/ABL1 copies at the end of chemotherapy may trigger the decision to consider alloHCT for this patient.
1 After 2 cycles of chemotherapy (either 2 induction cycles or 1 induction and 1 consolidation cycle); this also includes the time point before alloHCT.
2 Percentage NPM1 mutant copies per ABL1 copies measured in BM.