Abstract

Positive interim positron emission tomography (PET) scans are thought to be associated with inferior outcomes in diffuse large B-cell lymphoma. In the E3404 diffuse large B-cell lymphoma study, PET scans at baseline and after 3 cycles of rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone were centrally reviewed by a single reader. To determine the reproducibility of interim PET interpretation, an expert panel of 3 external nuclear medicine physicians visually scored baseline and interim PET scans independently and were blinded to clinical information. The binary Eastern Cooperative Oncology Group (ECOG) study criteria were based on modifications of the Harmonization Criteria; the London criteria were also applied. Of 38 interim scans, agreement was complete in 68% and 71% by ECOG and London criteria, respectively. The range of PET+ interim scans was 16% to 34% (P = not significant) by reviewer. Moderate consistency of reviews was observed: κ statistic = 0.445 using ECOG criteria, and κ statistic = 0.502 using London criteria. These data, showing only moderate reproducibility among nuclear medicine experts, indicate the need to standardize PET interpretation in research and practice. This trial was registered at www.clinicaltrials.gov as #NCT00274924.

MedscapeCME Continuing Medical Education online

This activity has been planned and implemented in accordance with the Essential Areas and policies of the Accreditation Council for Continuing Medical Education (ACCME) through the joint sponsorship of Medscape, LLC and the American Society of Hematology. Medscape, LLC is accredited by the ACCME to provide continuing medical education for physicians. Medscape, LLC designates this educational activity for a maximum of 0.25 AMA PRA Category 1 credits™. Physicians should only claim credit commensurate with the extent of their participation in the activity. All other clinicians completing this activity will be issued a certificate of participation. To participate in this journal CME activity: (1) review the learning objectives and author disclosures; (2) study the education content; (3) take the post-test and/or complete the evaluation at http://cme.medscape.com/cme/blood; and (4) view/print certificate. For CME questions, see page 918.

Disclosures

The authors, the Associate Editor Martin S. Tallman, and the CME questions author Charles P. Vega, University of California, Irvine, CA, declare no competing interests.

Learning objectives

Upon completion of this activity, participants should be able to:

  1. Identify study procedures in the current research

  2. Specify the interobserver agreement in regard to interim positron emission tomographic (PET) scans in the current study

  3. Describe the current therapeutic approach to diffuse large B-cell lymphoma

  4. List common anatomic sites of disagreement between nuclear medicine specialists in the current study

Introduction

Remarkable predictive accuracy with midtreatment 18F-fluorodeoxyglucose positron emission tomography (PET) scans has been reported in diffuse large B-cell lymphoma (DLBCL), based on the concept that tumor burden above or below the threshold of detection after 1 to 3 chemotherapy cycles results in treatment failure or success.1  Although guidelines for PET interpretation in clinical trials have been issued, their reproducibility has not been studied carefully.2  During conduct of the DLBCL E3404 study, the rate of PET+ interim scans was lower than projected, and we therefore convened an expert panel to blindly review baseline and interim PET scans from approximately the first one-third of participants to assess reproducibility.

Methods

After a baseline PET scan, patients with bulky or advanced DLBCL received 3 cycles of rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone (R-CHOP), followed by a PET scan 14 to 20 days later. While the interim scan underwent central review by a single reader, a fourth cycle of R-CHOP was given; patients then continued R-CHOP if PET− or changed to rituximab, ifosfamide, carboplatin, and etoposide if PET+. Scans were obtained on dedicated high-resolution PET or PET/computed tomography (CT) scanners according to protocol and quality control standards at participating Eastern Cooperative Oncology Group (ECOG) sites. Centralized PET review of baseline and interim scans was performed via file transfer or compact disc with DICOM images. The protocol specified a binary visual interpretation, which the central reviewer based on modifications of the International Harmonization Project criteria, customized for E3404 interim scans and termed the “ECOG criteria”: (1) only sites of abnormality at baseline are evaluated; (2) abnormal activity requires both a focal appearance and intensity greater than average liver; (3) all positive nodal sites must have an anatomic correlate; (4) activity in bone marrow and spleen is considered abnormal only if focal and clearly discernible; (5) symmetric abnormal foci in the mediastinum and hilum are considered abnormal only if the remainder of the scan is positive; and (6) new foci are considered positive only if the remainder of the scan is positive or a new lesion is focal, very intense, and associated with a lesion on CT.2  Scan interpretation was binary; the result could be “positive” or “negative.”

Three external nuclear medicine experts independently applied, without dedicated training, the ECOG study criteria as well as the London criteria to visually score every baseline lesion at midtreatment for the first 38 cases (76 scans) from the E3404 study. Neither the central reviewer nor the experts had access to any clinical information. The London criteria score scans 0 to 3 as “negative” if uptake is less than liver and 4 or 5 as “positive” for uptake that is moderately or markedly increased relative to liver.3  The 245 individual baseline lesions were identified by anatomic site and provided on a worksheet for the external experts, who applied the ECOG and London criteria to each lesion on interim PET. Each case was scored as negative or positive, and agreement among external experts was analyzed by Fleiss κ test to determine a P value for differences in proportion of positive scans. The κ statistic was used to correct for chance in the agreement among the external experts.
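The London scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the rule that a case is positive when any baseline lesion remains positive is our assumption, because the text does not spell out how lesion scores were combined into a case-level result.

```python
from typing import Iterable

# From the text: scores of 0 to 3 are "negative", 4 or 5 are "positive".
LONDON_POSITIVE_THRESHOLD = 4


def lesion_is_positive(score: int) -> bool:
    """Binary read of one lesion scored on the 0-to-5 scale described in the text."""
    if not 0 <= score <= 5:
        raise ValueError(f"score must be between 0 and 5, got {score}")
    return score >= LONDON_POSITIVE_THRESHOLD


def case_is_positive(lesion_scores: Iterable[int]) -> bool:
    """Assumption (not stated in the source): any positive lesion makes the case positive."""
    return any(lesion_is_positive(s) for s in lesion_scores)


# Hypothetical scores for one case with three baseline lesions.
example_scores = [1, 2, 4]
example_result = "positive" if case_is_positive(example_scores) else "negative"
```

With the hypothetical scores shown, the single lesion scored 4 makes the case positive; a case whose lesions all score 3 or lower would be read as negative.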

Results and discussion

The proportions of positive interim scans by reader were 16%, 34%, and 26% (P = .206) for ECOG criteria and 16%, 34%, and 29% (P = .263) for London criteria, with only reader 3 scoring differently between criteria (Figure 1). With 3 experts scoring 38 interim scans (representing 1-25 baseline lesions per case), agreement was complete in 68% of cases for ECOG and 71% for London criteria. The κ statistic was 0.445 using ECOG criteria, indicating only moderate (typical range, 0.4-0.6) agreement per case, and 0.502 for the London criteria, also in the moderate range.
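As a worked check, the reported κ values are consistent with a multi-rater (Fleiss-type) κ computed from case-level counts. The sketch below is not the authors' analysis code. The case counts are reconstructed assumptions: reader percentages are converted to whole cases out of 38 (6, 13, and 10 for ECOG; 6, 13, and 11 for London), and the split of cases by number of positive reads is inferred from the complete-agreement rates and the discordance pattern reported in the text.

```python
def fleiss_kappa_binary(positive_votes, n_raters):
    """Fleiss kappa for a binary rating.

    positive_votes[i] is how many of the n_raters read case i as positive.
    """
    n_cases = len(positive_votes)
    pairs = n_raters * (n_raters - 1)
    # Observed agreement: share of agreeing rater pairs, averaged over cases.
    p_obs = sum(
        (k * (k - 1) + (n_raters - k) * (n_raters - k - 1)) / pairs
        for k in positive_votes
    ) / n_cases
    # Chance agreement from the pooled proportion of positive reads.
    p_pos = sum(positive_votes) / (n_cases * n_raters)
    p_exp = p_pos ** 2 + (1 - p_pos) ** 2
    return (p_obs - p_exp) / (1 - p_exp)


# Reconstructed (assumed) counts of positive reads per case, 38 cases, 3 readers.
# ECOG: 4 cases read positive by all 3, 5 by 2, 7 by 1, and 22 by none.
ecog_votes = [3] * 4 + [2] * 5 + [1] * 7 + [0] * 22
# London: assumes one 2-of-3 ECOG case became unanimous when reader 3 changed.
london_votes = [3] * 5 + [2] * 4 + [1] * 7 + [0] * 22

kappa_ecog = fleiss_kappa_binary(ecog_votes, n_raters=3)
kappa_london = fleiss_kappa_binary(london_votes, n_raters=3)
complete_ecog = sum(v in (0, 3) for v in ecog_votes) / len(ecog_votes)
complete_london = sum(v in (0, 3) for v in london_votes) / len(london_votes)

print(f"ECOG:   kappa = {kappa_ecog:.3f}, complete agreement = {complete_ecog:.0%}")
print(f"London: kappa = {kappa_london:.3f}, complete agreement = {complete_london:.0%}")
```

Under these assumed counts the sketch gives κ of about 0.445 and 0.502, with complete agreement in 26 and 27 of 38 cases. For a binary rating, κ depends only on the number of discordant cases and the total number of positive reads, so the exact split between 2-of-3 and 1-of-3 cases does not change the result.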

Figure 1

Proportion of interim-PET cases interpreted as positive by reader, according to the ECOG and London criteria. Error bar represents 1 SE for the proportion.
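The error bars can be approximated with the usual binomial standard error of a proportion. The sketch below assumes that formula (the caption does not state which was used) and assumes reader counts of 6, 13, and 10 positive cases out of 38, recovered from the rounded ECOG percentages in the text.

```python
import math


def proportion_se(positives: int, n: int) -> float:
    """Binomial standard error sqrt(p * (1 - p) / n) of an observed proportion."""
    p = positives / n
    return math.sqrt(p * (1 - p) / n)


# Assumed counts of positive cases per reader under ECOG criteria (n = 38).
ecog_positive_counts = {"reader 1": 6, "reader 2": 13, "reader 3": 10}

for reader, count in ecog_positive_counts.items():
    se = proportion_se(count, 38)
    print(f"{reader}: {count / 38:.0%} positive, SE = {se:.3f}")
```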

Table 1 details discordance among experts in 12 cases. Reviewer 2 was the most likely to interpret interim PET scans as positive, reviewer 1 the least likely, and reviewer 3 fell in between. In 5 cases, 2 reviewers considered the interim scan to be positive; and in 7 cases, a single reviewer considered the interim scan positive. Using ECOG criteria, there were 26 cases with complete agreement among experts, and these cases were also in complete agreement with the central review. Each expert considered a single case of residual bone disease positive using London criteria but negative by ECOG criteria.

Table 1

Twelve cases of expert reviewer disagreement in interim PET scans

Case no. | Concordant lesions, n | Discordant lesions, n | Reviewer 1 | Reviewer 2 | Reviewer 3 | Consensus resolution
1 |  | 1 (PA) |  |  |  |
2 |  | 1 (Bo) |  |  |  |
3 |  | 1 (Sp) |  |  |  |
4 |  | 1 (PA) |  |  |  |
5 |  | 1 (PA) |  |  |  |
6 |  | 1 (PA) |  |  |  |
7 |  | 1 (Ce) |  |  |  | Y*
8 |  | 1 (Bo) |  |  |  | Y*
9 |  | 1 (IL) |  |  |  |
10 | 15 | 1 (Sc) |  |  |  | Y†
11 |  | 1 (Sp) |  |  |  |
12 |  | 1 (Bo) |  |  |  |

0 indicates negative scan; +, positive scan; PA, para-aortic; Bo, bone; Sp, spleen; Ce, cervical; IL, iliac; and Sc, supraclavicular.

* Consensus “negative.”

† Consensus “positive.”

Among the 12 discordant cases, the number of baseline nodal sites ranged from 0 to 16 (median, 5), and 5 cases had extranodal sites at baseline. A single site of disagreement was observed in each case, including para-aortic nodes (n = 5), bone (n = 3), spleen (n = 2), and 1 each of iliac/inguinal and supraclavicular nodes. A definitive CT correlate was present in 1 case, absent in 8 cases, debatable in 2 cases; CT was not available in 1 case. After independent review, the 12 cases were reviewed together to determine whether consensus could be achieved. There was agreement in 3 cases, with 2 cases becoming negative and the other considered positive (Table 1).

The fact that agreement on midtreatment PET among expert nuclear medicine physicians using standardized criteria was only moderate on a per-case basis has important implications as decisions are being made regarding treatment efficacy in practice as well as in clinical trials. More recently, some investigators have raised concern about the false-positive rate of interim PET in modern DLBCL treatment, which includes rituximab with its long half-life and unique mechanisms of cytotoxicity, use of dose-dense chemotherapy with scans obtained within 2 weeks of treatment, and use of granulocyte colony-stimulating factor.4-6  Indeed, the positive predictive value of interim PET scans appears to be lower in the current therapeutic era (∼ 60%) versus the prior 80% likelihood of failure with chemotherapy alone.7  The predictive value of interim PET+ scans has been positively correlated with the International Prognostic Index and with the International Workshop Criteria for response.8,9  Equivocal or indeterminate dictated reports of interim PET scans, which pose challenges for clinicians, appear to predict treatment success rather than failure.6,8  The literature is inconsistent with regard to the predictive value of PET scans at the conclusion of R-CHOP, suggesting real differences in interpretation.5,6  Lin et al have proposed that changes in standardized uptake value may improve the predictive accuracy of interim FDG-PET.10  In sum, the broader use of interim PET scans in the modern therapeutic era has not reproduced the dichotomous results previously reported, although progression-free survival is generally inferior for interim PET+ patients.11,12 

Using our study criteria, the proportion of PET+ scans was relatively low, and the current report relates solely to the reproducibility of interpretation using standardized criteria. Agreement among external experts would probably have been higher if the study had been preceded by a training exercise using the 2 study criteria, neither of which is well validated (no validated criteria exist for interim PET scans). It is interesting that there was essentially no difference in agreement between the ECOG criteria and the London criteria, which are being applied in a phase 3 Hodgkin lymphoma trial.3  Our results indicate that, among multiple involved sites at diagnosis, single sites, particularly para-aortic, spleen, and bone, were the source of disagreement on interim PET, and CT correlates of residual positive sites were frequently absent or debatable. We conclude that greater harmonization of PET interpretation is indicated for research and practice, and this will require training of nuclear medicine physicians using consistent, validated interpretive criteria and standardized reporting.

An Inside Blood analysis of this article appears at the front of this issue.

Presented in part at the 50th annual meeting of the American Society of Hematology, San Francisco, CA, December 8, 2008.

The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

Acknowledgments

The authors thank John Allen and Patrick Pringle, supported by Stanford University, who provided technical and data management support.

This work was supported by the Eastern Cooperative Oncology Group Research and Foundation. The E3404 clinical trial study was conducted by the Eastern Cooperative Oncology Group (Dr Robert L. Comis, Chair) and supported in part by the National Cancer Institute, National Institutes of Health and the Department of Health and Human Services (Public Health Service grants CA21115, CA23318, CA66636, CA13650, and CA16116). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the National Cancer Institute.

Authorship

Contribution: S.J.H. designed research, collected, analyzed, and interpreted data, and wrote the manuscript; M.E.J., H.S. and G.W. performed research and participated in analysis and interpretation of data and manuscript preparation; A.M. conducted statistical analysis and participated in interpretation of data and manuscript preparation; L.J.S. was the principal investigator of the clinical trial, facilitated central review of PET scans, and participated in manuscript review; R.A. was the coprincipal investigator of the clinical trial and participated in manuscript review; R.G. reviewed diagnostic pathology for the clinical trial and participated in manuscript review; and A.Q. designed and performed research, provided central PET review for the clinical trial, and participated in analysis and interpretation of data and manuscript preparation.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: Sandra J. Horning, 875 Blake Wilbur Dr, Suite 2338, Stanford, CA 94304; e-mail: sandra.horning@stanford.edu.

References

1. Kasamon YL, Jones RJ, Wahl RL. Integrating PET and PET/CT into the risk-adapted therapy of lymphoma. J Nucl Med. 2007;48(suppl 1):19S-27S.
2. Juweid ME, Stroobants S, Hoekstra OS, et al. Use of positron emission tomography for response assessment of lymphoma: consensus of the Imaging Subcommittee of International Harmonization Project in Lymphoma. J Clin Oncol. 2007;25(5):571-578.
3. Radford J, O'Doherty M, Barrington S, et al. Results of the 2nd Planned Interim Analysis of the RAPID Trial (involved field radiotherapy versus no further treatment) in patients with clinical stages 1A and 2A Hodgkin lymphoma and a “negative” FDG-PET scan after 3 cycles ABVD [abstract]. Blood. 2008;112(11):369.
4. Moskowitz C, Hamlin PA, Horwitz SM, et al. Phase II trial of dose-dense R-CHOP followed by risk-adapted consolidation with either ICE or ICE and ASCT, based upon the results of biopsy confirmed abnormal interim restaging PET scan, improves outcome in patients with advanced stage DLBCL. Blood. 2006;108(11):532.
5. Han HS, Escalon MP, Hsiao B, Serafini A, Lossos IS. High incidence of false-positive PET scans in patients with aggressive non-Hodgkin's lymphoma treated with rituximab-containing regimens. Ann Oncol. 2009;20(2):309-318.
6. Cashen A, Dehdashti F, Luo J, Bartlett NL. Poor predictive value of FDG-PET/CT performed after 2 cycles of R-CHOP in patients with diffuse large B-cell lymphoma (DLCL). Blood. 2008;112(11):371.
7. Haioun C, Itti E, Rahmouni A, et al. [18F]fluoro-2-deoxy-D-glucose positron emission tomography (FDG-PET) in aggressive lymphoma: an early prognostic tool for predicting patient outcome. Blood. 2005;106(4):1376-1381.
8. Thomas A, Gingrich R, Smith BJ, Jacobus LS, Habermann TM, Link BK. FDG-PET as predictor of outcome in diffuse large B-cell lymphoma (DLBCL): first analysis of “indeterminate” reports. Proc Am Soc Clin Oncol. 2008;26(suppl 15):8510.
9. Dupuis J, Itti E, Rahmouni A, et al. Response assessment after an inductive CHOP or CHOP-like regimen with or without rituximab in 103 patients with diffuse large B-cell lymphoma: integrating 18fluorodeoxyglucose positron emission tomography to the International Workshop Criteria. Ann Oncol. 2009;20(3):503-507.
10. Lin C, Itti E, Haioun C, et al. Early 18F-FDG PET for prediction of prognosis in patients with diffuse large B-cell lymphoma: SUV-based assessment versus visual analysis. J Nucl Med. 2007;48(10):1626-1632.
11. Mikhaeel NG, Hutchings M, Fields PA, O'Doherty MJ, Timothy AR. FDG-PET after two to three cycles of chemotherapy predicts progression-free and overall survival in high-grade non-Hodgkin lymphoma. Ann Oncol. 2005;16(9):1514-1523.
12. Spaepen K, Stroobants S, Dupont P, et al. Early restaging positron emission tomography with (18)F-fluorodeoxyglucose predicts outcome in patients with aggressive non-Hodgkin's lymphoma. Ann Oncol. 2002;13(9):1356-1363.