The prognosis of follicular lymphomas (FL) is heterogeneous and numerous treatments may be proposed. A validated prognostic index (PI) would help in evaluating and choosing these treatments. Characteristics at diagnosis were collected from 4167 patients with FL diagnosed between 1985 and 1992. Univariate and multivariate analyses were used to propose a PI. This index was then tested on 919 patients. Five adverse prognostic factors were selected: age (> 60 years vs ≤ 60 years), Ann Arbor stage (III-IV vs I-II), hemoglobin level (< 120 g/L vs ≥ 120 g/L), number of nodal areas (> 4 vs ≤ 4), and serum LDH level (above normal vs normal or below). Three risk groups were defined: low risk (0-1 adverse factor, 36% of patients), intermediate risk (2 factors, 37% of patients, hazard ratio [HR] of 2.3), and poor risk (≥ 3 adverse factors, 27% of patients, HR = 4.3). This Follicular Lymphoma International Prognostic Index (FLIPI) appeared more discriminant than the International Prognostic Index proposed for aggressive non-Hodgkin lymphomas. Results were very similar in the confirmation group. The FLIPI may be used for improving treatment choices, comparing clinical trials, and designing studies to evaluate new treatments.
Follicular lymphomas (FLs) account for one third of non-Hodgkin lymphomas (NHLs) in adults. The course of the disease is usually characterized by a response to initial treatment, followed by relapses, sometimes associated with histologic transformation into high-grade NHL.1 From “watchful waiting” to high-dose therapy, numerous treatment options have been proposed for patients with FL. However, there is no consensus on any of these approaches. Agreement on a treatment algorithm for patients with FL would be made easier by a simple, validated, and accurate prognostic index similar to the International Prognostic Index proposed for aggressive NHLs in 1993.2 In retrospective analyses of series of patients with FL or indolent NHLs, several characteristics were associated with a poor clinical outcome, such as advanced age,3-10 male sex,4,7,11,12 disseminated disease according to the Ann Arbor classification,1,4 a high number of nodal3,7 and/or extranodal involvement sites,12 presence of bulky tumor(s),6,8 increased serum lactate dehydrogenase (LDH)13 and/or β2 microglobulin14 levels, poor performance status,5,6 and a low hemoglobin level.4 From these analyses, a few prognostic indices have been proposed,3,4,9,12 but none of them has been validated and/or widely used. Several retrospective analyses have also suggested that the International Prognostic Index (IPI), initially designed for aggressive NHLs, could also be used in indolent NHLs.15-20 However, some important prognostic factors may have been missed since the IPI was not designed to investigate prognostic factors in FL. Moreover, when using the IPI, very few patients (around 10%-15%) with FL are classified in the poor-risk category. Because of this, the IPI is not appropriate for identifying patients in whom intensive therapy should be tested. An international cooperative study was thus designed to collect data on the initial characteristics of a large number of patients with FL and to propose a prognostic index for FL.
This cooperative study culminated in a proposal for a Follicular Lymphoma International Prognostic Index (FLIPI).
Patients and methods
The following inclusion criteria were used: (1) Follicular lymphoma according to the Working Formulation for Clinical Usage21 and/or the Kiel classification,22 which were in use during the period of inclusion. All cell types (small-cell, mixed, or large-cell FL) could be included in the study. No central pathology review was performed. (2) Initial diagnosis between January 1, 1985, and December 31, 1992. (3) Staging procedures including at least a CT scan of the thorax, abdomen, and pelvis, or lymphangiography plus abdominal and pelvic echography, bone marrow biopsy, routine blood counts, and biochemistry tests. (4) Follow-up until death, or for at least 5 years for surviving patients. The FLIPI was a retrospective study that relied on patients included in several trials conducted according to the legal guidelines in force in each country at the time of study. Consent for this study was part of the informed consent given for these trials. The study was approved by the French Committee for the Use of Computerized Medical Data.
Demographic characteristics and initial staging. Nodal areas considered were cervical, axillary, inguino-crural, para-aortic and/or iliac, celiac and/or mesenteric, and other ancillary nodal sites. Each area involved, either clinically or on CT scan, was scored as 1 (2 if bilateral), and each patient had between 0 and 8 or more involved areas (Figure 1). All extranodal areas were taken into account. In the absence of any agreement on a threshold, it was not possible to define a bulky tumor. As in the International Prognostic Index for aggressive NHLs,2 the spleen was considered as an extranodal site.
Clinical and biologic characteristics. The following clinical and biologic characteristics were related to disease extension and/or tumor bulk: cell type, Ann Arbor stage, serum LDH, and β2 microglobulin levels (expressed as the ratio of the measured value to the upper limit of normal for the center). The following clinical and biologic characteristics were related to the effects of FL on the host: performance status according to the Eastern Cooperative Oncology Group scale, presence or absence of any B symptoms, anemia, lymphocytopenia, decreased serum albumin level, increased erythrocyte sedimentation rate (ESR), thrombocytopenia.
Overall survival was the end point of all statistical analyses. Survival rates and corresponding standard errors were estimated using the Kaplan-Meier estimator.23 Survival curves were compared using the log-rank test. Continuous biologic variables were dichotomized using the usual clinical thresholds. These a priori thresholds were checked using cubic smoothing splines24 and the risk function of a proportional hazards model.25 A prognostic model was built by fitting a proportional hazards model with all variables that significantly influenced overall survival (P ≤ .05) in the univariate analysis (full model). A forward stepwise Cox regression analysis25 was then performed, including age and sex and, successively, extent of the disease, influence of the disease on the host, and other biologic variables. The prognostic index was derived from the prognostic model resulting from the Cox analysis. The clinical committee of the project asked for an index that would include no more than 5 variables in order to make its use easier in routine practice. If the Cox analysis retained more than 5 variables, it was decided to select the 5 variables from the prognostic model that produced the smallest loss of discriminating power. To choose the most accurate model, all candidate models with 4 variables other than age were ranked according to 2 criteria: (1) score tests, evaluated on 100 resamples of the original data set; and (2) the Somers D coefficient adapted from Harrell et al26 for measuring concordance of observed and expected survival, with correction for optimism using the bootstrap technique. Risk groups were defined by comparing the relative risk of death in patients with each possible number of presenting risk factors (from 0 to 5).
Then, categories were combined according to the number of patients within each category, the combination producing the smallest loss of information in terms of log-likelihood, and clinical consideration in order to obtain 3 categories of approximately equal size.
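As an illustration of the Kaplan-Meier estimator mentioned above, the following is a minimal sketch in Python. It is not the software used in the study; the function name `kaplan_meier` and the follow-up data are hypothetical and chosen only to show the product-limit calculation, in which censored patients leave the risk set without reducing the survival estimate.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  : follow-up durations (any consistent unit, e.g. years)
    events : 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # deaths and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Made-up follow-up data, for illustration only (not taken from the study).
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t}: S(t) = {s:.3f}")
```

Standard errors and the log-rank comparison of curves are not covered by this sketch.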
Inclusion criteria for external validation were similar to those of the initial study, with 2 differences: diagnosis after January 1993, and availability of the information on the 5 parameters of the FLIPI.
Overall, 5120 patients from 27 centers or groups have been registered (Table 1). There were 953 who were not included for various reasons, including date of diagnosis not between 1985 and 1992 (45%), insufficient follow-up (36%), incomplete data (8%), and other reasons (11%). There were 4167 cases included in the final analysis. The median follow-up of surviving patients was 7.5 years and the overall survival of these patients is shown in Figure 2. The main clinical characteristics are shown in Table 2. Treatment modalities varied over time and according to the institutions.
The correlations between the clinical characteristics at diagnosis and overall survival are shown in Table 2. Given the size of the study population, all the listed characteristics (except cell type) were significantly associated with outcome. However, in order to propose a simple and accurate index, the clinical and statistical committees decided not to include all of these parameters in the multivariate analysis. The following parameters were not included: ESR, because this parameter was only measured in European patients, and not in those from the United States; ECOG performance status (PS), because the number of patients with a poor PS (ECOG > 1) was low (12%) and because there was an unexplained difference in the percentage of patients with a poor PS between European (14.5%) and US (2.1%) centers; serum β2 microglobulin level and serum albumin level because of the very high proportion of patients with missing data.
Based on clinical relevance and availability of the information, 12 pretreatment characteristics were included in the multivariate analysis (sex, age group, Ann Arbor stage, bone marrow involvement, splenic involvement, number of nodal areas involved, number of extranodal sites other than bone marrow, B symptoms, anemia, lymphocytopenia, thrombocytopenia, and serum LDH level). Both the full-model and the forward stepwise analyses retained 8 variables independently associated with the prognosis in a model established on the 1795 patients (Figure 3) for whom these parameters were available (Table 3).
This sample of 1795 patients comprised the population used to build the FLIPI. Both methods retained the same 5-variable submodel: age (≥ 60 years vs < 60 years), Ann Arbor stage (III-IV vs I-II), hemoglobin level (< 120 g/L vs ≥ 120 g/L), number of nodal areas involved (> 4 vs ≤ 4), serum LDH level (above normal vs normal or below; Table 4). In the 100 resamples of the original data set, this 5-parameter model was classified 24 times with the best score and 59 times as one of the 3 highest scoring models. In terms of individual prediction, this model was also the closest, as measured by the D coefficient,26 to the 8-parameter model.
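The concordance measure referred to above can be sketched as follows. This is a simplified illustration, assuming the common definition in which Somers D equals 2C − 1, where C is Harrell's concordance index for right-censored data; it omits the bootstrap correction for optimism described in the methods. The function names and the data are hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is usable when patient i died strictly before the
    follow-up time of patient j. It is concordant when the patient who
    died earlier has the higher risk score; tied scores count as 0.5.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


def somers_d(times, events, risk_scores):
    """Somers D rescales C from [0, 1] to [-1, 1]."""
    return 2.0 * harrell_c(times, events, risk_scores) - 1.0


# Made-up data, for illustration only: higher score means worse predicted outcome.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
scores = [3, 2, 2, 0]
c_index = harrell_c(times, events, scores)
d_value = somers_d(times, events, scores)
print(f"C = {c_index:.2f}, D = {d_value:.2f}")
```

A value of D near 1 means the score ranks patients almost perfectly by survival, and a value near 0 means it ranks them no better than chance.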
Patients with a score of 5 were combined with patients with a score of 4 because the former were too rare to constitute a category. Patients with scores of 0 and 1 were combined because both correspond to a group with a very good prognosis. Combining patients with a score of 3 with those having a score of 4 or 5 yielded the smallest log-likelihood change. The FLIPI was thus created with 3 risk groups: low (0-1 risk factor), intermediate (2 risk factors), and high (≥ 3 risk factors). The distribution of patients into these 3 groups and the hazard ratios are shown in Table 5. The survival curves are shown in Figure 4.
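The scoring rule can be written out as a short Python sketch. The `Patient` class, its field names, and the example values are hypothetical and serve only as illustration. Two assumptions are made: the age factor uses the ≥ 60 years threshold given for the 5-variable model (the abstract states > 60 years), and "above normal" LDH is represented as a ratio of the measured value to the upper limit of normal greater than 1.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    age_years: float
    ann_arbor_stage: int       # 1 to 4
    hemoglobin_g_per_l: float
    nodal_areas: int           # number of involved nodal areas
    ldh_ratio: float           # measured LDH / upper limit of normal (assumed encoding)


def flipi_score(p: Patient) -> int:
    """Count the adverse factors present (0 to 5)."""
    factors = [
        p.age_years >= 60,            # assumption: the abstract states > 60
        p.ann_arbor_stage >= 3,       # stage III-IV
        p.hemoglobin_g_per_l < 120,
        p.nodal_areas > 4,
        p.ldh_ratio > 1.0,            # above normal
    ]
    return sum(factors)


def flipi_risk_group(score: int) -> str:
    """Map a score to the 3 risk groups: 0-1 low, 2 intermediate, 3 or more high."""
    if not 0 <= score <= 5:
        raise ValueError("score must be between 0 and 5")
    if score <= 1:
        return "low"
    if score == 2:
        return "intermediate"
    return "high"


# Made-up patient, for illustration only.
example = Patient(age_years=67, ann_arbor_stage=3, hemoglobin_g_per_l=131,
                  nodal_areas=5, ldh_ratio=0.9)
example_score = flipi_score(example)
example_group = flipi_risk_group(example_score)
print(example_score, example_group)
```

The sketch assumes all 5 parameters are available; handling of missing values is left out.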
Comparison with the International Prognostic Index (IPI)
This comparison was performed on the 1647 of the 1795 patients used for building the FLIPI for whom complete information was also available for the parameters of the IPI (age, serum LDH level, performance status, Ann Arbor stage, number of extranodal sites of disease).
The distribution of patients into the 4 IPI risk groups and the relative risks of death are shown in Table 6. The IPI separates the patients into 4 risk groups with significantly different survivals. However, the number of patients in the “high” and “high-intermediate” risk groups is low (4.7% and 15.5%, respectively), and most of the patients are in the “low” and “low-intermediate” risk groups (49% and 31%, respectively). As shown in Figure 5, the FLIPI remained discriminant within patients classified by the IPI as low risk (P = .001), low-intermediate risk (P = .001), and high-intermediate or high risk (P = .014).
The FLIPI was also tested in patients younger than 60 years and in patients 60 years or older. As in the IPI study,2 the 4 risk factors other than age were tested within each age group. The 4 other identified risk factors (number of nodal sites, Ann Arbor stage, serum LDH level, and hemoglobin level) remained independent prognostic factors. Survival curves for these 2 age groups are shown in Figure 6.
Data on 1101 other patients with FL were received from 10 groups or centers in the United States and Europe. Of these, 182 were not analyzed because of missing values and/or inconsistencies. Overall, 919 cases (83.5%) were included in the analysis. The median follow-up was 6.8 years. The distribution of the 5 parameters of the FLIPI among these 919 patients, the distribution among the 3 FLIPI groups, and the hazard ratios are shown in Table 7. Survival curves are shown in Figure 7.
Serum β2 microglobulin (β2 M) level was measured in a greater number of cases (65%) at the time of diagnosis for this group of patients and thus could be studied as a factor that could potentially add information to the FLIPI. Serum β2 M was normal in 65% of patients and increased above the upper limit of normal in 35% of patients. Survival curve analysis showed that there was no difference between patients with normal β2 M or increased β2 M within each FLIPI subgroup (data not shown).
Among all NHLs, follicular lymphomas are the second most frequent subtype. Unfortunately, there is no truly effective therapy for FL, and its prognosis has remained basically unchanged over the last 30 years.1 However, several new treatment modalities, including combination of chemotherapy and interferon alpha,27 anti-CD20 monoclonal antibodies given alone28 or bound to a radionuclide,29 intensive therapy with autologous stem cell transplantation,30 or nonmyeloablative allogeneic stem cell transplantation,31 have recently shown activity in clinical trials. These treatments have significant toxicities and are costly. To better define the patients in whom these therapies are warranted, a prognostic index would be very helpful.
From a large multicenter database of patients, we were able to propose and to validate a prognostic index for follicular lymphomas, the FLIPI. Although inclusion criteria did not define age limits, the median age was 56 years with 37% of patients older than 60 years. This median age, possibly lower than that of all patients with FL, may be related to the fact that most patients were registered by groups and included in clinical trials (Table 1). However, this probably has no influence on the results. This index includes parameters related to patient characteristics (age), tumor burden (Ann Arbor stage, number of nodal sites), tumor aggressiveness (serum LDH level), and consequences of the lymphoma on the host (hemoglobin level). Using this index, 3 risk groups of approximately the same size (36%, 37%, and 27%) have been separated. There is clearly a difference in survival between each of these risk groups. An external validation on another group of 919 patients with FL showed a very similar distribution of patients, highly significant differences in overall survival, and similar hazard ratios between the 3 FLIPI subgroups. This external validation confirms the reproducibility of the FLIPI analysis.
All the parameters of the FLIPI have been found to significantly influence prognosis in several other analyses3-20 and have been included in other prognostic indices.3,4,9,12 These parameters have been routinely evaluated in the initial staging of patients with FL for many years. This will allow the distribution of patients and the survival curves of many other series to be compared with those reported herein, and will allow further evaluation of the accuracy of the FLIPI. Treatment was not included in the prognostic analysis, which concerned only initial characteristics. However, although treatments were heterogeneous, none of the treatments given during the period of inclusion has significantly changed the natural history of the disease.1
The number of prognostic factors used to build this index was deliberately limited in order to obtain a simple and accurate index. The concordance in discriminatory power between the training and confirmation groups supports the accuracy of the FLIPI. An additional advantage of the FLIPI is that it can be used irrespective of age group.
The FLIPI may be used for selecting treatment in individual patients. In patients with a good prognosis (0-1 adverse factor), the 10-year overall survival is 71%. This indicates that optimal treatment in these patients has to avoid toxicity and to preserve quality of life. Involved-field radiation therapy for patients with limited disease and an initial “no treatment” policy for patients with disseminated disease may be recommended outside clinical trials. In contrast, patients with high-risk FL have a median survival of around 5 years. Innovative approaches such as the combination of CVP (cyclophosphamide, vincristine, prednisone) or CHOP (CVP plus doxorubicin) and anti-CD20 monoclonal antibody,32 purine analog-based regimens,33 and autologous stem cell transplantation30 followed by vaccine therapies34 may be studied in this subgroup. All these approaches have so far been evaluated in phase 2 studies. The size of the high-risk group (27% of patients in the sample used for creating this index and 28% in the sample used for validation) could allow the design of multicenter randomized trials.
In conclusion, the FLIPI is an extremely simple and reproducible prognostic index, based on easily available clinical data, for patients with FL. This index may be a useful tool for improving the prognostic assessment of patients with FL. It can also be of help in selecting the most appropriate treatment in individual patients and in stratifying patients in prospective trials.
Prepublished online as Blood First Edition Paper, May 4, 2004; DOI 10.1182/blood-2003-12-4434.
Supported by the Ministry of Health (France), the Ministry of Health and Research (Spain), the Groupe d'Etude des Lymphomes de l'Adulte (France and Belgium), the Groupe Ouest-Est des Leucémies Aiguës et Autres Maladies du Sang (France), the Nebraska Lymphoma Study Group, Cancer Research Switzerland grant KFS 00792-2-1999, and the following pharmaceutical companies, in alphabetical order: Amgen France (Neuilly-sur-Seine, France), Idec Pharmaceuticals (San Diego, CA), Produits Roche (Neuilly-sur-Seine, France), Schering AG (Lys-lez-Lannoy, France, and Madrid, Spain), and Schering-Plough (Kenilworth, NJ).
The authors would like to acknowledge the following people who participated in the study (alphabetical order): A. Altès, M. Bast, P. Biermann, A. Bosly, M. Caniard, F. Cavalli, J. Estève, M. Ferrandon, G. Follea, D. Harrington, M. Hess, S. Houga, R. Liang, C. Linassier, R. M. Livet, J. Matthews, N. Milpied, R. J. Prescott, L. Remontet, F. Reyes, B. Riche, J. Vose, and H. Wotherspoon.