Key Points
A recent trial showed increased risk of death or bleeding in neonates who received platelet transfusions for platelet counts above 25 × 10⁹/L.
The current analysis reveals that these harmful effects occur in neonates with high, as well as low, baseline risk of death or bleeding.
Abstract
The Platelets for Neonatal Thrombocytopenia (PlaNeT-2) trial reported an unexpected overall benefit of a prophylactic platelet transfusion threshold of 25 × 10⁹/L compared with 50 × 10⁹/L for major bleeding and/or mortality in preterm neonates (7% absolute-risk reduction). However, some neonates in the trial may have experienced little benefit or even harm from the 25 × 10⁹/L threshold. We wanted to assess this heterogeneity of treatment effect in the PlaNeT-2 trial, to investigate whether all preterm neonates benefit from the low threshold. We developed a multivariable logistic regression model in the PlaNeT-2 data to predict baseline risk of major bleeding and/or mortality for all 653 neonates. We then ranked the neonates based on their predicted baseline risk and categorized them into 4 risk quartiles. Within these quartiles, we assessed the absolute-risk difference between the 50 × 10⁹/L- and 25 × 10⁹/L-threshold groups. A total of 146 neonates died or developed major bleeding. The internally validated C-statistic of the model was 0.63 (95% confidence interval, 0.58-0.68). The 25 × 10⁹/L threshold was associated with absolute-risk reduction in all risk groups, varying from 4.9% in the lowest risk group to 12.3% in the highest risk group. These results suggest that a 25 × 10⁹/L prophylactic platelet count threshold can be adopted in all preterm neonates, irrespective of predicted baseline outcome risk. Future studies are needed to improve the predictive accuracy of the baseline risk model. This trial was registered at www.isrctn.com as #ISRCTN87736839.
Introduction
Preterm neonates with severe thrombocytopenia (platelet count, <50 × 10⁹/L) are often treated with prophylactic platelet transfusions, despite lack of evidence of their efficacy.1,2 In the recently published Platelets for Neonatal Thrombocytopenia (PlaNeT-2) trial, a platelet count threshold of 50 × 10⁹/L for prophylactic platelet transfusion increased the risk of a composite major outcome of bleeding and/or death, when compared with a lower threshold of 25 × 10⁹/L (odds ratio [OR], 1.57; 95% confidence interval [CI], 1.06-2.32).3 It is likely, however, that there was heterogeneity of treatment effect, with some neonates benefitting more, some less, and some not at all (or even harmed), from using the lower transfusion threshold.
Heterogeneity of treatment effect is caused by differences between patients in a trial. Patients with different risk factors experience a different effect of the treatment. For example, a 2-day-old preterm neonate on mechanical ventilation, with thrombocytopenia, intrauterine growth retardation, and sepsis, may have a 20% risk of bleeding, and a 3-week-old neonate with thrombocytopenia without other risk factors may have a near 0% risk of bleeding.4 The prophylactic effect of a platelet transfusion is probably substantial in the first neonate and absent in the second, which induces the so-called heterogeneity of treatment effect in a trial in which both types of patients are included. Another explanation of the heterogeneity of treatment effect is the distribution of baseline risk in a trial population. Baseline risk in a trial population is usually not normally distributed. It has been shown that, in most trials, the outcomes occur in a relatively small number of high-risk patients, whereas most patients are at much lower than average baseline risk.5-7 This is unfortunate because the overall trial result holds only for patients with an average baseline risk. The exploration of heterogeneity of treatment effect will show whether the overall trial result holds true for patients with different baseline risks. Thus, before using the results of the PlaNeT-2 trial to revise guidelines and to guide treatment decisions in clinical practice, the presence and extent of heterogeneity of treatment effect must be evaluated.
Therefore, the objective of our study was to explore the heterogeneity of treatment effect in the PlaNeT-2 trial, to assess whether there are specific groups of neonates who do or do not benefit from a low-platelet-count threshold for transfusion.
Methods
We used data from the PlaNeT-2 trial (N = 660), a multicenter clinical trial with randomized treatment group assignment, open-label treatment, and open-label end point evaluation. A prophylactic platelet transfusion threshold of 25 × 10⁹/L was compared with a threshold of 50 × 10⁹/L in neonates with severe thrombocytopenia, defined as a platelet count <50 × 10⁹/L. The primary outcome was a composite of mortality and/or major bleeding within 28 days of randomization. Neonates were randomized from June 2011 through August 2017 in 43 neonatal intensive care units in the United Kingdom, The Netherlands, and Ireland. Definitions of major bleeding and other relevant details can be found in the protocol and report, as published elsewhere.3,8
The PlaNeT-2 trial protocol was approved by independent ethics committees in the United Kingdom, The Netherlands, and Ireland. The trial was conducted by the PlaNeT-2 and Managing Thrombocyte Transfusions in a Special Subgroup: Neonates (MATISSE) collaborators. Parents or caretakers of all study participants gave written informed consent. The current study was conducted in accordance with the Declaration of Helsinki and reported according to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis guidelines.9
Variable selection
A limited number of variables was selected before data analysis, a recommended strategy to avoid overfitting.10,11 The following categorical variables were chosen, as they were considered important predictors of the outcome: gestational age at birth, <28 weeks; postnatal age at randomization, <3 days, 3 to 7 days, or >7 days; intrauterine growth retardation (IUGR); necrotizing enterocolitis (NEC) at randomization; sepsis at randomization; administration of antenatal steroids; and previous major bleeding. We defined IUGR as a birth weight below the 10th percentile in conjunction with an estimated fetal weight that crosses percentiles downward during pregnancy, ultrasonographic evidence of uteroplacental insufficiency, or both. We defined NEC as stage 2A or higher, according to modified Bell’s staging criteria,12 and sepsis as culture-positive or culture-negative sepsis for which a course of at least 5 days of antibiotics was administered for proven or clinically suspected sepsis. Antenatal steroids were included as a dichotomous variable: steroids were or were not administered (irrespective of whether a full or partial course was given). We included gestational and postnatal age as continuous variables in our model. In addition, we added sex to the model, as it is frequently included in existing prediction models for major bleeding or mortality (supplemental Table 1, available on the Blood Web site). Treatment assignment was added to correct for any imbalance between the treatment groups, despite randomization. The assignment did not affect the calculation of baseline risk, because predicted baseline risk and cutoff points for the quartiles of these risks were calculated assuming the low-threshold group (25 × 10⁹/L) for all neonates.
Coding of variables
We dealt with missing data using single imputation, as the number of missing values was low (Table 1). We used median values for continuous and modal values (value that occurs most often) for categorical variables. We allowed for nonlinearity of continuous variables by restricted cubic spline functions, with 2 degrees of freedom.13
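The single-imputation rule described above can be illustrated with a minimal Python sketch. This is not the authors' code (the analyses were performed in R); the function name `impute_single` and the toy values are hypothetical and are not trial data.

```python
from statistics import median, mode


def impute_single(values, kind):
    """Fill missing entries (None) with one fixed value per variable.

    kind="continuous"  -> median of the observed values
    kind="categorical" -> mode (most frequent observed value)
    """
    observed = [v for v in values if v is not None]
    if kind == "continuous":
        fill = median(observed)
    elif kind == "categorical":
        fill = mode(observed)
    else:
        raise ValueError("kind must be 'continuous' or 'categorical'")
    return [fill if v is None else v for v in values]


# Hypothetical toy values, for illustration only.
postnatal_age_days = [2.0, 7.5, None, 20.0, 4.0]
antenatal_steroids = [1, 1, None, 0, 1]

print(impute_single(postnatal_age_days, "continuous"))
print(impute_single(antenatal_steroids, "categorical"))
```

Single imputation of this kind ignores the uncertainty in the imputed values, which is usually considered acceptable only when very few values are missing, as was the case here.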
Characteristic | Total cohort (N = 653) | Major bleed/death (n = 146) | No major bleed/death (n = 507) |
---|---|---|---|
Gestational age, median wk (IQR)* | 26.7 (24.9-28.7) | 27.0 (25.0-29.0) | 26.0 (24.6-28.0) |
Postnatal age, median d (IQR)† | 7.5 (3.9-20.5) | 8.7 (3.9-19.1) | 7.0 (3.8-21.2) |
Male, n (%)‡ | 394 (60.3) | 88 (60.3) | 306 (60.4) |
Intrauterine growth retardation, n (%) | 243 (37.2) | 46 (31.5) | 197 (38.9) |
Antenatal corticosteroids, n (%)§ | 580 (88.8) | 127 (87.0) | 453 (89.3) |
Sepsis, n (%) | 412 (63.1) | 102 (69.9) | 310 (61.1) |
Necrotizing enterocolitis, n (%) | 107 (16.4) | 31 (21.2) | 76 (15.0) |
Previous major bleeding, n (%) | 120 (18.4) | 38 (26.0) | 82 (16.2) |
IQR, interquartile range.
*In 5 cases, the exact gestational age could not be determined because of uncontrolled pregnancies. It was estimated in full weeks.
†Missing data in 2 cases, single imputation using the median.
‡Missing data in 1 case, single imputation using the mode.
§Missing data in 4 cases, single imputation using the mode.
Model specification and estimation
We developed a logistic regression model to predict baseline risk of outcome. Baseline risk was calculated assuming the low-threshold assignment (25 × 10⁹/L) for all neonates. No treatment interactions were added to the model, as no prior evidence of strong interaction effects was available, and our sample size was limited.14 We used a full model approach: variables were selected a priori and were not removed from the model based on statistical significance. All analyses were performed in R, version 3.3.x.
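As a rough illustration of what "assuming the low-threshold assignment for all neonates" means, the sketch below predicts risk from a logistic model after setting the treatment indicator to 0. It is written in Python rather than R, and every coefficient, name, and covariate value is hypothetical; the fitted model in this study also included spline terms for gestational and postnatal age, which are omitted here.

```python
import math

# Hypothetical coefficients (log odds ratios), for illustration only.
# They are NOT the fitted values from the trial model.
COEFS = {
    "intercept": -1.8,
    "previous_major_bleed": 0.5,
    "nec": 0.5,
    "sepsis": 0.3,
    "high_threshold": 0.45,  # treatment indicator: 1 = 50 x 10^9/L arm
}


def predicted_risk(covariates, coefs=COEFS):
    """Inverse-logit of the linear predictor for one neonate."""
    lp = coefs["intercept"] + sum(
        coefs[name] * value for name, value in covariates.items()
    )
    return 1.0 / (1.0 + math.exp(-lp))


def baseline_risk(covariates, coefs=COEFS):
    """Predicted risk as if the neonate had been assigned the low threshold."""
    as_low_threshold = dict(covariates, high_threshold=0)
    return predicted_risk(as_low_threshold, coefs)


# Hypothetical neonate randomized to the high-threshold arm.
neonate = {"previous_major_bleed": 1, "nec": 0, "sepsis": 1, "high_threshold": 1}
print(round(predicted_risk(neonate), 3), round(baseline_risk(neonate), 3))
```

Because the treatment indicator is overwritten before prediction, two neonates with identical covariates receive the same baseline risk regardless of the arm to which they were randomized.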
Model performance and validation
We expressed model performance as discrimination and calibration. Discrimination was quantified using the C-statistic (equal to the area under the receiver operating characteristic curve for dichotomous outcomes). The C-statistic estimates the probability that, of 2 randomly chosen patients, the patient with the outcome had a higher predicted probability of a major bleed and/or mortality than the patient without the outcome. We internally validated the C-statistic with a bootstrap procedure to correct for the optimism caused by using the same data for development and validation of the model. Calibration refers to the agreement between predicted and observed risks and was assessed graphically with a validation plot.
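The pairwise definition of the C-statistic given above translates directly into code. The following Python sketch (not the authors' R code) counts concordant pairs and, by a common convention that is assumed here, scores tied predictions as half concordant. The bootstrap optimism correction is not shown.

```python
def c_statistic(predicted, outcome):
    """Probability that a patient with the outcome has a higher predicted
    risk than a patient without it; ties count as 0.5."""
    cases = [p for p, y in zip(predicted, outcome) if y == 1]
    controls = [p for p, y in zip(predicted, outcome) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one patient with and one without the outcome")
    concordant = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                concordant += 1.0
            elif p_case == p_control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical predictions and outcomes, for illustration only.
print(c_statistic([0.3, 0.7, 0.5, 0.6], [0, 1, 1, 0]))
```

A value of 0.5 corresponds to chance and 1.0 to perfect separation of patients with and without the outcome.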
Assessment of heterogeneity of treatment effect
We predicted baseline risk of outcome for all neonates using the logistic regression model. We then ranked the neonates based on their predicted risk and categorized them into 4 risk quartiles (very low, low, moderate, and high risk). Within these quartiles, we assessed absolute-risk differences between the high- and low-threshold group. We presented the absolute-risk difference and confidence interval for each risk group.
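The ranking and stratification step can be sketched as follows in Python. This is an illustrative outline, not the authors' implementation: the record layout and function name are hypothetical, and the Wald-type 95% confidence interval is an assumption, as the method used for the intervals is not specified in this section.

```python
import math


def risk_difference_by_quartile(records):
    """records: list of (predicted_baseline_risk, high_threshold_arm, outcome),
    with high_threshold_arm and outcome coded 0/1.

    Returns one summary per quartile of predicted baseline risk.
    """
    ranked = sorted(records, key=lambda r: r[0])
    n = len(ranked)
    results = []
    for q in range(4):
        group = ranked[q * n // 4:(q + 1) * n // 4]
        counts = {}
        for arm in (0, 1):
            outcomes = [y for _, a, y in group if a == arm]
            counts[arm] = (sum(outcomes), len(outcomes))
        (events_low, n_low), (events_high, n_high) = counts[0], counts[1]
        if n_low == 0 or n_high == 0:
            raise ValueError("each quartile needs neonates from both arms")
        p_low, p_high = events_low / n_low, events_high / n_high
        diff = p_high - p_low  # high-threshold minus low-threshold risk
        # Wald standard error (assumed; unreliable for very small groups).
        se = math.sqrt(p_low * (1 - p_low) / n_low + p_high * (1 - p_high) / n_high)
        results.append({
            "quartile": q + 1,
            "risk_low_threshold": p_low,
            "risk_high_threshold": p_high,
            "risk_difference": diff,
            "ci95": (diff - 1.96 * se, diff + 1.96 * se),
        })
    return results
```

A positive `risk_difference` in a quartile indicates a higher observed incidence in the high-threshold arm, that is, a benefit of the low threshold in that risk group.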
We examined heterogeneity of treatment effect on an absolute scale, because this is generally considered to be more clinically relevant than a relative scale.15,16 For example, an absolute-risk difference of 5% may be clinically relevant, even though the relative risk difference (eg, OR) is minimal. On the contrary, a 2-fold increase in relative risk is most likely clinically insignificant when baseline risk is extremely low (eg, absolute risk increases from 0.01% to 0.02%).
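The contrast between the two scales can be made concrete with a short Python calculation. The first pair of risks mirrors the doubling from 0.01% to 0.02% mentioned above; the second pair (50% vs 55%) is a hypothetical example chosen to give a 5% absolute difference with a small relative difference.

```python
def contrasts(risk_reference, risk_comparison):
    """Absolute and relative contrasts between two risks (as proportions)."""
    def odds(p):
        return p / (1.0 - p)

    return {
        "absolute_risk_difference": risk_comparison - risk_reference,
        "relative_risk": risk_comparison / risk_reference,
        "odds_ratio": odds(risk_comparison) / odds(risk_reference),
    }


# 2-fold relative increase, but only 0.01 percentage points in absolute terms.
rare = contrasts(0.0001, 0.0002)

# Hypothetical: modest relative increase, but 5 percentage points in absolute terms.
common = contrasts(0.50, 0.55)

for label, result in (("rare", rare), ("common", common)):
    print(label, {name: round(value, 4) for name, value in result.items()})
```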
Results
A total of 660 neonates were randomized in the PlaNeT-2 trial, with a median gestational age at birth of 26.7 weeks and median postnatal age at randomization of 7.5 days. Baseline characteristics of the study population are presented in Table 1. Seven neonates were either randomized in error, or primary outcome data were missing and could not be inferred, leaving a total of 653 neonates for analysis. Neonatal death and/or major bleeding occurred in 19% of neonates in the 25 × 10⁹/L-threshold group and 26% of neonates in the 50 × 10⁹/L-threshold group: absolute-risk difference, 7%. This corresponds to a number needed to treat of 14, indicating that, on average, for every 14 neonates transfused according to the 25 × 10⁹/L threshold rather than the 50 × 10⁹/L threshold, 1 major bleed or death was prevented. More details on baseline characteristics and outcome descriptions can be found in the original report of the trial.3
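The number needed to treat follows from the reciprocal of the absolute-risk reduction. A minimal Python check, using the rounded incidences reported above (26% and 19%) and a hypothetical function name:

```python
def number_needed_to_treat(risk_control, risk_treated):
    """Reciprocal of the absolute-risk reduction (risks as proportions)."""
    absolute_risk_reduction = risk_control - risk_treated
    if absolute_risk_reduction <= 0:
        raise ValueError("no risk reduction: NNT is undefined")
    return 1.0 / absolute_risk_reduction


# 50 x 10^9/L arm (26%) vs 25 x 10^9/L arm (19%), rounded values from the text.
nnt = number_needed_to_treat(0.26, 0.19)
print(round(nnt, 1), round(nnt))
```

With these rounded inputs the reciprocal is about 14.3, consistent with the reported number needed to treat of 14.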
In our baseline risk prediction model, presence of a previous major bleed, lower gestational age, and treatment assignment to the 50 × 10⁹/L-threshold group were independently associated with increased risk of outcome (Table 2). Postnatal age, antenatal corticosteroids, IUGR, female sex, sepsis, and NEC were not significantly associated with the outcome. The internally validated C-statistic was 0.63 (95% CI, 0.58-0.68). Figure 1 shows that the model performed well in each of the 4 quartiles of predicted baseline risk (very low, <13%; low, 13%-16%; moderate, 17%-24%; and high, >24%). The 4 triangles representing the observed incidence of outcome in the 4 quartiles of baseline risk (very low, low, moderate, and high) approximate the diagonal line, in which the predicted baseline risks and observed incidences would be identical.
Variable | OR | 95% CI | P |
---|---|---|---|
Gestational age (d)* | 0.58 | 0.40-0.84 | .004 |
Postnatal age (d)* | 0.68 | 0.40-1.16 | .155 |
Antenatal corticosteroids | 0.76 | 0.41-1.39 | .373 |
Intrauterine growth retardation | 0.95 | 0.60-1.50 | .825 |
Female sex | 0.96 | 0.65-1.43 | .848 |
Sepsis | 1.32 | 0.86-2.02 | .204 |
Treatment (50 × 10⁹/L threshold) | 1.58 | 1.08-2.33 | .019 |
Necrotizing enterocolitis | 1.62 | 0.95-2.78 | .079 |
Previous major bleed | 1.63 | 1.00-2.65 | .049 |
N = 653.
*Interquartile range OR.
Figure 2 presents the distribution of the predicted baseline risks in the study population. Cutoff points separating the 4 predicted baseline risk quartiles were 12.6%, 17.4%, and 24.1%.
Figure 3 shows incidences of major bleeding and/or death, ORs, and risk differences, comparing patients in the 25 × 10⁹/L- and the 50 × 10⁹/L-intervention arm for each of the 4 quartiles of baseline risk. Figure 3A shows that, in all 4 quartiles of baseline risk, the observed incidences of the primary outcome among patients in the 25 × 10⁹/L-intervention arm were lower than the observed incidences in the 50 × 10⁹/L-intervention arm. Figure 3B shows that, in all 4 quartiles of baseline risk, the ORs comparing patients in the 50 × 10⁹/L-intervention arm with those in the 25 × 10⁹/L arm were similar to the OR of the total study population (OR, 1.57). Figure 3C shows the corresponding absolute-risk differences in the 4 quartiles of baseline risk and a horizontal line indicating the overall trial result (7% risk difference). The absolute-risk differences indicate a protective effect of the 25 × 10⁹/L threshold in all 4 quartiles of baseline risk, varying from an absolute-risk difference of 4.9% in the lowest quartile to 12.3% in the highest quartile of baseline risk. These values correspond with a number needed to treat of 21 in the lowest and 8 in the highest quartile of baseline risk.
Discussion
We wanted to identify groups of neonates who experienced more or less benefit from the low transfusion threshold in the PlaNeT-2 trial. To investigate, we assessed the heterogeneity of treatment effect due to variations in baseline risk, using an internally validated baseline risk-prediction model. Our results suggest that all neonates experienced benefit from the low threshold, as it was associated with absolute-risk reduction in all risk groups. However, the absolute benefit varied considerably, from 4.9% in the lowest to 12.3% in the highest risk group.
These findings indicate that neonates with high predicted baseline risks are as vulnerable to harm associated with a higher transfusion threshold as neonates with low predicted baseline risks. These results appear contradictory to recommendations found in some guidelines that suggest using platelet transfusion thresholds above 25 × 10⁹/L for neonates with suspected higher baseline risks.17-19 For example, these guidelines suggest using thresholds higher than 25 × 10⁹/L for sick neonates with lower gestational age and/or birthweight. Clinicians who may have been reluctant to implement the results of PlaNeT-2 in their smallest and sickest neonates can now be more confident that even this population is likely to benefit from using a lower platelet transfusion threshold.
Gestational age, previous major bleeding, and treatment assignment were independent predictors of outcome. Gestational age has been shown to predict bleeding and mortality in several prediction models (supplemental Table 1). Major bleeding before randomization (mainly intraventricular hemorrhage [IVH] and pulmonary hemorrhage) occurred in 122 of 660 neonates. Some of these neonates may have developed these bleeds during severe thrombocytopenia, as 39% of all neonates included in the PlaNeT-2 trial received platelet transfusions before randomization. This complicates accurate interpretation of these data, and further studies are needed to confirm previous major bleeding as a predictor for a second major bleed and/or death after onset of severe thrombocytopenia. Treatment assignment also predicted outcome, which was expected, given the overall trial results. The remaining variables were not shown to be independent predictors in our model, but this may be caused, in part, by lack of power (eg, NEC). In addition, variables such as postnatal age are thought to be important predictors for IVH, but as the incidence of IVH was low in the trial, their effect on the (composite) outcome may have been limited. Postnatal age was found to be an independent predictor of major bleeding in a recently published observational cohort study in which the incidence of IVH was higher.4
The strengths of our study are the randomized design of the main trial, the overall high levels of completeness of data for the primary outcome, predefined selection of variables to be included in the model, and agreement on an analysis plan before starting the analyses. In addition, our risk-based analysis is superior to conventional subgroup analyses, as it focuses on absolute-risk reduction, which has more clinical relevance than interactions on the relative scale, which are assessed in conventional subgroup analyses. In addition, conventional subgroup analyses are often severely underpowered, because they require multiple testing and many fail to meet best practices for subgroup testing. Although the method used in this study was also underpowered (as the trial was powered only for the primary analysis), it had better power than a conventional subgroup analysis would have had, because only 1 variable (baseline risk) was compared between the study arms. Last, the current method allows for calculation of absolute-risk reductions, given a combination of multiple clinical characteristics, which is not possible with conventional subgroup analyses.20-28
Various limitations of our study should be considered. First, our sample size did not allow inclusion of interaction terms in our model, and no prior evidence of strong interaction effects was available. Clinically relevant interactions are another potential source of heterogeneity of treatment effect in addition to baseline risk variation, and we therefore may not have identified all heterogeneity of treatment effect that was present in the trial. A model with interaction terms can also be used to predict individualized treatment effect, which is the ultimate goal of personalized medicine.14 Further studies are needed to assess this. Second, the C-statistic of our model indicates moderate discrimination, despite having selected variables that are generally considered to be important predictors of major bleeding and/or mortality, perhaps because there are important risk factors that we have not included in our model. For example, mechanical ventilation was shown to be a good predictor of major bleeding in a recently published dynamic prediction model,4 but ventilation data were not collected in the PlaNeT-2 study. Another explanation is that some risk factors apply mainly to IVH, and as its incidence was low, they did not perform well in our dataset.3 It is also possible that baseline prediction models underperform because risk of outcome changes substantially as a result of clinical events that occur after baseline. This hypothesis is supported by the performance of the previously mentioned dynamic prediction model for major bleeding in preterm neonates, which had a median C-statistic of 0.74 (interquartile range, 0.69-0.82). With this model, risk of major bleeding within the subsequent 3 days could be predicted at any point in time during the first week after the onset of severe thrombocytopenia.4 Nevertheless, a C-statistic of 0.63 allows for some degree of risk stratification, as it is higher than 0.5 (which equals chance). This is illustrated in Figure 3.
Last, the model has to be externally validated before it can be implemented in future studies or clinical settings.
To summarize, in our study, the 25 × 10⁹/L threshold was beneficial compared with the 50 × 10⁹/L threshold in all subgroups of predicted baseline risk, although absolute benefit seemed to vary considerably. These findings suggest consideration of a 25 × 10⁹/L transfusion threshold in all preterm neonates, including those with high predicted risk of major bleeding and/or mortality. Future studies are needed, because our model had moderate discriminative capacity, did not include treatment interaction terms, and must be externally validated. Ultimately, an improved and validated model will allow for further refined prediction of individualized treatment effects for platelet transfusion in preterm neonates. This can be used to individualize our platelet transfusion guideline and potentially to improve outcomes for preterm neonates with severe thrombocytopenia.
Any requests for deidentified trial data and supporting material (data dictionary, protocol, and statistical analysis plan) will be reviewed by the trial-management group. Requests that have a methodologically sound proposal and whose proposed use of the data has been approved by the trial's independent steering committee will be considered. Proposals should be directed to A.C. in the first instance ([email protected]); to gain access, data requestors will have to sign a data access agreement.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Acknowledgments
The authors thank Camila Caram-Deelder for help with the statistical analyses.
The main trial was supported by the National Health Service Blood and Transplant Research and Development Committee (Ref. No.: BS06/1); Sanquin Research (Grant PPOC-12-012027); Addenbrooke’s Charitable Trust; and the Neonatal Breath of Life Fund 9145, none of which had a role in the conduct of this analysis.
The corresponding author had full access to all of the data and assumed the final responsibility for submitting the manuscript for publication.
S.F.F.-G. is a PhD candidate at the University of Amsterdam, and this work is submitted in partial fulfillment of the requirement for the PhD.
Authorship
Contribution: S.F.F.-G., K.F., D.v.K., S.J.S., A.C., W.O., E.W.S., E.L., and J.G.v.d.B. designed the study; S.F.F.-G. prepared the data for analysis; D.v.K. analyzed the data; S.F.F.-G. and D.v.K. interpreted the data; S.F.F.-G., K.F., D.v.K., S.J.S., A.C., W.O., E.W.S., E.L., and J.G.v.d.B. wrote the report; all authors revised and approved the final report; and S.F.F.-G., K.F., S.J.S., A.C., W.O., E.L., E.d.K., E.J.d.H., C.V.H., E.J.H., W.P.d.B., and the PlaNeT-2 MATISSE collaborators were involved in the PlaNeT-2 trial.
Conflict-of-interest disclosure: The authors declare no competing financial interests.
A complete list of the PlaNeT-2 MATISSE Collaborators appears in the supplemental Appendix.
Correspondence: Johanna G. van der Bom, Center for Clinical Transfusion Research, Sanquin/LUMC, Plesmanlaan 1A, 2333 BZ Leiden, The Netherlands; e-mail: [email protected].
Author notes
The online version of this article contains a data supplement.