Sickle cell disease (SCD) is one of the most common genetic diseases worldwide. Acute chest syndrome (ACS) is a leading cause of death in SCD patients. The PRESEV1 study was designed to produce a predictive score assessing the risk of developing ACS (Bartolucci et al., 2016). PRESEV2 was an international, multicenter, prospective confirmatory study designed to validate the PRESEV score. The present study aims to improve these predictions by adding a machine learning (ML) method.
Patients and methods:
Patients were included according to the rules of the PRESEV1 and PRESEV2 studies. The dataset thus contains 97 patients who developed an ACS episode (18.3%) and 434 patients who did not (81.7%).
To compute the PRESEV score, we first used the previously developed method with the following input variables: leukocytes, reticulocytes, hemoglobin levels and cervical spine pain. This method is based on a decision tree with fixed rules and is referred to as the decision tree method throughout this abstract. Second, we used an ML method that combines a sampling technique named SMOTEENN, to balance the data, with a C-Support Vector Classification (SVC) with fixed parameters, to predict the score. This method produces a probability, with a threshold of 0.2, under which the patient is predicted to develop an ACS. The training dataset was composed of the PRESEV1 dataset and a randomly selected 80 percent of PRESEV2; the test dataset was thus composed of the remaining 20 percent of PRESEV2. Repeating this random selection allowed us to perform a cross-validation with 50 repetitions and to compute, with Python, an average score and a standard deviation (std). To allow comparison of the score with and without the ML method, rates were calculated by weighting for the proportion of ACS cases in the dataset.
Among all parameters analyzed, the SVC method used the following variables to calculate the score: leukocytes, LDH, urea, reticulocytes and hemoglobin levels. One hundred and two adult patients with a severe vaso-occlusive crisis (VOC) requiring hospitalization were included. Of these patients, 26 (25.5%) were predicted by the SVC method to be at low risk of developing an ACS episode. Sensitivity and specificity were 94.7% and 26.8%, respectively. The negative predictive value (NPV) was 95.8% and the positive predictive value (PPV) was 22.4%. Results are summarized in Table 1. By comparison, the PRESEV score (decision tree method) identified 44 of 372 patients (11.8%) as low risk.
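As an illustration of how predictive values depend on the proportion of ACS cases, the sketch below derives PPV and NPV from sensitivity, specificity and prevalence using Bayes' rule. The abstract does not give the weighting formula, so applying this one is an assumption, as is the choice of the dataset-wide ACS rate (18.3%) as the prevalence; the function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given prevalence, using Bayes' rule."""
    tp = sensitivity * prevalence              # true-positive fraction
    fn = (1 - sensitivity) * prevalence        # false-negative fraction
    tn = specificity * (1 - prevalence)        # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)


# Sensitivity and specificity reported for the SVC method; prevalence assumed
# to be the ACS rate of the whole dataset (97 of 531 patients).
ppv, npv = predictive_values(sensitivity=0.947, specificity=0.268, prevalence=0.183)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")  # prints "PPV = 22.5%, NPV = 95.8%"
```

Under these assumptions the result is close to the reported PPV of 22.4% and NPV of 95.8%; the small PPV difference is consistent with rounding of the inputs.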
Discussion and Conclusion:
While adding an ML method did not improve the sensitivity or the NPV of the PRESEV score, it improved both the specificity and the PPV. The addition of artificial intelligence thus provides a better prediction, with a higher percentage of "low-risk" patients. As highlighted in the international PRESEV study, this score could be a useful tool for physicians in hospital settings with limited beds. While the PRESEV score could allow better management of "low-risk" patients, the identification of "high-risk" patients could also be a significant advantage for physicians, as it could improve the feasibility of clinical trials for the prevention of this lethal complication in SCD patients.
Bartolucci: Innovhem: Other; Novartis: Research Funding; Roche: Consultancy; Bluebird: Consultancy; Emmaus: Consultancy; Bluebird: Research Funding; Addmedica: Research Funding; AGIOS: Consultancy; Fabre Foundation: Research Funding; Novartis: Consultancy; ADDMEDICA: Consultancy; HEMANEXT: Consultancy; GBT: Consultancy.