Allogeneic hematopoietic cell transplantation (HCT) has become applicable to more patients with hematologic malignancies owing to increased donor availability and advances in conditioning regimens, prevention of transplantation-related toxicities, and general supportive care. However, there is no comprehensive, uniform approach to decision making that incorporates transplantation-related factors such as patient and donor selection, conditioning intensity, and prevention of graft-versus-host disease (GVHD). We therefore aimed to establish and validate a machine learning-based model for predicting survival after allogeneic HCT in patients with hematologic malignancies.


Data from 2,011 patients with hematologic malignancies (1,464 acute leukemia, 296 myelodysplastic syndrome, 100 chronic myeloid leukemia, 45 myeloproliferative neoplasm, 85 lymphoma, and 21 multiple myeloma) who received allogeneic HCT between December 1993 and December 2019 at the Asan Medical Center were retrospectively analyzed.


The median overall and event-free survival of all patients were 4.2 years (95% confidence interval [CI], 2.9-5.4) and 1.5 years (95% CI, 1.1-1.8), respectively. To predict post-transplantation survival, patients were classified as "survived more than 5 years" or "died before 5 years". Among four major machine learning models (random forest [RF], support vector machine, logistic regression, and feedforward neural network), we selected the RF method on the basis of the predictive power of each algorithm. Using the RF algorithm, we developed a post-transplantation survival prediction model with a training cohort of 1,408 patients (70%) and tested it with a validation cohort of 603 patients (30%). Of >200 variables, 33 were selected using recursive feature elimination, and the estimated area under the receiver operating characteristic curve and the accuracy of the model were 0.812 and 0.73, respectively. We then evaluated the robustness of the model's predictive power using 10-fold cross-validation in the validation cohort. In addition, risk scores were calculated for each patient in the validation cohort, and the predicted risk agreed with the observed risk.
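The evaluation steps described above (a 70/30 split, the area under the ROC curve, accuracy, and a comparison of predicted with observed risk) can be illustrated with a minimal sketch. It is not the study's code: the function names, the 0.5 decision threshold, the label coding (1 = "died before 5 years") and the toy risk scores are all assumptions made for illustration, and the RF model and recursive feature elimination are not reproduced here.

```python
import random


def split_cohort(n_patients, train_fraction=0.7, seed=0):
    """Shuffle patient indices and split them into training and validation sets."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    n_train = round(n_patients * train_fraction)
    return indices[:n_train], indices[n_train:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outscores a random
    negative case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one patient in each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of patients whose thresholded score matches the observed label."""
    hits = sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores))
    return hits / len(labels)


def calibration_table(labels, scores, n_groups=4):
    """Mean predicted risk and observed event rate per risk group,
    with groups formed by sorting patients on their score."""
    ranked = sorted(zip(scores, labels))
    bounds = [i * len(ranked) // n_groups for i in range(n_groups + 1)]
    table = []
    for lo, hi in zip(bounds, bounds[1:]):
        group = ranked[lo:hi]
        if not group:
            continue
        mean_predicted = sum(s for s, _ in group) / len(group)
        observed_rate = sum(y for _, y in group) / len(group)
        table.append((mean_predicted, observed_rate))
    return table


# Cohort sizes from the abstract: 2,011 patients split 70/30.
train_idx, valid_idx = split_cohort(2011)
print(len(train_idx), len(valid_idx))  # 1408 603

# Hypothetical labels (1 = died before 5 years) and model risk scores.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
print(roc_auc(labels, scores))   # 0.75
print(accuracy(labels, scores))  # 0.75
for predicted, observed in calibration_table(labels, scores, n_groups=2):
    print(f"predicted {predicted:.2f} vs observed {observed:.2f}")
```

In a well-calibrated model the two columns of the calibration table track each other across risk groups, which is the kind of agreement between predicted and observed risk that the abstract reports for the validation cohort.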


In conclusion, a machine learning-based prediction model seems feasible for estimating post-transplantation survival outcomes in hematologic malignancies. Our findings could help clinicians select a more appropriate donor in terms of age or type of human leukocyte antigen mismatch, as well as the conditioning regimen and GVHD prophylaxis.


Lee: Astellas: Membership on an entity's Board of Directors or advisory committees; AbbVie: Membership on an entity's Board of Directors or advisory committees; Celgene: Membership on an entity's Board of Directors or advisory committees; Janssen: Membership on an entity's Board of Directors or advisory committees; Novartis: Membership on an entity's Board of Directors or advisory committees. Lee: Hanmi: Research Funding.

