Introduction: Multiple myeloma (MM) is a hematologic malignancy characterized by the clonal proliferation of plasma cells, leading to systemic effects on immune function, hematopoiesis, renal function, bone metabolism, and electrolyte homeostasis. Although clinical outcomes for MM have improved significantly with advances in therapeutic strategies, the influence of race on disease biology and treatment response remains a critical area of investigation. Racial disparities in healthcare outcomes are well documented, so understanding them in the context of MM is essential for optimizing patient care and treatment strategies. Decisions regarding MM treatment often need to be made within very short time frames, sometimes in less than a week. Therefore, when developing predictive algorithms, it is crucial to thoroughly investigate potential biases related to race and ethnicity to ensure equitable and effective decision-making. This study employs a machine learning approach to evaluate the systemic impact of monoclonal protein (M-protein) and to explore potential racial differences between African American and Caucasian patients, with the goal of improving prediction models for timely and unbiased treatment decisions.

Methods: Thirteen clinical and laboratory variables were selected as predictors of M-spike values and were used to train two machine-learning models based on the random forest algorithm. Model A included the variables ‘M_spike_last’, ‘M_spike_second_to_last’, ‘Serum_total_protein’, ‘race’, and ‘ethnicity’. Model B included the Model A variables plus ‘Serum_Ig_A’, ‘Serum_Ig_G’, ‘Serum_Ig_M’, ‘height’, ‘weight’, ‘Serum_albumin’, and ‘Ratio_of_free_k_chain_to_free_l_chain’. The upper limit of observed M-spike was 3.5 g/dL.

The dataset was randomly divided into a training set (80%) and a test set (20%). A regression tree was built using the training set and validated with the test set. The importance of each variable was assessed by excluding it from the model and evaluating the impact on the root mean squared error (RMSE). To increase the proportion of patients in the sample who were not Non-Hispanic White, half of the Non-Hispanic White patients were randomly removed from the overall data set, and Model A was re-run on the culled data set (A2).
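The split and the exclusion-based importance check described above can be sketched as follows. This is a minimal illustration, not the study's code: the helper names (`split_80_20`, `drop_one_importance`, `knn_predict`), the `noise` variable, and the synthetic rows are all hypothetical, and a simple nearest-neighbour predictor stands in for the random forest so that the example runs with the Python standard library alone. In practice, a random forest's fit-and-predict routine would be passed in as `predict`.

```python
import math
import random

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def split_80_20(rows, seed=0):
    """Shuffle a copy of the rows and return (training 80%, test 20%)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def knn_predict(train, test, features, target, k=5):
    """Stand-in model (NOT a random forest): mean target of the k nearest training rows."""
    preds = []
    for row in test:
        nearest = sorted(
            train, key=lambda r: sum((r[f] - row[f]) ** 2 for f in features)
        )[:k]
        preds.append(sum(r[target] for r in nearest) / len(nearest))
    return preds

def drop_one_importance(train, test, features, target, predict=knn_predict):
    """Change in test RMSE when each feature is excluded and the model is re-run."""
    y_test = [r[target] for r in test]
    full = rmse(y_test, predict(train, test, features, target))
    deltas = {}
    for f in features:
        reduced = [g for g in features if g != f]
        deltas[f] = rmse(y_test, predict(train, test, reduced, target)) - full
    return full, deltas

# Synthetic rows for illustration only; the values are NOT from the study.
rng = random.Random(42)
rows = []
for _ in range(619):
    last = rng.uniform(0.0, 3.5)            # hypothetical previous M-spike
    rows.append({
        "M_spike_last": last,
        "noise": rng.uniform(0.0, 1.0),     # hypothetical uninformative variable
        "M_spike": last + rng.gauss(0.0, 0.1),
    })

train, test = split_80_20(rows)
full_rmse, deltas = drop_one_importance(
    train, test, ["M_spike_last", "noise"], "M_spike"
)
print(len(train), len(test))
print(full_rmse, deltas)
```

A positive entry in `deltas` means the test RMSE rose when that variable was excluded, so the variable was contributing to prediction; a value near zero or negative means it added little or nothing.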

Results: The study used a total of 619 patient-based observations, divided into a training set of 495 observations and a validation set of 124 observations. Of the patients who had race and ethnicity fields available (n=542), the training datasets included 476 (90%) Non-Hispanic White, 33 (6%) African American, 14 (3%) Hispanic or Latino/a, and 20 (4%) from other races. In the culled data set, this distribution changed to 78% Non-Hispanic White, 11% African American, 5% Hispanic or Latino/a, and 7% other.
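The culled distribution can be checked with a short calculation. The sketch below uses the group counts reported above and assumes that culling removed exactly half of the 476 Non-Hispanic White patients while leaving the other groups unchanged; the function name `percentages` is illustrative.

```python
# Counts reported in the text for patients with race and ethnicity recorded.
counts = {
    "Non-Hispanic White": 476,
    "African American": 33,
    "Hispanic or Latino/a": 14,
    "Other": 20,
}

def percentages(group_counts):
    """Share of each group as a whole-number percentage of the total."""
    total = sum(group_counts.values())
    return {group: round(100 * n / total) for group, n in group_counts.items()}

# Assumption: exactly half of the Non-Hispanic White patients were removed.
culled = dict(counts)
culled["Non-Hispanic White"] = counts["Non-Hispanic White"] // 2

culled_pct = percentages(culled)
print(culled_pct)
# {'Non-Hispanic White': 78, 'African American': 11, 'Hispanic or Latino/a': 5, 'Other': 7}
```

Under that assumption the result matches the reported 78%, 11%, 5%, and 7%.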

When race was excluded, Model A achieved a root mean squared error (RMSE) of 0.2634 and an R² of 0.7440. Including race slightly improved performance, with an RMSE of 0.2631 and an R² of 0.7445. The A2 analysis resulted in an RMSE of 0.3223 and an R² of 0.5926 when race was excluded, and an RMSE of 0.3233 and an R² of 0.5907 when it was included.

When race was excluded, Model B demonstrated an RMSE of 0.2524 and an R² of 0.7670. Interestingly, including race resulted in a slight decrease in predictive performance, with an RMSE of 0.2555 and an R² of 0.7604.
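To make the size of these differences concrete, the following sketch tabulates the change in RMSE and R² when race is added, using only the values reported above for Model A, A2, and Model B. The dictionary layout and the function name `race_effect` are illustrative choices.

```python
# (RMSE, R²) pairs as reported in the text: race excluded vs. race included.
reported = {
    "Model A": {"excluded": (0.2634, 0.7440), "included": (0.2631, 0.7445)},
    "A2":      {"excluded": (0.3223, 0.5926), "included": (0.3233, 0.5907)},
    "Model B": {"excluded": (0.2524, 0.7670), "included": (0.2555, 0.7604)},
}

def race_effect(entry):
    """Return (change in RMSE, change in R²) when race is added to the model."""
    rmse_ex, r2_ex = entry["excluded"]
    rmse_in, r2_in = entry["included"]
    return round(rmse_in - rmse_ex, 4), round(r2_in - r2_ex, 4)

effects = {name: race_effect(entry) for name, entry in reported.items()}
for name, (d_rmse, d_r2) in effects.items():
    print(f"{name}: change in RMSE = {d_rmse:+.4f}, change in R2 = {d_r2:+.4f}")
```

A negative change in RMSE (or a positive change in R²) means that including race improved the fit. The largest absolute change in RMSE across the three comparisons is 0.0031 (Model B).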

Conclusion: Our machine learning analysis supports the null hypothesis that race and ethnicity do not provide significant additional predictive power beyond the other predictors in the model. The differences in RMSE and R² values when race and ethnicity were included versus excluded were minimal in both models. This study is limited in that only 10% of patients were other than Non-Hispanic White. However, the impact of this limitation is likely minimal, as a similar lack of difference in predictive performance was observed when this group was increased to 22%. This suggests that race and ethnicity likely do not substantially enhance the predictive power of the models in the context of M-spike value prediction in multiple myeloma patients. Because this analysis provides assurance that potential biological influences of race and ethnicity are minimal and likely inconsequential for the prediction of M-protein, this approach, which may afford patients 3-7 additional days to discuss potential treatment options for a relapse with their care team, could be employed across different races and ethnicities with minimal concern.

Disclosures

Ahlstrom: Pfizer: Other: Patient advocacy; BMS: Other: Patient advocacy; Takeda Oncology: Other: Patient advocacy; Sanofi: Other: Patient advocacy; Janssen: Other: Patient advocacy. Hydren: Johnson and Johnson Innovative Medicine: Research Funding; Regeneron: Research Funding; GlaxoSmithKline: Research Funding; Sanofi: Research Funding; BioLinRx: Research Funding; Adaptive Biotechnologies: Research Funding; Pfizer: Research Funding; Takeda Oncology: Research Funding. Malek: BMS: Consultancy; Adaptive Bio: Consultancy; Medpacto: Research Funding; Janssen: Consultancy, Speakers Bureau.
