Bleeding is an uncommon event, but it causes a significant increase in morbidity and mortality. Identifying bleeding events in electronic health record (EHR) data, both those that result from hospitalization and those that cause it, would allow the development of risk assessment models (RAMs) to identify the patients at greatest risk. Traditional prospective cohorts for rare events are time-consuming and expensive. We propose a more efficient method that uses EHR data: developing and validating an algorithm to detect bleeding in hospitalized patients, i.e., a "computable phenotype".


We captured all admissions between 2010 and 2019 to the University of Vermont (UVM) Medical Center, a tertiary care medical center in northwest Vermont. Using International Classification of Diseases (ICD) 9 and 10 discharge diagnoses, "present on admission" flags, problem lists, laboratory values, vital signs, Current Procedural Terminology (CPT) codes, medication administration records, and flowsheet data for transfusion support, we developed computable phenotypes for bleeding. Bleeding events were classified according to the gold-standard International Society on Thrombosis and Haemostasis definitions for clinically relevant non-major bleeding (CRNMB) and major bleeding (MB), and the classification was validated by medical record review. To improve sensitivity and specificity, algorithms were developed by bleeding site (intracerebral, intraspinal, pericardial, retroperitoneal, orbital, intramuscular, gastrointestinal, genitourinary, gynecologic, pulmonary, nasal, post-procedure, or miscellaneous). For preliminary validation of the computable phenotype, we randomly selected 10 medical records from each bleeding site for abstraction.
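A rule-based phenotype of this kind can be sketched as follows. This is a minimal illustration, not the study's algorithm: the ICD-10 prefixes, the `Admission` fields, and the mapping of the listed sites to ISTH "critical sites" are assumptions made for the example, and the thresholds (hemoglobin fall of at least 2 g/dL, transfusion of at least 2 red cell units) are taken from the published ISTH major bleeding definition. Treating every other coded bleed as CRNMB is a simplification.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative ICD-10 prefixes only (assumption); the abstract does not
# publish the code lists used for each bleeding site.
SITE_BY_ICD10_PREFIX = {
    "I61": "intracerebral",       # nontraumatic intracerebral hemorrhage
    "K92.2": "gastrointestinal",  # gastrointestinal hemorrhage, unspecified
    "R04.0": "nasal",             # epistaxis
}

# Sites treated as "critical" for major bleeding (assumed mapping).
CRITICAL_SITES = {"intracerebral", "intraspinal", "pericardial",
                  "retroperitoneal", "orbital"}


@dataclass
class Admission:
    """Hypothetical per-admission record assembled from EHR extracts."""
    icd10_codes: List[str]
    present_on_admission: bool
    hemoglobin_drop_g_dl: float  # largest fall during the stay
    rbc_units_transfused: int


def bleeding_site(adm: Admission) -> Optional[str]:
    """Return the first bleeding site matched by a diagnosis code, if any."""
    for code in adm.icd10_codes:
        for prefix, site in SITE_BY_ICD10_PREFIX.items():
            if code.startswith(prefix):
                return site
    return None


def classify(adm: Admission) -> Optional[str]:
    """Label an admission as 'MB', 'CRNMB', or None (no coded bleed)."""
    site = bleeding_site(adm)
    if site is None:
        return None
    is_major = (
        site in CRITICAL_SITES
        or adm.hemoglobin_drop_g_dl >= 2.0
        or adm.rbc_units_transfused >= 2
    )
    return "MB" if is_major else "CRNMB"


def timing(adm: Admission) -> str:
    """Use the present-on-admission flag to separate the two event types."""
    if adm.present_on_admission:
        return "cause of hospitalization"
    return "hospital-acquired"
```

Keeping the site lookup separate from the severity rule mirrors the site-by-site design described above, so each site's code list can be tuned without touching the MB/CRNMB logic.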


Among 62,468 admissions, our computable phenotype for bleeding identified 10,202 bleeding events associated with hospitalization; 4,650 were CRNMB and 5,552 were MB. On chart abstraction, 135 of 153 hospitalizations had either MB or CRNMB (88%, Figure). For MB, 95 of 119 (80%) computed MB phenotype events were validated. Of the 24 of 119 (20%) not validated, 16 (13%) were CRNMB, and in 8 (7%) bleeding was present in coding but was not detected by chart review. Only 10 of 34 (29%) CRNMB events were validated. The most common error in the CRNMB computable phenotype was misclassification of 14 MB events as CRNMB (41% of CRNMB). For individual bleeding sites (Figure), the algorithms performed well for most sites, including intracerebral, gastrointestinal, and intramuscular bleeding, but less well for unusual and rarer bleeding sites (e.g., nasal).
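The validation proportions can be recomputed directly from the chart-review counts reported here. The short sketch below does only that arithmetic; the variable names and the `pct` helper are illustrative, and describing these proportions as positive predictive values is our reading of the results rather than a term the abstract uses.

```python
def pct(numerator: int, denominator: int) -> str:
    """Format a proportion as a whole-number percentage."""
    return f"{numerator / denominator:.0%}"


# Counts reported from the chart review.
reviewed = 153                       # hospitalizations abstracted
any_bleed_confirmed = 135            # MB or CRNMB found on review
mb_flagged, mb_confirmed = 119, 95   # phenotype said MB / review agreed
mb_was_crnmb, mb_no_bleed = 16, 8    # the 24 MB flags not validated
crnmb_flagged, crnmb_confirmed = 34, 10
crnmb_was_mb = 14                    # MB events labeled CRNMB

# Internal consistency checks on the reported counts.
assert mb_flagged + crnmb_flagged == reviewed
assert mb_confirmed + mb_was_crnmb + mb_no_bleed == mb_flagged

summary = {
    "any bleed confirmed": pct(any_bleed_confirmed, reviewed),
    "MB validated": pct(mb_confirmed, mb_flagged),
    "MB flag was CRNMB": pct(mb_was_crnmb, mb_flagged),
    "MB flag, no bleed on review": pct(mb_no_bleed, mb_flagged),
    "CRNMB validated": pct(crnmb_confirmed, crnmb_flagged),
    "CRNMB flag was MB": pct(crnmb_was_mb, crnmb_flagged),
}

for label, value in summary.items():
    print(f"{label}: {value}")
```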


We developed a computable phenotype for bleeding that can be applied to our EHR system. The computable phenotype was specific for MB but underestimated the severity of events it classified as CRNMB, many of which were MB on chart review. Importantly, we correctly classified specific important bleeding sites such as intracerebral, gastrointestinal, and retroperitoneal. This computable phenotype forms the basis for further refinement and provides a road map for future studies of the epidemiology of hospital-acquired bleeding and hospitalization for bleeding.

Figure: Major and clinically relevant non-major bleeding as detected by the electronic health record compared with chart validation


No relevant conflicts of interest to declare.

Author notes


Asterisk with author names denotes non-ASH members.