Background Hospital discharge data have been used to estimate the burden of venous thromboembolism (VTE) disease. However, most of these databases are de-identified, which limits their utility for estimating VTE incidence: multiple hospitalizations for the same VTE event cannot be identified, and first-time VTE events cannot be differentiated from recurrent ones.

Objective We aimed to estimate the magnitude of error in VTE incidence estimates derived from hospital discharge data by comparing the estimates obtained when patient identifying information is included, which enabled us to remove duplicate patient events and stratify by first-time and recurrent VTE events, with the estimates obtained using only de-identified data.

Methods In collaboration with the Centers for Disease Control and Prevention (CDC) and the Oklahoma State Department of Health (OSDH), we established a pilot surveillance system for VTE events in Oklahoma County, OK during 2012–2014. The OSDH Commissioner of Health made VTE events reportable conditions from 2010 to 2015, which facilitated our acquisition of hospital discharge data with patient identifiers for 2010–2012 from the OSDH. The data included inpatient, outpatient surgical, and ambulatory surgery center discharges. A deep vein thrombosis diagnosis was defined as the presence of any of the ICD-9-CM codes 451.1x, 451.81, 451.83, 453.2, 453.4x, 671.3x, and 671.4x. A pulmonary embolism diagnosis was defined as the presence of either of the ICD-9-CM codes 415.1x or 673.2x. Data were de-duplicated and linked across datasets using Link Plus software incorporating patient identifying variables. Duplicate events for the same person caused by hospital transfers were defined a priori as a second hospital admission with a VTE diagnosis code occurring within 72 hours of the previous discharge date, with a VTE present on admission (POA) code of “Yes” or “Unknown” for the second admission. Potentially recurrent events were defined as two hospital admissions of the same patient ≥72 hours apart, each with a VTE diagnosis. Census Bureau estimates for 2010–2012 were used to define the population at risk in Oklahoma County. Incidence rates (IR) and 95% confidence intervals (CI) were calculated using the Poisson distribution and reported as events per 100,000 population per year. Rate differences and excess fractions were calculated to account for the contribution of recurrent and duplicate events to overall estimates and to differentiate between event-based and patient-based incidence estimates.
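
The a priori case definitions above can be illustrated with a minimal Python sketch. It is not the Link Plus workflow used in the study. The function name, the record layout (`admit`, `discharge`, `poa`), and the `"unclassified"` label for an admission within 72 hours whose POA code is "No" are assumptions made for illustration, as is measuring the 72-hour gap from the previous discharge to the next admission for both rules.

```python
from datetime import datetime, timedelta

# 72-hour window from the case definitions in the Methods.
WINDOW = timedelta(hours=72)

def classify_admissions(admissions):
    """Label one patient's VTE admissions in chronological order.

    `admissions` is a list of dicts with hypothetical keys:
    'admit' and 'discharge' (datetime) and 'poa' ("Yes", "No", "Unknown").
    """
    labels = []
    previous_discharge = None
    for adm in sorted(admissions, key=lambda a: a["admit"]):
        if previous_discharge is None:
            label = "first"
        else:
            gap = adm["admit"] - previous_discharge
            if gap >= WINDOW:
                # Two VTE admissions >= 72 hours apart.
                label = "recurrent"
            elif adm["poa"] in ("Yes", "Unknown"):
                # Readmission within 72 hours with VTE present on admission.
                label = "duplicate"
            else:
                # Assumption: the abstract does not say how this case was handled.
                label = "unclassified"
        labels.append(label)
        previous_discharge = adm["discharge"]
    return labels

example = [
    {"admit": datetime(2011, 1, 1, 8), "discharge": datetime(2011, 1, 3, 10), "poa": "No"},
    {"admit": datetime(2011, 1, 3, 15), "discharge": datetime(2011, 1, 7, 9), "poa": "Yes"},
    {"admit": datetime(2011, 6, 1, 12), "discharge": datetime(2011, 6, 4, 12), "poa": "Yes"},
]
print(classify_admissions(example))
```

In this made-up example, the second admission follows the first discharge by five hours with a POA code of "Yes" and is labeled a duplicate, while the third occurs months later and is labeled potentially recurrent.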

Results We identified 3,299 unique patients with VTE events. The overall event-based IR for VTE events was 249 (95% CI: 241–257). The IR for potentially recurrent events was 35 (95% CI: 32–38) and for duplicate events caused by patient transfers was 13 (95% CI: 11–14). Thus, the rate difference between event-based and patient-based estimates was 48 (95% CI: 44–51), giving a patient-based IR for first-time events of 201 (95% CI: 194–208). The excess fraction was 19.2% (95% CI: 17.8%–20.5%), of which 14.1% (95% CI: 12.9%–15.2%) was attributable to potentially recurrent events and 5.1% (95% CI: 4.4%–5.8%) to duplicate events caused by patient transfers.
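
The arithmetic linking these rates can be checked with a short Python sketch. It assumes the excess fraction is the rate difference divided by the event-based rate, a formula the abstract does not state explicitly. The function name is hypothetical. Because the inputs are the rounded rates reported above, the results agree with the published percentages only to within rounding.

```python
def excess_fraction(component_rate, event_based_rate):
    """Share of the event-based rate due to a component, in percent.

    Assumed definition: component rate / event-based rate * 100.
    """
    return 100.0 * component_rate / event_based_rate

# Rounded rates per 100,000 population per year, as reported in the Results.
event_based = 249
recurrent = 35
duplicate = 13

rate_difference = recurrent + duplicate          # 48
patient_based = event_based - rate_difference    # 201

print(rate_difference, patient_based)
print(round(excess_fraction(rate_difference, event_based), 1))  # overall
print(round(excess_fraction(recurrent, event_based), 1))        # recurrent events
print(round(excess_fraction(duplicate, event_based), 1))        # transfer duplicates
```

The rate difference and the patient-based rate match the reported values exactly. The computed fractions differ from the published 19.2% and 5.1% by about 0.1 percentage point, which is consistent with the published figures having been derived from unrounded rates.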

Conclusions Using event-based estimates for VTE disease resulted in an overestimate of the incidence rate of first-time VTE events by up to 20%. This excess includes the burden caused by potential recurrent events (14%) and duplicate events caused by patient transfers (5%). We designed our case definitions to accurately measure first-time events and to capture all duplicate events and potential recurrent events. Assuming these data are representative of national trends, applying these excess fractions to estimates from de-identified hospital discharge data may improve the validity of first-time VTE incidence estimates.


Bratzler: Centers for Disease Control and Prevention: Consultancy; Sanofi Pasteur: Consultancy. Raskob: Bayer Healthcare: Consultancy, Honoraria; BMS: Consultancy, Honoraria; Daiichi Sankyo: Consultancy, Honoraria; Janssen Pharmaceuticals: Consultancy, Honoraria; Pfizer: Consultancy, Honoraria; ISIS Pharmaceuticals: Consultancy, Honoraria.

Author notes


Asterisk with author names denotes non-ASH members.