Predicting 30-day pneumonia readmissions using electronic health record data
BACKGROUND
Readmissions after hospitalization for pneumonia are common, but the few available risk-prediction models have only poor to modest predictive ability. Data routinely collected in the electronic health record (EHR) may improve prediction.
OBJECTIVE
To develop pneumonia-specific readmission risk-prediction models using EHR data from the first day and from the entire hospital stay (“full stay”).
DESIGN
Observational cohort study using stepwise-backward selection and cross-validation.
SUBJECTS
Consecutive pneumonia hospitalizations at 6 diverse hospitals in north Texas from 2009 to 2010.
MEASURES
All-cause nonelective 30-day readmissions, ascertained from 75 regional hospitals.
RESULTS
Of 1463 patients, 13.6% were readmitted. The first-day pneumonia-specific model included sociodemographic factors, prior hospitalizations, thrombocytosis, and a modified pneumonia severity index; the full-stay model included disposition status, vital sign instabilities on discharge, and an updated pneumonia severity index calculated using values from the day of discharge as additional predictors. The full-stay pneumonia-specific model outperformed the first-day model (C statistic 0.731 vs 0.695; P = 0.02; net reclassification index = 0.08). Compared to a validated multi-condition readmission model, the Centers for Medicare and Medicaid Services pneumonia model, and 2 commonly used pneumonia severity of illness scores, the full-stay pneumonia-specific model had better discrimination (C statistic range 0.604-0.681; P < 0.01 for all comparisons), predicted a broader range of risk, and better reclassified individuals by their true risk (net reclassification index range, 0.09-0.18).
CONCLUSIONS
EHR data collected from the entire hospitalization can accurately predict readmission risk among patients hospitalized for pneumonia. This approach outperforms a first-day pneumonia-specific model, the Centers for Medicare and Medicaid Services pneumonia model, and 2 commonly used pneumonia severity of illness scores. Journal of Hospital Medicine 2017;12:209-216. © 2017 Society of Hospital Medicine
We conducted an observational study using EHR data collected from 6 hospitals (including safety net, community, teaching, and nonteaching hospitals) in north Texas between November 2009 and October 2010. All hospitals used the Epic EHR (Epic Systems Corporation, Verona, WI). Details of this cohort have been published.18,19
We included consecutive hospitalizations among adults 18 years and older discharged from any medicine service with principal discharge diagnoses of pneumonia (ICD-9-CM codes 480-483, 485, 486-487), sepsis (ICD-9-CM codes 038, 995.91, 995.92, 785.52), or respiratory failure (ICD-9-CM codes 518.81, 518.82, 518.84, 799.1) when the latter 2 were also accompanied by a secondary diagnosis of pneumonia.20 For individuals with multiple hospitalizations during the study period, we included only the first hospitalization. We excluded individuals who died during the index hospitalization or within 30 days of discharge, were transferred to another acute care facility, or left against medical advice.
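The cohort-selection logic above can be sketched in code. The sketch below is a hypothetical Python/pandas rendering, not the study's actual extraction pipeline; the column names (patient_id, principal_dx, secondary_dxs, disposition, admit_date) and disposition labels are assumptions, and the 30-day postdischarge death exclusion is omitted for brevity.

```python
import pandas as pd

# ICD-9-CM code prefixes from the inclusion criteria described above
PNEUMONIA = ("480", "481", "482", "483", "485", "486", "487")
SEPSIS = ("038", "995.91", "995.92", "785.52")
RESP_FAIL = ("518.81", "518.82", "518.84", "799.1")

def select_cohort(df):
    """Return one index hospitalization per patient meeting the diagnosis
    criteria, excluding in-hospital deaths, transfers to another acute
    care facility, and discharges against medical advice."""
    pna_principal = df["principal_dx"].apply(lambda c: c.startswith(PNEUMONIA))
    sepsis_or_rf = df["principal_dx"].apply(
        lambda c: c.startswith(SEPSIS + RESP_FAIL))
    pna_secondary = df["secondary_dxs"].apply(
        lambda codes: any(c.startswith(PNEUMONIA) for c in codes))
    eligible = df[pna_principal | (sepsis_or_rf & pna_secondary)]
    # Hypothetical disposition labels for the stated exclusions
    eligible = eligible[~eligible["disposition"].isin(
        ["died", "transfer_acute", "ama"])]
    # Keep only each patient's first hospitalization in the study period
    return eligible.sort_values("admit_date").drop_duplicates("patient_id")
```
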
Outcomes
The primary outcome was all-cause 30-day readmission, defined as a nonelective hospitalization within 30 days of discharge to any of 75 acute care hospitals within a 100-mile radius of Dallas, ascertained from an all-payer regional hospitalization database.
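The outcome definition above amounts to a simple windowed lookup against the regional hospitalization database. The sketch below is an illustrative Python rendering under assumed field names (admit_date, elective), not the study's actual linkage code.

```python
import datetime as dt

def is_readmitted(discharge_date, later_admissions, window_days=30):
    """Flag any nonelective admission occurring after discharge and
    within `window_days` days, per the outcome definition above.

    later_admissions: iterable of dicts with assumed keys
    'admit_date' (datetime.date) and 'elective' (bool)."""
    cutoff = discharge_date + dt.timedelta(days=window_days)
    return any(discharge_date < a["admit_date"] <= cutoff and not a["elective"]
               for a in later_admissions)
```
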
Predictor Variables for the Pneumonia-Specific Readmission Models
The selection of candidate predictors was informed by our validated multi-condition risk-prediction models using EHR data available within 24 hours of admission (‘first-day’ multi-condition EHR model) or during the entire hospitalization (‘full-stay’ multi-condition EHR model).18,19 For the pneumonia-specific models, we included all variables in our published multi-condition models as candidate predictors, including sociodemographics, prior utilization, Charlson Comorbidity Index, select laboratory and vital sign abnormalities, length of stay, hospital complications (eg, venous thromboembolism), vital sign instabilities, and disposition status (see Supplemental Table 1 for complete list of variables). We also assessed additional variables specific to pneumonia for inclusion that were: (1) available in the EHR of all participating hospitals; (2) routinely collected or available at the time of admission or discharge; and (3) plausible predictors of adverse outcomes based on literature and clinical expertise. These included select comorbidities (eg, psychiatric conditions, chronic lung disease, history of pneumonia),10,11,21,22 the pneumonia severity index (PSI),16,23,24 intensive care unit stay, and receipt of invasive or noninvasive ventilation. We used a modified PSI score because certain data elements were missing. The modified PSI (henceforth referred to as PSI) did not include nursing home residence and included diagnostic codes as proxies for the presence of pleural effusion (ICD-9-CM codes 510, 511.1, and 511.9) and altered mental status (ICD-9-CM codes 780.0X, 780.97, 293.0, 293.1, and 348.3X).
Statistical Analysis
Model Derivation. Candidate predictor variables were classified as available in the EHR within 24 hours of admission and/or at the time of discharge. For example, socioeconomic factors could be ascertained within the first day of hospitalization, whereas length of stay would not be available until the day of discharge. Predictors with missing values were assumed to be normal (less than 1% missing for each variable). Univariate relationships between readmission and each candidate predictor were assessed in the overall cohort using a pre-specified significance threshold of P ≤ 0.10. Significant variables were entered in the respective first-day and full-stay pneumonia-specific multivariable logistic regression models using stepwise-backward selection with a pre-specified significance threshold of P ≤ 0.05. In sensitivity analyses, we alternately derived our models using stepwise-forward selection, as well as stepwise-backward selection minimizing the Bayesian information criterion and Akaike information criterion separately. These alternate modeling strategies yielded identical predictors to our final models.
Model Validation. Model validation was performed using 5-fold cross-validation, with the overall cohort randomly divided into 5 equal-size subsets.25 For each cycle, 4 subsets were used for training to estimate model coefficients, and the fifth subset was used for validation. This cycle was repeated 5 times with each randomly-divided subset used once as the validation set. We repeated this entire process 50 times and averaged the C statistic estimates to derive an optimism-corrected C statistic. Model calibration was assessed qualitatively by comparing predicted to observed probabilities of readmission by quintiles of predicted risk, and with the Hosmer-Lemeshow goodness-of-fit test.
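The repeated cross-validation scheme above, averaging the validation-fold C statistic over 50 repeats of a random 5-fold split, can be sketched as below. scikit-learn is used here as an illustrative stand-in for the Stata workflow.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

def cross_validated_c_statistic(X, y, n_repeats=50, n_folds=5, seed=0):
    """Average the validation-fold C statistic (area under the ROC curve)
    over repeated random k-fold splits, yielding an optimism-corrected
    estimate as described above."""
    aucs = []
    for rep in range(n_repeats):
        folds = StratifiedKFold(n_splits=n_folds, shuffle=True,
                                random_state=seed + rep)
        for train_idx, test_idx in folds.split(X, y):
            # Fit on 4 subsets, evaluate discrimination on the held-out fifth
            model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
            prob = model.predict_proba(X[test_idx])[:, 1]
            aucs.append(roc_auc_score(y[test_idx], prob))
    return float(np.mean(aucs))
```
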
Comparison to Other Models. The main comparisons of the first-day and full-stay pneumonia-specific EHR model performance were to each other and the corresponding multi-condition EHR model.18,19 The multi-condition EHR models were separately derived and validated within the larger parent cohort from which this study cohort was derived, and outperformed the CMS all-cause model, the HOSPITAL model, and the LACE index.19 To further triangulate our findings, given the lack of other rigorously validated pneumonia-specific risk-prediction models for readmission,14 we compared the pneumonia-specific EHR models to the CMS pneumonia model derived from administrative claims data,10 and 2 commonly used risk-prediction scores for short-term mortality among patients with community-acquired pneumonia, the PSI and CURB-65 scores.16 Although derived and validated using patient-level data, the CMS model was developed to benchmark hospitals according to hospital-level readmission rates.10 The CURB-65 score in this study was also modified to include the same altered mental status diagnostic codes according to the modified PSI as a proxy for “confusion.” Both the PSI and CURB-65 scores were calculated using the most abnormal values within the first 24 hours of admission. The ‘updated’ PSI and the ‘updated’ CURB-65 were calculated using the most abnormal values within 24 hours prior to discharge, or the last known observation prior to discharge if no results were recorded within this time period. A complete list of variables for each of the comparison models is shown in Supplemental Table 1.
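The "most abnormal value within 24 hours prior to discharge, else last known observation" rule used for the updated scores can be sketched as follows. This is an illustrative Python rendering; the data structure (time-sorted (timestamp, value) pairs) and the abnormality-ranking function are assumptions, not the study's implementation.

```python
import datetime as dt

def updated_value(observations, discharge_time, abnormality):
    """Return the most abnormal value recorded in the 24 hours before
    discharge, falling back to the last known observation if none.

    observations: time-sorted list of (timestamp, value) pairs.
    abnormality:  function ranking how abnormal a value is (higher = worse),
                  e.g. distance from a normal reference value."""
    window_start = discharge_time - dt.timedelta(hours=24)
    in_window = [v for t, v in observations
                 if window_start <= t <= discharge_time]
    if in_window:
        return max(in_window, key=abnormality)
    prior = [v for t, v in observations if t <= discharge_time]
    return prior[-1] if prior else None
```

For example, a serum sodium of 133 mEq/L would be selected over a later 137 mEq/L drawn in the same 24-hour window, because it is further from the normal reference value.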
We assessed model performance by calculating the C statistic, integrated discrimination index, and net reclassification index (NRI) compared to our pneumonia-specific models. The integrated discrimination index is the difference in the mean predicted probability of readmission between patients who were and were not actually readmitted between 2 models, where more positive values suggest improvement in model performance compared to a reference model.26 The NRI is defined as the sum of the net proportions of correctly reclassified persons with and without the event of interest.27 Here, we calculated a category-based NRI to evaluate the performance of pneumonia-specific models in correctly classifying individuals with and without readmissions into the 2 highest readmission risk quintiles vs the lowest 3 risk quintiles compared to other models.27 This pre-specified cutoff is relevant for hospitals interested in identifying the highest risk individuals for targeted intervention.7 Finally, we assessed calibration of comparator models in our cohort by comparing predicted probability to observed probability of readmission by quintiles of risk for each model. We conducted all analyses using Stata 12.1 (StataCorp, College Station, Texas). This study was approved by the University of Texas Southwestern Medical Center Institutional Review Board.
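The two reclassification measures above can be sketched in code. The version below implements the two-category NRI (top two predicted-risk quintiles vs the bottom three) and the integrated discrimination index as defined above; it is an illustrative Python sketch, with the 60th-percentile cutoff applied to each model's own predictions as an assumption about how the quintile categories were formed.

```python
import numpy as np

def category_nri(p_new, p_ref, y):
    """Two-category NRI: net proportion of events correctly moved into,
    plus nonevents correctly moved out of, the high-risk category
    (top 2 quintiles of predicted risk)."""
    high_new = p_new >= np.quantile(p_new, 0.6)
    high_ref = p_ref >= np.quantile(p_ref, 0.6)
    up = high_new & ~high_ref    # reclassified upward by the new model
    down = ~high_new & high_ref  # reclassified downward by the new model
    event, nonevent = (y == 1), (y == 0)
    return ((up[event].mean() - down[event].mean())
            + (down[nonevent].mean() - up[nonevent].mean()))

def idi(p_new, p_ref, y):
    """Integrated discrimination index: change in the gap between mean
    predicted risk among readmitted vs non-readmitted patients."""
    event, nonevent = (y == 1), (y == 0)
    return ((p_new[event].mean() - p_ref[event].mean())
            - (p_new[nonevent].mean() - p_ref[nonevent].mean()))
```

Positive values of either measure favor the new model over the reference model.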