Algorithms for Prediction of Clinical Deterioration on the General Wards: A Scoping Review

Journal of Hospital Medicine 16(10). 2021 October;612-619. Published Online First June 25, 2021 | 10.12788/jhm.3630

June 23, 2021|Journal of Hospital Medicine

Roel V Peelen, MD; Yassin Eddahchouri, MD; Mats Koeneman, MSc; Tom H van de Belt, MSc, PhD; Harry van Goor, MD, PhD; Sebastian JH Bredie, MD, PhD

OBJECTIVE: The primary objective of this scoping review was to identify and describe state-of-the-art models that use vital sign monitoring to predict clinical deterioration on the general ward. The secondary objective was to identify facilitators, barriers, and effects of implementing these models.

DATA SOURCES: PubMed, Embase, and CINAHL databases until November 2020.

STUDY SELECTION: We selected studies that compared vital signs–based automated real-time predictive algorithms to current track-and-trace protocols in regard to the outcome of clinical deterioration in a general ward population.

DATA EXTRACTION: Study characteristics, predictive characteristics, barriers, facilitators, and effects.

RESULTS: We identified 1741 publications, 21 of which were included in our review. Two of these were clinical trials, 2 were prospective observational studies, and the remaining 17 were retrospective studies. All of the studies focused on hospitalized adult patients. The reported areas under the receiver operating characteristic curve ranged between 0.65 and 0.95 for the outcome of clinical deterioration. Positive predictive values ranged from 0.223 to 0.773, and sensitivity ranged from 7.2% to 84.0%. Input variables differed widely, and predicted endpoints were inconsistently defined. We identified 57 facilitators and 48 barriers to the implementation of these models. We found 68 reported effects, 57 of which were positive.

CONCLUSION: Predictive algorithms can detect clinical deterioration on the general ward earlier and more accurately than conventional protocols, which in one recent study led to lower mortality. Consensus is needed on input variables, predictive time horizons, and definitions of endpoints to better facilitate comparative research.

MATERIALS AND METHODS

We performed a scoping review to create a summary of the current state of research. We used the five-step method of Levac and followed the Preferred Reporting Items for Systematic reviews and Meta-Analyses Extension for Scoping Reviews guidelines (Appendix 1).¹⁶,¹⁷

PubMed, Embase, and CINAHL databases were searched for English-language articles written between January 1, 2010, and November 20, 2020. We developed the search queries with an experienced information scientist, and we used database-specific terms and strategies for input, clinical outcome, method, predictive capability, and population (Appendix 2). Additionally, we searched the references of the selected articles, as well as publications citing these articles.

All identified studies were screened by title and abstract by two researchers (RP and YE). The selected studies were read in their entirety and checked for eligibility against the following inclusion criteria: the study had to concern an automated, vital signs–based algorithm that made real-time predictions of clinical deterioration in an adult, general ward population. Where successive publications reported on the same algorithm and population, we selected the most recent study.

For screening and selection, we used the Rayyan QCRI online tool (Qatar Computing Research Institute) and Endnote X9 (Clarivate Analytics). We extracted information using a data extraction form and organized it into descriptive characteristics of the selected studies (Table 1), an input data table showing the number of admissions, intermittent or continuous measurements, vital signs measured, and laboratory results (Appendix Table 1), a table summarizing study designs and settings (Appendix Table 2), and a prediction performance table (Table 2). We report characteristics of the populations and algorithms, prediction specifications such as the area under the receiver operating characteristic curve (AUROC), and predictive values. Predictive values are affected by prevalence, which may differ among populations. To compare the algorithms, we calculated an indexed positive predictive value (PPV) and a number needed to evaluate (NNE) using a weighted average prevalence of clinical deterioration of 3.0%.
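As an illustration of how a PPV can be indexed to a common prevalence, the sketch below applies Bayes' theorem to a reported sensitivity and specificity and takes the NNE as the reciprocal of the PPV. The exact formula used for the indexing is not stated in this section, so this is one standard approach rather than necessarily the one applied here. The function names, the example sensitivity and specificity, and the choice to weight prevalence by number of admissions are assumptions made for illustration only.

```python
def weighted_average_prevalence(events, admissions):
    """Pooled prevalence across studies, weighted by admissions.

    Assumption: each study is weighted by its number of admissions,
    which is equivalent to total events divided by total admissions.
    """
    total_admissions = sum(admissions)
    if total_admissions <= 0:
        raise ValueError("total admissions must be positive")
    return sum(events) / total_admissions


def indexed_ppv(sensitivity, specificity, prevalence=0.03):
    """PPV at a fixed prevalence, derived with Bayes' theorem.

    PPV = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    The default prevalence of 0.03 is the 3.0% used in the text.
    """
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    denominator = true_positive_rate + false_positive_rate
    if denominator == 0:
        raise ValueError("PPV is undefined when no alerts are generated")
    return true_positive_rate / denominator


def number_needed_to_evaluate(ppv):
    """Alerts that must be evaluated per true deterioration (1 / PPV)."""
    if ppv <= 0:
        raise ValueError("PPV must be positive")
    return 1.0 / ppv


if __name__ == "__main__":
    # Hypothetical algorithm: sensitivity 50%, specificity 95%.
    ppv = indexed_ppv(sensitivity=0.50, specificity=0.95)
    nne = number_needed_to_evaluate(ppv)
    print(f"indexed PPV = {ppv:.3f}, NNE = {nne:.1f}")
```

Indexing in this way removes the effect of differing event rates between study populations, so that two algorithms with the same sensitivity and specificity receive the same indexed PPV and NNE.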

We defined clinical deterioration by the following endpoints: rapid response team activation, cardiopulmonary resuscitation, transfer to an ICU, or death. Real-time was defined as the ability to automatically update predictions as new measurements are added. Predictions were defined as data-derived warnings for events in the near future. Prediction horizon was defined as the period for which a prediction is made. Special interest was given to algorithms that involved AI, which we defined as any form of machine learning or other nonclassical statistical algorithm.

Effects, facilitators, and barriers were identified and categorized using ATLAS.ti 8 software (ATLAS.ti) and evaluated by three researchers (RP, MK, and THvdB). These were categorized using the adapted frameworks of Gagnon et al¹⁸ for the barriers and facilitators and of Donabedian¹⁹ for the effects (Appendix 3).

The Gagnon et al framework was adapted by changing two of its four domains: “Individual” was changed to “Professional” and “Human” to “Physiology.” The domains of “Technology” and “Organization” remained unchanged. The Donabedian domains of “Outcome,” “Process,” and “Structure” also remained unchanged (Table 3).

When reporting on characteristics and performance, we divided the studies into two groups: studies on predictive algorithms with AI and studies on predictive algorithms without AI. For the secondary aim of exploring implementation impact, we reported facilitators and barriers narratively, highlighting the most frequent and notable findings.