Focusing on Inattention: The Diagnostic Accuracy of Brief Measures of Inattention for Detecting Delirium
BACKGROUND: Delirium is frequently missed in most clinical settings. Brief delirium assessments are needed.
OBJECTIVE: To determine the diagnostic accuracy of reciting the months of the year backwards (MOTYB) from December to July (MOTYB-6) and December to January (MOTYB-12) for delirium as diagnosed by a psychiatrist and to explore the diagnostic accuracies of the following other brief attention tasks: (1) spell the word “LUNCH” backwards, (2) recite the days of the week backwards, (3) 10-letter vigilance “A” task, and (4) 5-picture recognition task.
DESIGN: Preplanned secondary analysis of a prospective observational study.
SETTING: Emergency department located within an academic, tertiary care hospital.
PARTICIPANTS: 234 acutely ill patients who were ≥65 years old.
MEASUREMENTS: The inattention tasks were administered by a physician. The reference standard for delirium was a comprehensive psychiatrist assessment using Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision criteria. Sensitivities and specificities were calculated.
RESULTS: Making any error on the MOTYB-6 task had a sensitivity of 80.0% (95% confidence interval [CI], 60.9%-91.1%) and specificity of 57.1% (95% CI, 50.4%-63.7%). Making any error on the MOTYB-12 task had a sensitivity of 84.0% (95% CI, 65.4%-93.6%) and specificity of 51.9% (95% CI, 45.2%-58.5%). The best combination of sensitivity and specificity was reciting the days of the week backwards task; if the patient made any error, this was 84.0% (95% CI, 65.4%-93.6%) sensitive and 81.9% (95% CI, 76.1%-86.5%) specific.
CONCLUSION: MOTYB-6 and MOTYB-12 had very good sensitivities but had modest specificities for delirium, limiting their use as a standalone assessment. Reciting the days of the week backwards appeared to have the best combination of sensitivity and specificity for delirium.
© 2018 Society of Hospital Medicine
All statistical analyses were performed with open source R statistical software version 3.0.1 (https://www.r-project.org/), SAS 9.4 (SAS Institute, Cary, NC), and Microsoft Excel 2010 (Microsoft Inc., Redmond, WA).
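As a hedged illustration of how a sensitivity and its 95% CI can be computed, the sketch below uses the Wilson score interval. The source does not state which interval method was used, nor the underlying counts; the 20-of-25 count is an assumption chosen because it yields the reported 80.0% sensitivity for MOTYB-6, and the function name `wilson_interval` is hypothetical.

```python
from math import sqrt


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# ASSUMPTION: 20 of 25 delirious patients made an error on MOTYB-6.
# These counts are not given in the text; they are illustrative only.
true_positives, delirious = 20, 25
sensitivity = true_positives / delirious
lower, upper = wilson_interval(true_positives, delirious)
print(f"sensitivity {sensitivity:.1%} (95% CI, {lower:.1%}-{upper:.1%})")
# prints: sensitivity 80.0% (95% CI, 60.9%-91.1%)
```

Under these assumed counts the interval agrees with the bounds reported in the abstract, which suggests, but does not confirm, that a Wilson-type interval was used.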
RESULTS
DISCUSSION
Delirium is frequently missed by healthcare providers because it is not routinely screened for in the acute care setting. To help address this deficiency of care, we evaluated several brief measures of inattention that take less than 30 seconds to complete. We observed that any errors made on the MOTYB-6 and MOTYB-12 tasks had very good sensitivities (80% and 84%) but were limited by their modest specificities (57% and 52%) for delirium. As a result, these assessments have limited clinical utility as standalone delirium screens. We also explored other commonly used brief measures of inattention at a variety of error cutoffs. Reciting the days of the week backwards appeared to best balance sensitivity and specificity. None of the inattention measures could convincingly rule out delirium (NLR < 0.10), but the vigilance “A” and picture recognition tasks may have clinical utility in ruling in delirium (PLR > 10). Overall, all the inattention tasks, including MOTYB-6 and MOTYB-12, had very good diagnostic performance based on their AUC. However, high sensitivity generally came at the expense of specificity, and high specificity at the expense of sensitivity.
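The likelihood ratio thresholds mentioned above follow directly from sensitivity and specificity: PLR = sensitivity / (1 - specificity) and NLR = (1 - sensitivity) / specificity. The minimal sketch below applies these standard formulas to the any-error point estimates reported in the abstract; the function name `likelihood_ratios` is hypothetical, and the rounded inputs mean the results are approximate.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (positive LR, negative LR) from sensitivity and specificity."""
    if not (0 < specificity < 1) or not (0 <= sensitivity <= 1):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1)")
    plr = sensitivity / (1 - specificity)
    nlr = (1 - sensitivity) / specificity
    return plr, nlr


# Point estimates for the "any error" cutoff, taken from the abstract.
tasks = {
    "MOTYB-6": (0.800, 0.571),
    "MOTYB-12": (0.840, 0.519),
    "Days of the week backwards": (0.840, 0.819),
}

for name, (sens, spec) in tasks.items():
    plr, nlr = likelihood_ratios(sens, spec)
    print(f"{name}: PLR {plr:.2f}, NLR {nlr:.2f}")
```

At the any-error cutoff, none of these three tasks reaches NLR < 0.10 or PLR > 10, which is consistent with the statement that no measure convincingly ruled out delirium.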
Inattention has been shown to be the cardinal feature for delirium,40 and its assessment using cognitive testing has been recommended to help identify the presence of delirium according to an expert consensus panel.26 The diagnostic performance of the MOTYB-12 observed in our study is similar to a study by Fick et al., who reported that MOTYB-12 had very good sensitivity (83%) but had modest specificity (69%) with a cutoff of 1 or more errors. Hendry et al. observed that the MOTYB-12 was 91% sensitive and 50% specific using a cutoff of 4 or more errors. With regard to the MOTYB-6, our reported specificity was different from what was observed by O’Regan et al.27 Using 1 or more errors as a cutoff, they observed a much higher specificity for delirium than we did (90% vs 57%). Discordant observations regarding the diagnostic accuracy for other inattention tasks also exist. We observed that making any error on the days of the week backwards task was 84% sensitive and 82% specific for delirium, whereas Fick et al. observed a sensitivity and specificity of 50% and 94%, respectively. For the vigilance “A” task, we observed that making 2 or more errors over a series of 10 letters was 64.0% sensitive and 91.4% specific for delirium, whereas Pompei et al.41 observed that making 2 or more errors over a series of 60 letters was 51% sensitive and 77% specific for delirium.
The abovementioned discordant findings may be driven by spectrum bias, wherein the sensitivities and specificities for each inattention task may differ in different subgroups. As a result, differences in the age distribution, proportion of college graduates, history of dementia, and susceptibility to delirium can influence overall sensitivity and specificity. Objective measures of delirium, including the inattention screens studied, are particularly prone to spectrum bias.31,34 However, the strength of this approach is that the assessment of inattention becomes less reliant upon clinical judgment and allows it to be used by raters from a wide range of clinical backgrounds. On the other hand, a subjective interpretation of these inattention tasks may allow the rater to capture the subtleties of inattention (eg, decreased speed of performance in a highly intelligent and well-educated patient without dementia). The disadvantage of this approach, however, is that it is more dependent on clinical judgment and may have decreased diagnostic accuracy in those with less clinical experience or with limited training.14,42,43 These factors must be carefully considered when determining which delirium assessment to use.
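Spectrum bias can be illustrated with simple arithmetic: a pooled specificity is the average of subgroup specificities weighted by each subgroup's share of non-delirious patients, so the same task can appear more or less specific depending on case mix. The sketch below uses entirely hypothetical subgroup values and cohort compositions; none of these numbers come from this study or from the studies cited above.

```python
def pooled_rate(rates: dict[str, float], shares: dict[str, float]) -> float:
    """Weighted average of subgroup rates; shares need not sum to exactly 1."""
    total_share = sum(shares.values())
    if total_share <= 0:
        raise ValueError("shares must sum to a positive value")
    return sum(rates[group] * shares[group] for group in shares) / total_share


# HYPOTHETICAL subgroup specificities for one inattention task.
specificity_by_group = {"dementia": 0.60, "no dementia": 0.90}

# HYPOTHETICAL case mixes among non-delirious patients in two cohorts.
cohort_a = {"dementia": 0.10, "no dementia": 0.90}
cohort_b = {"dementia": 0.50, "no dementia": 0.50}

for label, mix in (("Cohort A", cohort_a), ("Cohort B", cohort_b)):
    print(f"{label}: pooled specificity {pooled_rate(specificity_by_group, mix):.2f}")
```

With identical subgroup performance, the two hypothetical cohorts yield pooled specificities of 0.87 and 0.75, showing how differences in dementia prevalence alone could produce discordant estimates across studies.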
Additional research is required to determine the clinical utility of these brief inattention assessments. These findings need to be further validated in larger studies, and the optimal cutoff of each task for different subgroups of patients (eg, demented vs nondemented) needs to be further clarified. It is not completely clear whether these inattention tests can serve as standalone assessments. Depending on the cutoff used, some of these assessments may have unacceptable false negative or false positive rates that may lead to increased adverse patient outcomes or increased resource utilization, respectively. Additional components or assessments may be needed to improve the diagnostic accuracy of these assessments. In addition to understanding these inattention assessments’ diagnostic accuracies, their ability to predict adverse outcomes also needs to be investigated. While a previous study observed that making any error on the MOTYB-12 task was associated with increased physical restraint use and prolonged hospital length of stay,44 these assessments’ ability to prognosticate long-term outcomes such as mortality or long-term cognition or function needs to be studied. Lastly, studies should also evaluate how easily implementable these assessments are and whether improved delirium recognition leads to improved patient outcomes.
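To make the trade-off between false negatives and false positives concrete, the sketch below converts a sensitivity and specificity into expected counts per 1000 patients screened. The sensitivities and specificities are the any-error point estimates from the abstract; the 10% delirium prevalence and the function name `expected_errors` are assumptions for illustration only.

```python
def expected_errors(
    sensitivity: float,
    specificity: float,
    prevalence: float,
    n_screened: int = 1000,
) -> tuple[float, float]:
    """Expected (false negatives, false positives) among n_screened patients."""
    n_delirium = n_screened * prevalence
    n_no_delirium = n_screened - n_delirium
    false_negatives = n_delirium * (1 - sensitivity)
    false_positives = n_no_delirium * (1 - specificity)
    return false_negatives, false_positives


# ASSUMPTION: 10% delirium prevalence, chosen only for illustration.
assumed_prevalence = 0.10

for name, sens, spec in (
    ("MOTYB-6", 0.800, 0.571),
    ("Days of the week backwards", 0.840, 0.819),
):
    fn, fp = expected_errors(sens, spec, assumed_prevalence)
    print(f"{name}: ~{fn:.0f} missed cases, ~{fp:.0f} false alarms per 1000 screened")
```

Under this assumed prevalence, the lower specificity of MOTYB-6 translates into more than twice as many false positives as the days of the week backwards task, while the number of missed cases is similar.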
This study has several notable limitations. Though planned a priori, this was a secondary analysis of a larger investigation designed to validate 3 delirium assessments. Our sample size was also relatively small, causing our 95% CIs to overlap in most cases and limiting the statistical power to truly determine whether one measure is better than the other. We also asked the patient to recite the months backwards from December to July as well as recite the months backwards from December to January. It is possible that the patient may have performed better at going from December to January because of a learning effect. Our reference standard for delirium was based upon DSM-IV-TR criteria. The new DSM-5 criteria may be more restrictive and may slightly change the sensitivities and specificities of the inattention tasks. We enrolled a convenience sample, and the enrolled patients were more likely to be male, have cardiovascular chief complaints, and be admitted to the hospital; as a result, selection bias may have been introduced. Lastly, this study was conducted in a single center and enrolled patients who were 65 years and older. Our findings may not be generalizable to other settings or to those who are younger than 65 years of age.