Applying a Text-Search Algorithm to Radiology Reports Can Find More Patients With Pulmonary Nodules Than Radiology Coding Alone

Federal Practitioner. 2020 May;37(2)s:S32-S37

Introduction: Chest imaging often incidentally finds indeterminate nodules that need to be monitored to ensure early detection of lung cancers. Health care systems need effective approaches for identifying these lung nodules. We compared the diagnostic performance of 2 approaches for identifying patients with lung nodules on imaging studies (chest/abdomen): (1) relying on radiologists to code imaging studies with lung nodules; and (2) applying a text search algorithm to identify references to lung nodules in radiology reports.

Methods: We assessed all radiology studies performed between January 1, 2016 and November 30, 2016 in a single Veterans Health Administration hospital. We first identified imaging reports with a diagnostic code for a pulmonary nodule. We then applied a text search algorithm to identify imaging reports with key words associated with lung nodules. We reviewed medical records for all patients with a suspicious radiology report based on either search strategy to confirm the presence of a lung nodule. We calculated the yield and the positive predictive value (PPV) of each search strategy for finding pulmonary nodules.

Results: We identified 12,983 imaging studies with a potential lung nodule. Chart review confirmed 8,516 imaging studies with lung nodules, representing 2,912 unique patients. The text search algorithm identified all the patients with lung nodules identified by the radiology coding (n = 1,251) as well as an additional 1,661 patients. The PPV of the text search was 72% (2,912/4,071) and the PPV of the radiology code was 92% (1,251/1,363). Among the patients with nodules missed by radiology coding but identified by the text search algorithm, 130 had lung nodules > 8 mm in diameter.

Conclusions: The text search algorithm can identify additional patients with lung nodules compared to the radiology coding; however, this strategy requires substantial clinical review time to confirm nodules. Health care systems adopting nodule-tracking approaches should recognize that relying only on radiology coding might miss clinically important nodules.

Methods

Since 2014, the ICVAHCS has used a radiology diagnostic code to identify any imaging studies with lung nodules. The radiologist enters “44” at the end of the reading process using the Nuance PowerScribe 360 radiology reporting system. The code is uploaded into the VHA Corporate Data Warehouse (CDW), where it is located within the radiology exam domain. This strategy was created and implemented by the Minneapolis VA Health Care System in Minnesota for all the VA hospitals in VISN 23. Every 2 weeks, a lung nodule registry nurse was provided with a list of radiology studies flagged with this radiology diagnostic code. A chart review was then performed for all these studies to determine the presence of a lung nodule. When a nodule was detected, the ordering health care provider was alerted and given recommendations for managing it.

We initially searched for the radiology studies with a presumptive lung nodule using the radiology code 44 within the CDW. Separately, we applied the text search strategy only to radiology reports from chest and abdomen studies (ie, X-rays, CT, magnetic resonance imaging [MRI], and PET) that contained any of the keyword phrases. The text search strategy was modeled on a natural language processing (NLP) algorithm developed by the Puget Sound VA Healthcare System in Seattle, Washington to identify lung nodules on radiology reports.9 Our algorithm included a series of text searches using Microsoft SQL. After several simulations using a random group of radiology reports, we chose the keywords: “lung AND nodul”; “pulm AND nodul”; “pulm AND mass”; “lung AND mass”; and “ground glass”. We selected only chest and abdomen studies because, in several simulations using a random group of radiology reports, the vast majority of lung nodules were identified on chest and abdomen imaging studies. Also, it would not have been feasible to chart review the approximately 30,000 total radiology reports that were generated during the study period.
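The keyword rules above can be illustrated with a minimal Python sketch. The original searches were run in Microsoft SQL, so this is not the authors' implementation. It assumes case-insensitive substring matching, with each "AND" rule satisfied when both fragments appear anywhere in the report. The function name `matches_nodule_keywords` is hypothetical.

```python
# Keyword rules quoted from the Methods text. Each tuple is an AND of
# substrings; a report is flagged if ANY rule is fully satisfied.
KEYWORD_RULES = [
    ("lung", "nodul"),
    ("pulm", "nodul"),
    ("pulm", "mass"),
    ("lung", "mass"),
    ("ground glass",),
]


def matches_nodule_keywords(report_text: str) -> bool:
    """Return True if the report text satisfies at least one keyword rule.

    Assumption (not stated in the source): matching is case-insensitive,
    whitespace is collapsed, and fragments may appear anywhere in the report.
    """
    text = " ".join(report_text.lower().split())
    return any(all(term in text for term in rule) for rule in KEYWORD_RULES)


if __name__ == "__main__":
    examples = [
        "There is a 6 mm pulmonary nodule in the right upper lobe.",
        "Lungs are clear. No nodules.",
        "No acute cardiopulmonary abnormality.",
    ]
    for report in examples:
        print(matches_nodule_keywords(report), "|", report)
```

Under these assumptions, a negated finding such as "Lungs are clear. No nodules." is still flagged. Plain keyword matching of this kind therefore depends on chart review to separate true positives from false positives.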

From January 1, 2016 through November 30, 2016, we applied both search strategies independently: the radiology diagnostic code for lung nodules to all imaging studies, and the text search to all radiology reports of chest and abdomen imaging studies in the CDW (Figure). We also collected demographic (eg, age, sex, race, rurality) and clinical (eg, medical comorbidities, tobacco use) information that was uploaded to the database automatically from the CDW using International Statistical Classification of Diseases, Tenth Revision and demographic codes. The VHA uses the Rural-Urban Commuting Areas (RUCA) system to define rurality, which takes into account population density and how closely a community is linked socioeconomically to larger urban centers.11 The protocol was reviewed and approved by the institutional review board of ICVAHCS and the University of Iowa.

The presence of a lung nodule was established by having the lung nodule registry nurse manually review the charts of every patient with a radiology report identified by either code 44 or the text search algorithm. The goal was to ensure that our text search strategy identified all reports with a code 44 to be compliant with VISN expectations. Cases in which a lung nodule was described in the radiology report were considered true positives, and those without a lung nodule description were considered false positives.
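The check that the text search captured every report flagged with code 44 amounts to a set comparison. The sketch below shows one way to express it. The function name `compare_strategies` and the report identifiers are hypothetical and serve only as illustration.

```python
def compare_strategies(code_ids, text_ids):
    """Split report identifiers by which search strategy flagged them."""
    code_ids, text_ids = set(code_ids), set(text_ids)
    return {
        "both": code_ids & text_ids,
        "text_only": text_ids - code_ids,
        # An empty set here means the text search captured every
        # report that carried the radiology diagnostic code.
        "code_only": code_ids - text_ids,
    }


if __name__ == "__main__":
    # Made-up identifiers, not study data.
    result = compare_strategies(["r1", "r2"], ["r1", "r2", "r3"])
    print("missed by text search:", result["code_only"])
    print("found by text search only:", result["text_only"])
```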

We compared the sociodemographic and clinical characteristics of patients with lung nodules between those identified with both code 44 and the text search and those identified with the text search alone. We used χ² tests for categorical variables (eg, age, gender, RUCA, chronic obstructive pulmonary disease [COPD], smoking status) and t tests for continuous variables (eg, Charlson comorbidity score). A P value ≤ .05 was considered statistically significant. To assess the yield of each search strategy, we determined the number of patients with lung nodules detected by the text search and the radiology diagnostic code. We also calculated the positive predictive value (PPV) and 95% CI of each search strategy.
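As an illustration of the PPV calculation, the sketch below computes PPV as confirmed patients divided by flagged patients, using the counts reported in the abstract. The article does not state how the 95% CI was derived. The Wilson score interval used here is an assumption, and the function name `ppv_with_wilson_ci` is hypothetical.

```python
import math


def ppv_with_wilson_ci(true_positives: int, flagged: int, z: float = 1.96):
    """Return (ppv, lower, upper) using a Wilson score interval.

    Assumption: the source does not name its CI method; Wilson is one
    common choice for a binomial proportion. z = 1.96 gives roughly 95%.
    """
    if flagged <= 0 or not 0 <= true_positives <= flagged:
        raise ValueError("need 0 <= true_positives <= flagged and flagged > 0")
    p = true_positives / flagged
    denom = 1 + z**2 / flagged
    center = (p + z**2 / (2 * flagged)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / flagged + z**2 / (4 * flagged**2)) / denom
    )
    return p, center - half_width, center + half_width


if __name__ == "__main__":
    # Counts as reported in the abstract (confirmed / flagged patients).
    for label, tp, n in [("text search", 2912, 4071), ("code 44", 1251, 1363)]:
        ppv, lower, upper = ppv_with_wilson_ci(tp, n)
        print(f"{label}: PPV = {ppv:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```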