Applying a Text-Search Algorithm to Radiology Reports Can Find More Patients With Pulmonary Nodules Than Radiology Coding Alone
Introduction: Chest imaging often incidentally finds indeterminate nodules that need to be monitored to ensure early detection of lung cancers. Health care systems need effective approaches for identifying these lung nodules. We compared the diagnostic performance of 2 approaches for identifying patients with lung nodules on imaging studies (chest/abdomen): (1) relying on radiologists to code imaging studies with lung nodules; and (2) applying a text search algorithm to identify references to lung nodules in radiology reports.
Methods: We assessed all radiology studies performed between January 1, 2016 and November 30, 2016 in a single Veterans Health Administration hospital. We first identified imaging reports with a diagnostic code for a pulmonary nodule. We then applied a text search algorithm to identify imaging reports with key words associated with lung nodules. We reviewed medical records for all patients with a suspicious radiology report based on either search strategy to confirm the presence of a lung nodule. We calculated the yield and the positive predictive value (PPV) of each search strategy for finding pulmonary nodules.
Results: We identified 12,983 imaging studies with a potential lung nodule. Chart review confirmed 8,516 imaging studies with lung nodules, representing 2,912 unique patients. The text search algorithm identified all the patients with lung nodules identified by the radiology coding (n = 1,251) as well as an additional 1,661 patients. The PPV of the text search was 72% (2,912/4,071) and the PPV of the radiology code was 92% (1,251/1,363). Among the patients with nodules missed by radiology coding but identified by the text search algorithm, 130 had lung nodules > 8 mm in diameter.
Conclusions: The text search algorithm can identify additional patients with lung nodules compared to the radiology coding; however, this strategy requires substantial clinical review time to confirm nodules. Health care systems adopting nodule-tracking approaches should recognize that relying only on radiology coding might miss clinically important nodules.
Results
We identified 12,983 radiology studies that required manual review during the study period. We confirmed that 8,516 imaging studies had lung nodules, representing 2,912 patients. Subjects with lung nodules were predominantly male (96%), aged between 60 and 79 years (71%), and living in a rural area (72%). More than 50% of these patients had COPD, and over a third were current smokers (Table 1). The text search algorithm identified all of the patients identified by the radiology diagnostic code (n = 1,251). It also identified an additional 1,661 patients with lung nodules who would otherwise have been missed by the radiology code. Compared with those identified only by the text search, those identified by both the radiology coding and text search were older, had lower Charlson comorbidity scores, and were more likely to be current smokers.
The text search algorithm identified nearly 3 times as many patients with potential lung nodules as the radiology diagnostic code (4,071 vs 1,363) (Table 2). However, the text search algorithm produced far more false positives than the diagnostic code (1,159 vs 112) and had a lower PPV (72% [95% CI, 70.6-73.4] vs 92% [95% CI, 90.6-93.4], respectively). The text search algorithm identified 130 patients with lung nodules of moderate to high risk for malignancy (> 8 mm diameter) that were not identified by the radiology code. When the PPV of each search strategy was calculated based on imaging studies with nodules (most patients had > 1 imaging study), the results remained similar (98% for radiology code and 66% for text search). A larger proportion of the lung nodules detected by code 44 vs the text search algorithm were from CT chest studies.
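The patient-level figures above can be reproduced with simple arithmetic. The following is a minimal sketch, not the authors' analysis code: the function names are hypothetical, and the interval uses a normal-approximation (Wald) formula, which is an assumption because the text does not state how the confidence intervals were computed. Depending on rounding, the computed bounds may differ from the reported ones by a few tenths of a percentage point.

```python
import math

def ppv(confirmed: int, flagged: int) -> float:
    """Positive predictive value: confirmed positives / all flagged."""
    return confirmed / flagged

def wald_ci(confirmed: int, flagged: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation 95% CI for a proportion (assumed method)."""
    p = confirmed / flagged
    half_width = z * math.sqrt(p * (1 - p) / flagged)
    return p - half_width, p + half_width

# Patient-level counts taken from the Results text.
strategies = {
    "text search": (2912, 4071),
    "radiology code": (1251, 1363),
}

for name, (confirmed, flagged) in strategies.items():
    low, high = wald_ci(confirmed, flagged)
    false_positives = flagged - confirmed
    print(
        f"{name}: PPV {ppv(confirmed, flagged):.1%} "
        f"(95% CI {low:.1%} to {high:.1%}), "
        f"false positives {false_positives}"
    )
```

The false-positive counts are the flagged patients minus the confirmed patients, which gives the 1,159 and 112 reported above.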
Discussion
In a population of predominantly older male veterans with significant risk factors for lung cancer and a high incidence of incidental lung nodules, applying a text search algorithm to radiology reports identified a substantial number of patients with lung nodules, including some with nodules > 8 mm, that were missed by the radiologist-generated code.9,10 Improving the yield of detection for lung nodules in a population at high risk for lung cancer would increase the likelihood of detecting patients with potentially curable early-stage lung cancers, which could in turn decrease lung cancer mortality.
The reasons for the high number of patients with lung nodules missed by the radiology code are unclear. One potential explanation is the lack of standardization of imaging reports by radiologists (in our study, only 21% of chest CTs used a standardized template describing a lung nodule), a problem well recognized both within and outside VHA.8,12
The text search algorithm identified more patients with lung nodules but had a higher rate of false positives when compared with the diagnostic code. The high rate of false positives resulted in more charts to review and an increased workload for the lung nodule registry team. The challenges presented by an increased workload should be balanced against the potential harms of missing nodules that develop into advanced cancer.
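To illustrate why a keyword-based search can trade a higher yield for more false positives, here is a minimal sketch of a text search over report text. It is not the algorithm used in the study: the keyword pattern, the function name `flag_report`, and the sample reports are all hypothetical, and the negated-finding example is only one plausible source of false positives, not one reported here.

```python
import re

# Hypothetical keyword pattern; the actual search terms are not given in the text.
NODULE_PATTERN = re.compile(r"\bnodul(?:e|es|ar)\b", re.IGNORECASE)

def flag_report(text: str) -> bool:
    """Return True if the report text mentions a nodule-related keyword."""
    return NODULE_PATTERN.search(text) is not None

# Invented sample reports, for illustration only.
reports = {
    "r1": "There is a 9 mm nodule in the right upper lobe.",
    "r2": "No pulmonary nodule or mass is identified.",
    "r3": "Clear lungs. No acute cardiopulmonary process.",
}

# Reports flagged for manual chart review.
flagged = [report_id for report_id, text in reports.items() if flag_report(text)]
print(flagged)
```

In this sketch, `r2` is flagged even though it describes the absence of a nodule, so a reviewer would still need to read it. This is the kind of case that adds to the manual review workload.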