Artificial Intelligence: Review of Current and Future Applications in Medicine
Background: The role of artificial intelligence (AI) in health care is expanding rapidly. Currently, there are at least 29 US Food and Drug Administration-approved AI health care devices that apply to numerous medical specialties and many more are in development.
Observations: With increasing expectations for all health care sectors to deliver timely, fiscally responsible, high-quality health care, AI has potential utility in numerous areas, such as image analysis, improved workflow and efficiency, public health, and epidemiology, to aid in processing large volumes of patient and medical data. In this review, we describe basic terminology, principles, and general AI applications relating to health care. We then discuss current and future applications for a variety of medical specialties. Finally, we discuss the future potential of AI along with the potential risks and limitations of current AI technology.
Conclusions: AI can improve diagnostic accuracy, increase patient safety, assist with patient triage, monitor disease progression, and assist with treatment decisions.
AI Risks and Limitations
AI has several risks and limitations. Although there is progress in explainable AI, at times we still struggle to understand how the output provided by machine learning algorithms was created.44,48 The many layers of a deep learning program self-determine the criteria used to reach its conclusion, and these criteria can continually evolve. The parameters of deep learning are not preprogrammed, and there are too many individual data points to be extrapolated or deconvoluted for evaluation at our current level of knowledge.26,51 This apparent lack of constraints causes concern for patient safety and suggests that greater validation and continued scrutiny of validity are required.8,48 Efforts are underway to create explainable AI programs to make their processes more transparent, but such clarification is presently limited.14,26,48,77
Another challenge of AI is determining the amount of training data a program requires to function optimally. Also, if the output describes multiple variables or diagnoses, is each equally valid?113 Furthermore, many AI applications look for a specific process, such as cancer diagnoses on CXRs. However, how coexisting conditions seen on CXRs, such as cardiomegaly, emphysema, and pneumonia, will affect the diagnosis needs to be considered.51,52 Zech and colleagues provide the example that diagnoses for pneumothorax are frequently rendered on CXRs with chest tubes in place.51 They suggest that CNNs may develop a bias toward diagnosing pneumothorax when chest tubes are present. Many current studies approach an issue in isolation, a situation that is not realistic in real-world clinical practice.26
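One way to probe for the kind of bias Zech and colleagues describe is to report sensitivity separately for images with and without the suspected confounder. The following is a minimal sketch in Python under stated assumptions: the function name, the stratum labels, and the counts are hypothetical and invented for illustration; they are not taken from any cited study.

```python
from collections import defaultdict


def sensitivity_by_stratum(records):
    """Return sensitivity (true-positive rate) for each stratum.

    Each record is a tuple (stratum, truth, prediction), where truth and
    prediction are booleans indicating presence of the target diagnosis.
    Strata with no truly positive cases are omitted from the result.
    """
    true_positives = defaultdict(int)
    positives = defaultdict(int)
    for stratum, truth, prediction in records:
        if truth:
            positives[stratum] += 1
            if prediction:
                true_positives[stratum] += 1
    return {s: true_positives[s] / positives[s] for s in positives}


# Hypothetical, made-up data: every case below truly has a pneumothorax.
records = (
    [("chest_tube", True, True)] * 9
    + [("chest_tube", True, False)] * 1
    + [("no_chest_tube", True, True)] * 4
    + [("no_chest_tube", True, False)] * 6
)

result = sensitivity_by_stratum(records)
for stratum, value in sorted(result.items()):
    print(f"{stratum}: sensitivity = {value:.2f}")
```

In this invented data set, sensitivity is much higher when a chest tube is present, which is the pattern that would raise suspicion that a program is responding to the tube rather than to the pneumothorax itself. A real evaluation would also need specificity and adequately sized strata.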
Most studies on AI have been retrospective, and the data used to train the program are frequently preselected.13,26 The programs are typically validated on available databases rather than on actual patients in the clinical setting, limiting confidence in the validity of the AI output when applied to real-world situations. Currently, fewer than 12 prospective trials have been published comparing AI with traditional clinical care.13,114 Randomized prospective clinical trials are even fewer, with none currently reported from the United States.13,114 The results from several studies have been shown to diminish when repeated prospectively.114
The FDA has created a new category known as Software as a Medical Device and has a Digital Health Innovation Action Plan to regulate AI platforms. Still, the process of AI regulation is of necessity different from traditional approval processes and is continually evolving.8 The FDA approval process cannot account for the fact that the program’s parameters may continually evolve or adapt.2
Guidelines for investigating and reporting AI research with its unique attributes are being developed. Examples include the TRIPOD-ML statement and others.49,115 In September 2020, 2 publications addressed the paucity of gold-standard randomized clinical trials in clinical AI applications.116,117 The SPIRIT-AI statement expands on the original SPIRIT statement published in 2013 to guide minimal reporting standards for AI clinical trial protocols to promote transparency of design and methodology.116 Similarly, the CONSORT-AI extension, stemming from the original CONSORT statement in 1996, aims to ensure quality reporting of randomized controlled trials in AI.117
Another risk with AI is that while an individual physician making a mistake may adversely affect 1 patient, a single mistake in an AI algorithm could potentially affect thousands of patients.48 Also, AI programs developed for the patient population at one facility may not translate to another. Referred to as overfitting, this phenomenon relates to selection bias in training data sets.15,34,49,51,52 Studies have shown that programs that underrepresent certain group characteristics such as age, sex, or race may be less effective when applied to a population in which these characteristics have differing representations.8,48,49 This problem of underrepresentation has been demonstrated in programs interpreting pathology slides, radiographs, and skin lesions.15,32,51
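Before a program trained at one facility is applied at another, a simple first check is to compare how each patient group is represented in the training data and in the target population. The sketch below, in Python, illustrates that comparison only; the function name, group labels, and counts are hypothetical assumptions, and a difference in representation does not by itself show that performance will differ.

```python
from collections import Counter


def representation_gap(train_groups, target_groups):
    """Return, per group, the training share minus the target share.

    Positive values mean the group is overrepresented in the training
    data relative to the target population; negative values mean it is
    underrepresented. Both inputs are non-empty lists of group labels.
    """
    train_counts = Counter(train_groups)
    target_counts = Counter(target_groups)
    n_train = len(train_groups)
    n_target = len(target_groups)
    groups = sorted(set(train_counts) | set(target_counts))
    return {
        g: train_counts[g] / n_train - target_counts[g] / n_target
        for g in groups
    }


# Hypothetical, made-up age distributions at two facilities.
train_groups = ["under_65"] * 80 + ["65_and_over"] * 20
target_groups = ["under_65"] * 50 + ["65_and_over"] * 50

gaps = representation_gap(train_groups, target_groups)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

In this invented example, patients aged 65 years and older make up a far smaller share of the training data than of the target population, so performance in that group would warrant separate validation before deployment.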
Admittedly, most of these challenges are not specific to AI and existed in health care previously. Physicians make mistakes, treatments are sometimes used without adequate prospective studies, and medications are given without understanding their mechanism of action, much like AI-facilitated processes reach a conclusion that cannot be fully explained.48
Conclusions
The view that AI will dramatically impact health care in the coming years will likely prove true. However, much work is needed, especially because of the paucity of prospective clinical trials, which have historically been required in medical research. Any concern that AI will replace HCPs seems unwarranted. Early studies suggest that even AI programs that appear to exceed human interpretation perform best when working in cooperation with, and under oversight from, clinicians. AI’s greatest potential appears to be its ability to augment care from health professionals by improving efficiency and accuracy, and this potential should be anticipated with enthusiasm as the field moves forward at an exponential rate.
Acknowledgments
The authors thank Makenna G. Thomas for proofreading and review of the manuscript. This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital. This research has been approved by the James A. Haley Veterans’ Hospital Office of Communications and Media.