ChatGPT, the large language Artificial Intelligence (AI) model trained on 570 GB of internet data as well as through reinforcement learning from human feedback, is finding a footing in healthcare. It has already passed a US Medical Licensing Examination, co-written a peer-reviewed medical article, and even written a letter to United Healthcare requesting prior authorization for an echocardiogram.
Despite these achievements, is ChatGPT ready for prime time in healthcare decision-making?
As a radiologist who has developed and deployed AI since 2016 in a specialty that boasts 396 of the 500 AI applications cleared by the FDA, I can unequivocally say there are multiple roles for ChatGPT in healthcare generally, and radiology specifically. But for ChatGPT to fulfill its promise as a validated augmentation engine, the user must be properly trained on the tool, as with any other AI application.
ChatGPT, sorry, it’s about us, not you.
ChatGPT is a generative language model, meaning it will predict the next word – even if it’s not correct. Input great questions, and great answers emerge. Put garbage in, and garbage will come out. ChatGPT also makes errors. It makes errors different from those humans might make, and it identifies things that humans might miss. By providing the user with guidance on using ChatGPT, the elusive 1 + 1 = 3 equation can be achieved, where the human and the AI augment each other to enhance their respective abilities.
Importantly, AI is not just a technical tool; it’s also a clinical tool. Hence, augmenting the technology with clinical expertise is essential. Optimal augmentation can only occur by educating the clinician about how the tool works. So, how do we train healthcare users? Radiology’s first-mover experience with AI offers insights:
Access to data: ChatGPT needs access to patient data to provide a relevant output, and access to data can be both easy and extremely thorny. Let’s start with easy. ChatGPT only works with text, so medical images cannot be input or output. For ChatGPT to understand patient outcomes, or extract patient medical information, it will need access to electronic health records, which is traditionally a tough sell for those controlling the data. However, most radiology practices have access to their own radiology reports, which contain a wealth of information that can be used both for ChatGPT extraction and education. ChatGPT cannot “learn” more without additional data input. So, in radiology, let’s start with reinforcement learning using our own radiology reports.
Optimize the AI accuracy: AI overall, and ChatGPT specifically, is an immature tool, and with immaturity comes a spectrum of human reaction – from fear of using the technology to hype that it can do anything. As a generative large language model, ChatGPT may confabulate information, providing results that sound reasonable but are incorrect. Hence, like other AI tools, ChatGPT’s output should be overseen by experts in the relevant field – so, radiologists should validate ChatGPT’s radiology-relevant output. ChatGPT should not make patient care decisions autonomously, and clinicians should be educated about its limitations. ChatGPT confabulation can be limited, and accuracy improved, through reinforcement training with additional data. Since ChatGPT works optimally when asked the right question in the right way, users would benefit from an improved understanding of “prompt engineering,” the science of crafting the question to achieve the optimal output. Asking ChatGPT not only to provide peer-reviewed references for its output but also to validate the accuracy of those references can aid physician oversight. Optimized physician-ChatGPT collaboration is only achievable with dedicated physician preparation and education.
Identify use cases: With ubiquitous access, endless possibilities, and countless ideas about how AI tools can be used, it is crucial to identify use cases that align with medical standards and can be effectively managed. By doing so, ChatGPT deployers can establish trust with users and ensure that AI serves as an effective augmentation tool.
FDA oversight and regulation: ChatGPT is not designated as a medical device, but it could be used in a way that could trigger FDA oversight if it is utilized to provide specific patient care recommendations to the user. Hence, it is crucial for users to have a clear understanding of the FDA’s definition and ensure they’re not using ChatGPT as a regulated medical device.
Legal, ethical and responsible use: The challenges in this arena are complicated, have long-term implications, and present more questions than answers. As we wait for medical and societal norms, regulations, and guidelines to emerge to address issues such as the responsible application of AI and determining liability if ChatGPT contributes to a medical error, prioritizing user training and use cases is a good starting point.
Transparency: Humans want to be in the loop when AI is used. In this regard, it is crucial to ensure that AI models are transparent and that their outputs are understandable to the user. For a language model such as ChatGPT, the reader should be made aware when the AI model is used so accuracy and reliability can be judged in context.
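To make the "prompt engineering" point above more concrete, here is a minimal sketch of how a structured prompt might be assembled before it is sent to a language model. Everything in it is an illustrative assumption rather than a validated clinical workflow: the `PromptSpec` class, the `build_prompt` function, the section labels, and the sample report text are all hypothetical, and no call to any real ChatGPT API is made.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the parts of a structured prompt."""
    role: str                      # who the model should act as
    task: str                      # what the model is asked to do
    source_text: str               # the report text the task applies to
    constraints: list[str] = field(default_factory=list)


def build_prompt(spec: PromptSpec) -> str:
    """Assemble a plain-text prompt with clearly labelled sections."""
    lines = [
        f"Role: {spec.role}",
        f"Task: {spec.task}",
        "Constraints:",
    ]
    lines += [f"- {c}" for c in spec.constraints]
    lines += ["Report:", spec.source_text.strip()]
    return "\n".join(lines)


# Made-up, de-identified example text (an assumption for illustration only).
spec = PromptSpec(
    role="Assistant to a board-certified radiologist",
    task="Rewrite the report below in patient-friendly language.",
    source_text="Chest radiograph: no focal consolidation. Heart size normal.",
    constraints=[
        "Do not add findings that are not in the report.",
        "Cite peer-reviewed references for any recommendation.",
        "State explicitly when you are uncertain.",
    ],
)

prompt = build_prompt(spec)
print(prompt)
```

The design choice here is simply to keep the role, task, and constraints explicit and separate from the report text, so that a reviewing physician can see exactly what the model was asked to do.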
While there are significant challenges that need to be addressed to safely integrate ChatGPT into a physician’s workflow, none of them are insurmountable, especially given our collective experience in demystifying AI in radiology. So, where do I think ChatGPT can augment a radiologist?
Improving the radiology report: There are a host of ways ChatGPT can improve the radiology report. Radiologists can use ChatGPT to improve the efficiency of reporting by dictating only the positive findings or dictating a prose report and allowing ChatGPT to craft a complete structured report. If connected to the EMR, ChatGPT could auto-populate information from the patient’s problem list, physician notes, surgical and pathology reports, and/or prior radiology reports into the report header. ChatGPT can also be used to translate a radiology report, optimized for the reader. For example, ChatGPT can create a patient-friendly version of a radiology report, customize it based on what an emergency doc or neurology specialist needs, or modify it to a format requested by an individual physician. By offering quality control oversight, suggesting relevant positive and negative findings, and providing radiologists with appropriate follow-up recommendations based on population health best practices, ChatGPT can also enhance the quality of the radiology report.
Improve the patient information radiologists receive: ChatGPT can summarize prior radiology reports, medical notes, laboratory data, and provided history to give radiologists a far more complete patient history than we have had since the advent of the digital age.
Identify best practices: While ChatGPT can’t yet review the medical images, it can display the latest best practice recommendations tailored to the patient based on reported pathology, making the use of best practices more ubiquitous across the specialty.
Flag billing requirements: To receive reimbursement, radiologists are required by Medicare to adhere to complex and evolving requirements that are arduous and inefficient for the radiologist to manage. Enter ChatGPT, which could flag Medicare requirements and prompt radiologists to include them in their reports, helping reduce physician burden.
Create CPT code mapping from radiology reports: It is clear we don’t capture all the billing codes for the diagnostic reports and imaging procedures radiologists perform today, meaning we are doing work for which we aren’t getting paid. However, with the application of reinforcement learning, particularly in intricate sub-specialties like interventional radiology, ChatGPT could help guarantee that radiologists are properly reimbursed for their services.
Validate AI performance: ChatGPT can be utilized to review radiology reports and estimate the accuracy of computer vision AI models on a large scale. Inconsistencies between ChatGPT and the outputs of the AI computer vision models can be resolved by subject matter experts. As it is impractical for individuals to validate computer vision models on such a large scale, ChatGPT’s capabilities can be leveraged to streamline the validation process.
Improve comment letters to government agencies: Comment letters are a primary way the FDA and Medicare seek stakeholder input, making this correspondence extremely important and painstakingly time-consuming for physicians. ChatGPT could create efficiencies, especially for entities with fewer resources, by writing first drafts of comment letters. Importantly, this assistance can markedly improve the volume of input on proposed rules, regulations, and guidance, hopefully making government agencies more connected to their constituents.
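The "Validate AI performance" idea above can be sketched as a simple comparison. The example assumes that a language model has already extracted a yes/no finding from each radiology report and that a computer vision model has produced its own yes/no output for the same studies. The function name `compare_findings`, the study identifiers, and the labels are all hypothetical; the point is only to show how agreement could be estimated and how disagreements could be queued for subject matter experts.

```python
def compare_findings(
    report_labels: dict[str, bool],
    model_labels: dict[str, bool],
) -> tuple[float, list[str]]:
    """Return the agreement rate and the study IDs that need expert review.

    Only studies present in both inputs are compared.
    """
    shared = sorted(report_labels.keys() & model_labels.keys())
    disagreements = [
        study_id
        for study_id in shared
        if report_labels[study_id] != model_labels[study_id]
    ]
    if not shared:
        return 0.0, []
    agreement = (len(shared) - len(disagreements)) / len(shared)
    return agreement, disagreements


# Made-up labels: True means the finding is reported as present.
report_labels = {
    "study-001": True,
    "study-002": False,
    "study-003": True,
    "study-004": False,
}
model_labels = {
    "study-001": True,
    "study-002": False,
    "study-003": False,   # disagrees with the report-derived label
    "study-004": False,
    "study-005": True,    # no matching report, so it is not compared
}

agreement, needs_review = compare_findings(report_labels, model_labels)
print(f"Agreement: {agreement:.2f}")      # Agreement: 0.75
print(f"Needs review: {needs_review}")    # Needs review: ['study-003']
```

Note that agreement is not the same as accuracy: both the report-derived label and the vision model's output can be wrong, which is why the disagreements go to a human expert rather than being resolved automatically.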
The potential of ChatGPT is exciting. Radiologists have come a long way on the AI journey. What started as a fear among some that AI would replace us has evolved into a more nuanced understanding that AI’s greatest contribution in medical imaging is to make us better radiologists. However, this outcome will only be realized if we understand the intricacies of the technology and use it appropriately.