
speaking of wellness

Trust Your Doctor Over a Chatbot

By John Swartzberg, MD, Chair, Wellness Letter Editorial Board

Chatbots can instantly offer answers to your burning questions, but if you’re looking for medical advice, stick with your doctor. That’s my advice after reading a recent study in the American Journal of Preventive Medicine that evaluated two highly popular chatbots—ChatGPT and Google’s Bard (now known as Gemini). When researchers asked the bots a list of medical questions, the answers were sometimes accurate, sometimes incomplete, and sometimes just plain wrong.

Chatbots are driven by artificial intelligence technology that allows them to quickly generate responses to questions or prompts that the user supplies. As the name implies, they are also able to convey that information in simple, conversational language. How? In a nutshell, chatbots are “pre-trained” with massive amounts of data gathered from websites, publicly available databases, and other sources, and this teaches them to both “understand” people’s queries and pull together the relevant information needed to answer them.

ChatGPT, the first AI chatbot of its kind to reach a mass audience (in November 2022), made headlines for reportedly acing the college SATs and even passing the U.S. medical licensing exam. Experts warned, however, that chatbots were no replacement for human doctors—and some subsequent studies highlighted that fact. One problem is that while the AI technology can give accurate medical information, a lot depends on the question or prompt the user provides: In general, the more vague or complex the question, the less reliable the chatbot’s response.

The new study, which was conducted at the Cleveland Clinic Foundation, adds to the cautionary tale. Researchers came up with 56 questions on guideline-recommended preventive care (like immunizations and cancer screenings) and common health conditions that primary care doctors see. They presented those questions to ChatGPT-4 and Bard, then had two doctors independently review the responses for accuracy. (An answer was deemed “accurate” if it included complete, guideline-based recommendations.)

Overall, the study found, ChatGPT was wrong just as often as it was right: Roughly 29 percent of its responses were accurate—and the same percentage was inaccurate. The remaining responses (about 43 percent) gave correct information but also missed some important details, according to the doctors’ reviews. Bard, meanwhile, performed better but still fell short: Just under 54 percent of its responses were deemed accurate, while about 18 percent were inaccurate and 29 percent left out key information.

Why was one chatbot better than the other? It may come down to their respective wells of knowledge, the researchers say. The Bard/Gemini AI system is continuously updated with additional information, while that was not true of the version of ChatGPT tested.

Still, those frequent updates didn’t get all the bugs out of the system. Both chatbots, for example, struggled with guidelines on vaccination: Of six vaccination-related questions, both technologies gave wrong answers to five and incomplete information on one. Those inaccuracies were largely due to the chatbots providing outdated information on the pneumococcal vaccine.

It’s important to note that this study was conducted in 2023, using the ChatGPT and Bard versions available at the time. AI technology is constantly evolving, which means that current and future iterations of the chatbots should perform a lot better. And some companies are developing AI systems targeted to specific topics, such as medical diagnosis, which should perform better than more generic systems.

That said, AI tools are just that: tools that can gather information quickly and present it to you in a succinct, conversational way. When it comes to the complex matters of human health, remember that chatbots can get it wrong—and even when the information is correct, you may only get part of the story. More importantly, no bot can give you nuanced, personal medical advice. For that, you still need to talk to your doctor.