ChatGPT Helpful for Breast Cancer Screening Advice with Certain Caveats

As more consumers turn to the newly available ChatGPT for health advice, researchers are eager to see whether the information provided by the artificial intelligence chatbot is reliable and accurate. A new study conducted by researchers at the University of Maryland School of Medicine (UMSOM) indicates that the answers generated provide correct information the vast majority of the time; sometimes, though, the information is inaccurate or even fictitious.

Findings were published today in the journal Radiology.

In February 2023, UMSOM researchers created a set of 25 questions related to advice on getting screened for breast cancer. They submitted each question to ChatGPT three times to see what responses were generated. (The chatbot is known for varying its response each time a question is posed.) Three radiologists fellowship-trained in mammography evaluated the responses; they found that the responses were appropriate for 22 out of the 25 questions. The chatbot did, however, provide one answer based on outdated information. Two other questions had inconsistent responses that varied significantly each time the same question was posed.

"We found ChatGPT answered questions correctly about 88 percent of the time, which is pretty amazing," said study corresponding author Paul Yi, MD, Assistant Professor of Diagnostic Radiology and Nuclear Medicine at UMSOM and Director of the UM Medical Intelligent Imaging Center (UM2ii). "It also has the added benefit of summarizing information into an easily digestible form for consumers to easily understand." ChatGPT correctly answered questions about the symptoms of breast cancer and who is at risk, as well as questions about the cost of mammograms and the recommended age and frequency for getting them.

The downside is that its responses are not as comprehensive as what a person would normally find through a Google search. "ChatGPT provided only one set of recommendations on breast cancer screening, issued from the American Cancer Society, but did not mention differing recommendations put out by the Centers for Disease Control and Prevention (CDC) or the US Preventive Services Task Force (USPSTF)," said study lead author Hana Haver, MD, a radiology resident at University of Maryland Medical Center.

In the one response the researchers deemed inappropriate, ChatGPT gave outdated advice on planning a mammogram around COVID-19 vaccination. The advice to delay a mammogram for four to six weeks after getting a COVID-19 shot was changed in February 2022, and the CDC endorses the USPSTF guidelines, which don’t recommend waiting. Inconsistent responses were given to questions concerning an individual's personal risk of getting breast cancer and on where someone could get a mammogram.

"We've seen in our experience that ChatGPT sometimes makes up fake journal articles or health consortiums to support its claims," said Dr. Yi. "Consumers should be aware that these are new, unproven technologies, and should still rely on their doctor, rather than ChatGPT, for advice."

He and his colleagues are now analyzing how ChatGPT fares for lung cancer screening recommendations and identifying ways to make ChatGPT's recommendations more accurate and complete, as well as understandable to those without a high level of education.

"With the rapid evolution of ChatGPT and other large language models, we have a responsibility as a medical community to evaluate these technologies and protect our patients from potential harm that may come from incorrect screening recommendations or outdated preventive health strategies," said Mark T. Gladwin, MD, Dean, University of Maryland School of Medicine, Vice President for Medical Affairs, University of Maryland, Baltimore, and the John Z. and Akiko K. Bowers Distinguished Professor.

Haver HL, Ambinder EB, Bahl M, Oluyemi ET, Jeudy J, Yi PH.
Appropriateness of Breast Cancer Prevention and Screening Recommendations Provided by ChatGPT.
Radiology. 2023 Apr 4:230424. doi: 10.1148/radiol.230424
