Concerns over 'Exaggerated' Study Claims of AI Outperforming Doctors

Many studies claiming that artificial intelligence is as good as (or better than) human experts at interpreting medical images are of poor quality and are arguably exaggerated, posing a risk to the safety of 'millions of patients', warn researchers in The BMJ.

Their findings raise concerns about the quality of evidence underpinning many of these studies, and highlight the need to improve their design and reporting standards.

Artificial intelligence (AI) is an innovative and fast moving field with the potential to improve patient care and relieve overburdened health services. Deep learning is a branch of AI that has shown particular promise in medical imaging.

The volume of published research on deep learning is growing, and some media headlines that claim superior performance to doctors have fuelled hype for rapid implementation. But the methods and risk of bias of studies behind these headlines have not been examined in detail.

To address this, a team of researchers reviewed studies published over the past 10 years that compared the performance of a deep learning algorithm in medical imaging with that of expert clinicians.

They found just two eligible randomised clinical trials and 81 non-randomised studies.

Of the non-randomised studies, only nine were prospective (tracking and collecting information about individuals over time) and just six were tested in a 'real world' clinical setting.

The average number of human experts in the comparator group was just four, while access to raw data and code (to allow independent scrutiny of results) was severely limited.

More than two thirds (58 of 81) of the studies were judged to be at high risk of bias (problems in study design that can influence results), and adherence to recognised reporting standards was often poor.

Three quarters (61 studies) stated that performance of AI was at least comparable to (or better than) that of clinicians, and only 31 (38%) stated that further prospective studies or trials were needed.

The researchers point to some limitations, such as the possibility of missed studies and the focus on deep learning studies in medical imaging, which means the results may not apply to other types of AI.

Nevertheless, they say that at present, "many arguably exaggerated claims exist about equivalence with (or superiority over) clinicians, which presents a potential risk for patient safety and population health at the societal level."

Overpromising language "leaves studies susceptible to being misinterpreted by the media and the public, and as a result the possible provision of inappropriate care that does not necessarily align with patients' best interests," they warn.

"Maximising patient safety will be best served by ensuring that we develop a high quality and transparently reported evidence base moving forward," they conclude.

Myura Nagendran, Yang Chen, Christopher A Lovejoy, Anthony C Gordon, Matthieu Komorowski, Hugh Harvey, Eric J Topol, John P A Ioannidis, Gary S Collins, Mahiben Maruthappu.
Artificial intelligence versus clinicians: systematic review of design, reporting standards, and claims of deep learning studies.
BMJ 2020. doi: 10.1136/bmj.m689.
