Researchers highlight concerns over biased data and lack of oversight in AI-powered medical diagnostic tools
AI-powered health apps that promise medical diagnoses at the click of a button may be giving users unsafe and inaccurate health advice, according to a new study from McGill University researchers. The study, published in the Journal of Medical Internet Research, found that while these apps occasionally provide correct diagnoses, they often fail to detect serious conditions, which could result in delayed treatment for users.
The researchers tested two popular health apps by presenting them with symptom data from known medical cases. Their findings revealed that while the apps sometimes offered accurate diagnoses, they were also prone to overlooking more severe health issues, raising concerns about their potential to mislead users.
The study identified two primary issues with the apps: biased data and a lack of regulation. According to Ma’n H. Zawati, the study’s lead author and an Associate Professor in McGill’s Department of Medicine, biased data is a significant concern. Many of these apps are trained on datasets that do not adequately reflect diverse populations, leading to skewed assessments.
“These apps often learn from datasets that underrepresent certain demographic groups, including lower-income individuals, and have insufficient racial and ethnic diversity,” said Zawati. “This results in a cycle where the app’s recommendations are based on a narrow sample of users, leading to potential inaccuracies in diagnosis.”
While many health apps carry disclaimers stating they are not substitutes for professional medical advice, Zawati points out that users may not fully understand these warnings or may misinterpret them. The study also highlights another critical issue: the so-called “black box” nature of AI systems. Because these apps evolve with minimal human oversight, their decision-making processes are often opaque, making it difficult even for developers to understand how a particular diagnosis was reached.
“There is no clear regulation governing these apps,” Zawati explained. “Without oversight, developers are not held accountable for inaccuracies, and doctors are reluctant to recommend these tools. For users, this lack of transparency means a misdiagnosis could be just a click away.”
Zawati, who is also a member of McGill’s Department of Equity, Ethics and Policy, called for stronger oversight and regulation in the development of AI-powered health tools. He suggested that developers could mitigate these risks by training apps with more diverse datasets, conducting regular audits to identify biases, and enhancing transparency in how algorithms work.
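The study does not prescribe a specific auditing method, but as a rough illustration of the kind of regular bias audit Zawati recommends, the sketch below compares a symptom checker’s diagnostic accuracy across demographic groups on a labelled test set. Everything in it is hypothetical: the example records, the audit_by_group function, and the flagging threshold are assumptions for illustration, not details from the study.

from collections import defaultdict

# Hypothetical labelled test records: the app's predicted diagnosis,
# the clinician-confirmed diagnosis, and a demographic group label.
records = [
    {"group": "group_a", "predicted": "migraine",    "actual": "migraine"},
    {"group": "group_a", "predicted": "tension",     "actual": "tension"},
    {"group": "group_b", "predicted": "migraine",    "actual": "stroke"},
    {"group": "group_b", "predicted": "dehydration", "actual": "dehydration"},
    {"group": "group_b", "predicted": "tension",     "actual": "meningitis"},
]

def audit_by_group(records, relative_threshold=0.9):
    """Flag groups whose diagnostic accuracy falls well below the
    overall accuracy (here, below 90% of it)."""
    totals, correct = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r["group"]] += 1
        correct[r["group"]] += r["predicted"] == r["actual"]

    overall = sum(correct.values()) / len(records)
    report = {}
    for group, n in totals.items():
        accuracy = correct[group] / n
        report[group] = {
            "n": n,
            "accuracy": round(accuracy, 2),
            "flagged": accuracy < relative_threshold * overall,
        }
    return overall, report

overall, report = audit_by_group(records)
print(f"overall accuracy: {overall:.2f}")
for group, stats in report.items():
    print(group, stats)

In practice, a developer would run a check like this on a representative held-out dataset and repeat it whenever the underlying model is updated, so that gaps in accuracy for underrepresented groups are caught before they reach users.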
“By prioritizing thoughtful design and rigorous oversight, AI-powered health apps could become a valuable tool in clinical settings and help make healthcare more accessible to the public,” Zawati said.
The study urges developers and regulators to prioritize safety and equity as these tools become more integrated into everyday healthcare practices.
For more information, the full study can be found in the Journal of Medical Internet Research.
Reference: Ma’n H. Zawati et al, “Does an App a Day Keep the Doctor Away? AI Symptom Checker Applications, Entrenched Bias, and Professional Responsibility,” Journal of Medical Internet Research (2024), DOI: 10.2196/50344.