***Marzyeh Ghassemi*** *worries that explainable AI may make things worse rather than better. Her research suggests explainable AI is perceived to be more trustworthy even when it is in fact less accurate.*

Machine learning algorithms could provide huge benefits in healthcare, in some cases offering more reliable diagnoses than human doctors. However, many of us are reluctant to entrust our health to an algorithm, especially when even its designers can’t explain exactly how it arrived at its conclusion.

This has led to a push for transparency in AI design so that these tools can provide explanations for how they came to their decisions—an idea referred to as “explainable AI.”

However, Marzyeh Ghassemi worries that explainable AI may make things worse rather than better. Her research suggests that explainable AI is perceived to be more trustworthy even when it is in fact less accurate.

“Humans tend to over-trust AI systems in two specific scenarios,” says Ghassemi, a computer scientist specializing in machine learning for health and a faculty affiliate at the Schwartz Reisman Institute.

“Number one is when they believe the machine can perform a function they cannot, and number two is when there’s an expectation that the machine might mitigate some risks. Both of these are very true in many health settings, but particularly in emergency and intensive health settings.”

Thus, providing explanations of algorithmic medical recommendations could result in doctors over-trusting, and hence over-relying on, these recommendations even when they are mistaken.

Ghassemi’s research, which she presented at the Schwartz Reisman weekly seminar on September 30, 2020, looks at how both experts and non-experts make use of medical advice they believe to be machine-generated. Experts tend to rate advice they were told was from an AI as less reliable than advice from a human, whereas non-experts tend to trust both sources of advice equally.

However, experts’ lack of trust turns out to make little difference to how much they are influenced by this advice, or to the accuracy of their resulting recommendations or actions. The experts, who were suspicious of the AI advice, were just as likely to base their final decision on it as the novices who rated it as more trustworthy.

While experts were more accurate than non-experts overall, each group was equally accurate whether its members believed they were relying on AI or human advice. Furthermore, some doctors were particularly “susceptible,” getting the right diagnostic result only when given the correct advice. So, even when experts didn’t trust the AI advice, it could have had a disproportionate effect on their final diagnosis.

Ghassemi’s work relates to a number of intersections between the four conversations which guide the work of the Schwartz Reisman Institute. Her work asks how AI systems can produce the maximum benefit for humanity, but also how these systems can be designed fairly so that disadvantaged communities do not bear an unfair degree of risk when the systems fail, and how they might be used to combat the bias that already exists in the medical system.

Her work, and the work of other Schwartz Reisman researchers, goes beyond the direct consequences of new technology to explore how it is integrated into existing communities, and how it might transform these communities for better or worse.

In the context of Ghassemi’s work, if AI advice is unreliable, even experienced clinicians can be misled into providing the wrong diagnosis. This becomes especially problematic when AI tools are trained on data from certain conditions, and then those conditions change.

“We don’t know what happens when, for example, a new disease comes up or a pandemic occurs,” says Ghassemi, “and suddenly [the AI is] giving the wrong predictions because we haven’t updated our models fast enough.”

Transparency might seem to be a solution to this problem, giving us the ability to know when to trust an AI and when to doubt it. But in fact, Ghassemi suggests the opposite is true. She highlights a study called “Manipulating and Measuring Model Interpretability” that showed people were actually less likely to catch obvious errors made by a transparent model (one which gave a clear weighting formula to justify its decision) than they were to catch the same errors made by a “black-box” model (one which gave no explanation for how its decision was reached).

Ghassemi argues that we do need a certain kind of transparency, but not the kind traditionally championed by proponents of explainable AI. Instead, doctors need to know when the model is likely to be wrong, perhaps because of the population the AI system was trained on, or due to limitations in the data the system was trained on.
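One simple way to picture this kind of transparency is a check that warns a clinician when a patient looks unlike anyone the model was trained on. The sketch below is only an illustration under stated assumptions, not a method from Ghassemi’s research: the feature names, the toy training rows, and the helper functions `fit_ranges` and `out_of_range_features` are all hypothetical, and comparing values against the minimum and maximum seen in training is a deliberately crude stand-in for more careful techniques.

```python
def fit_ranges(training_rows):
    """Record the smallest and largest value of each feature seen in training."""
    features = training_rows[0].keys()
    return {
        f: (min(row[f] for row in training_rows), max(row[f] for row in training_rows))
        for f in features
    }


def out_of_range_features(patient, ranges):
    """List the features of a new patient that fall outside the training ranges."""
    return [f for f, (low, high) in ranges.items() if not low <= patient[f] <= high]


# Hypothetical training data: an adult-only cohort.
training_rows = [
    {"age": 34, "heart_rate": 72},
    {"age": 51, "heart_rate": 88},
    {"age": 67, "heart_rate": 95},
]
ranges = fit_ranges(training_rows)

# A child is outside the age range the model has seen, so the model's
# advice for this patient deserves extra scrutiny.
flags = out_of_range_features({"age": 8, "heart_rate": 90}, ranges)
print(flags)
```

A warning like this does not explain how the model reasons. It only tells the doctor that the model may be operating outside the population it was built for, which is the kind of signal the paragraph above describes.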

Another argument in favor of transparency is that it can reduce bias in decision-making. Machine learning can end up encoding biases against vulnerable populations. However, bias is clearly already a part of the medical landscape, even without the use of advanced technologies. As Ghassemi puts it: “Doctors are human, and humans are biased.” And machine learning models can inherit this bias from the data they are trained on. Knowing how a model works will not automatically catch this kind of bias; that requires knowledge of both how machine learning models work and the biases present in the clinical data.

Ghassemi argues that we need aggressive audits of machine learning models to identify potential bias, and this is where transparency can serve a valuable function.
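As a rough illustration of what one step of such an audit could look like, the sketch below compares a model’s accuracy and false-negative rate across patient groups. It is an assumption-laden example rather than a procedure from Ghassemi’s work: the group labels, the toy records, and the `audit_by_group` function are hypothetical, and a real audit would examine far more than two metrics.

```python
from collections import defaultdict


def audit_by_group(records):
    """Return per-group accuracy and false-negative rate.

    Each record is a (group, true_label, predicted_label) tuple with
    binary labels, where 1 means the condition is present.
    """
    counts = defaultdict(lambda: {"n": 0, "correct": 0, "positives": 0, "missed": 0})
    for group, truth, predicted in records:
        c = counts[group]
        c["n"] += 1
        c["correct"] += int(truth == predicted)
        if truth == 1:
            c["positives"] += 1
            c["missed"] += int(predicted == 0)

    report = {}
    for group, c in counts.items():
        report[group] = {
            "accuracy": c["correct"] / c["n"],
            # Undefined when a group has no true positive cases.
            "false_negative_rate": c["missed"] / c["positives"] if c["positives"] else None,
        }
    return report


# Hypothetical toy data: (patient group, true diagnosis, model prediction).
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 0, 0), ("A", 0, 0),
    ("B", 1, 0), ("B", 1, 1), ("B", 0, 0), ("B", 1, 0),
]
report = audit_by_group(records)
for group, metrics in sorted(report.items()):
    print(group, metrics)
```

In this toy data the model misses two of the three true cases in group B while missing none in group A. A gap like that would not be visible from overall accuracy alone, and it is the sort of disparity an audit is meant to surface.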

The overall takeaway from Ghassemi’s work is that we need to treat all advice, including AI-based advice, with appropriate suspicion because advice that seems transparent can cause overconfidence.

Rather than being used to justify trust in these systems, AI explanations can and should be used to audit and monitor them, catching their flaws and biases.

Want to know more?

Watch the video of Marzyeh Ghassemi’s talk, “Don’t expl-AI-n yourself: exploring ‘healthy’ models in machine learning for health.”

Review the related literature:

“A Review of Challenges and Opportunities in Machine Learning for Health,” M. Ghassemi et al. In AMIA Summits on Translational Science Proceedings, 2020.
“Clinical Intervention Prediction and Understanding Using Deep Networks,” H. Suresh et al. In Machine Learning for Healthcare Conference, pp. 322-337. 2017.
“Clinically Accurate Chest X-Ray Report Generation,” G. Liu et al. In Machine Learning for Healthcare Conference, pp. 249-269. 2019.
“Can AI Help Reduce Disparities in General Medical and Mental Health Care?,” I. Y. Chen et al. AMA Journal of Ethics, 21(2), 167-179. 2019.
“Hurtful Words: Quantifying Biases in Clinical Contextual Word Embeddings,” H. Zhang et al. In Proceedings of the ACM Conference on Health, Inference, and Learning, pp. 110-120. 2020.
“ClinicalVis: Supporting Clinical Task-Focused Design Evaluation,” M. Ghassemi et al. Google Brain Demo, 2018.

About the author

Benjamin Wald is a postdoctoral fellow at the Schwartz Reisman Institute for Technology and Society. His research focuses on moral theory and philosophy of action, with a particular interest in the intersection between the two, and the application of these studies to issues in the ethics of AI and machine learning. He received his PhD in 2017 from the Department of Philosophy at the University of Toronto.