
The Illusion of Explanation: Empowering the Human in the Loop

In an era where machines increasingly shape decisions, a new narrative unfolds: one that empowers humans in the loop to comprehend the intricate processes steering our lives. Yet amidst this empowerment, a crucial question arises: do the explanations offered merely soothe our worries, or do they genuinely illuminate the situation? And when individuals act on these explanations, what is the outcome? This article introduces the concept of the illusion of explanation: users find emotional satisfaction in explanations and believe they grasp the underlying processes, even though the explanations fail to faithfully reflect how decisions are actually made. We then delve into the intricacies of explainers and explanations themselves, and into ways of working around their limitations so that their benefits can be safely harnessed.

Within the realm of nontransparent AI models, techniques emerge to extract explanations. While these methods serve a purpose, they grapple with challenges in faithfully portraying reality and the actual decision processes. As AI systems take center stage in critical decision-making, the connection between explanations and real-world actions becomes paramount. We explore this juncture, keeping two focal points in mind: the empowerment of humans in the loop and the translation of such explanations into tangible, real-world actions.

Explainers in the AI Landscape and the Human-in-the-Loop Approach

In the dynamic landscape of AI, explainability is intricately linked to the human-in-the-loop approach, aiming to offer users comprehensible insights into the outcomes generated by complex models. It also serves as a bridge for clear communication between the model and the humans involved (e.g., deployers and users). To achieve this, we turn to explainers: tools and methods specifically crafted to demystify the decision-making processes within AI models, offering insight into the key factors that influence outcomes. Explainers, versatile in their application, can also be employed with interpretable models. When coupled with interpretable models, they combine an understanding of internal processes with the machine's inputs and outputs, delivering coherent explanations that foster improved comprehension (for a more in-depth discussion, refer to our article on transparency, explainability, and interpretability). In essence, these tools craft explanations that spotlight the most pivotal features and variables involved in the model, providing users with a pathway to grasp the essentials of the decision-making process.

Various methods enable explainability, with two primary approaches: post-hoc explanations, which correlate and analyze the relations between inputs and outputs, and transparency-based methods, rooted in an understanding of the model's inner processes (interpretability). Explainer tools employed with black-box models predominantly rely on post-hoc explanations, whereas those integrated with interpretable models lean towards transparency-based methods.
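The post-hoc family can be sketched with a simple finite-difference probe: query the model purely as a black box and record how the output shifts when each input is nudged in isolation. The scoring function below is a hypothetical stand-in for any opaque model; the probe itself never looks inside it.

```python
def black_box_model(income, debt_ratio):
    # Hypothetical scoring rule; assumed unknown to whoever runs the probe.
    return 0.7 * income - 1.2 * debt_ratio

def sensitivity(model, inputs, eps=1e-4):
    """Finite-difference sensitivity of the output to each named input."""
    base = model(**inputs)
    result = {}
    for name, value in inputs.items():
        nudged = dict(inputs)
        nudged[name] = value + eps  # nudge one input, keep the rest fixed
        result[name] = (model(**nudged) - base) / eps
    return result

print(sensitivity(black_box_model, {"income": 1.0, "debt_ratio": 0.5}))
```

A probe like this recovers local input-output relations only; it says nothing about the model's inner mechanics, which is exactly where transparency-based methods come in.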

As emphasized by Google, explainers play a crucial role in facilitating human understanding and accurate decision-making[1]. This viewpoint underscores that explainers extend beyond mere tools for unveiling AI outputs; they form integral components of a collaborative system where human expertise intersects with AI capabilities. In practical applications, the human-in-the-loop approach ensures users can utilize explainers to interpret and contextualize AI decisions, creating a symbiotic relationship that amplifies the overall effectiveness of the technology.

One prevalent technique in explainability involves offering local explanations for individual predictions, providing valuable insights within a specific context. However, it’s essential to recognize that while these explanations can decode complex AI decisions for specific user cases, they may not construct a holistic, global understanding of the model’s behavior (more in the following section). Examining how non-technical users interact with eXplainable Artificial Intelligence (XAI), researchers pose the intriguing question of whether users overestimate their ability to comprehend complex systems when interpreting additive local explanations[2]. This highlights a challenge for explainers in real-world scenarios, particularly regarding their capacity to navigate the trade-offs between local interpretability and global model comprehension.

As the field progresses, addressing these challenges becomes pivotal to fully unlock the potential of explainers, enhancing AI transparency, and cultivating user trust.

Beyond the Illusion: Navigating the Challenges of AI Explanations

The illusion of an explanation can stem from various factors, both technical and human-centric. On the technical side, certain explanation methods may have only local relevance, involve abstract values that are hard to translate into real-life actions, or fail to reveal interactions between variables. At the same time, human-centric aspects can lead to misinterpretation, misuse, or over-reliance on explanations. While the reasons behind this illusion vary, users, deployers, and developers alike need to remain vigilant and equipped to discern the actual reasons behind the decisions in question.

I. Technical challenges in Decoding AI Decisions: Example of LIME and Shapley Values

Currently, widely adopted techniques like LIME (Local Interpretable Model-agnostic Explanations) and Shapley values play a vital role in shedding light on the decision-making processes of intricate AI and ML models. While these approaches are beneficial, it's crucial to recognize their limitations: they serve as complementary tools alongside other evaluation techniques[1], particularly those leveraging human expertise.

On one hand, LIME excels at crafting localized explanations, as seen in evaluating a specific loan applicant’s case. However, the challenge arises from the potential global inconsistency of these explanations. What might negatively impact one applicant’s score may have no effect or even be a positive attribute for another, leading to questions about algorithmic fairness. In finance, this could mean misunderstanding the factors influencing a loan approval decision due to a strong dependence on variables that make sense locally, but not globally. This sensitivity to input variations can significantly impact decision-making outcomes, including when applicants strive to improve their results.
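This local-versus-global tension can be made concrete with a stripped-down, LIME-style sketch: fit a linear surrogate to the model's behavior in a small neighborhood of one applicant and read off the slope as "the explanation". The one-feature model below is a hypothetical illustration (real LIME also weights samples by proximity and handles many features at once).

```python
import random

def model(credit_age):
    # Hypothetical score: best around credit_age = 5, worse on either side.
    return -(credit_age - 5.0) ** 2

def local_slope(f, x0, radius=0.5, n=200, seed=0):
    """Fit y = a*x + b to samples near x0; return the slope a."""
    rng = random.Random(seed)
    xs = [x0 + rng.uniform(-radius, radius) for _ in range(n)]
    ys = [f(x) for x in xs]
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var  # least-squares slope of the local surrogate

# The same feature gets opposite-signed explanations at two points:
print(local_slope(model, 2.0))  # positive: "more credit age helps"
print(local_slope(model, 8.0))  # negative: "more credit age hurts"
```

Each explanation is faithful in its own neighborhood, yet stitching them into a single global story ("credit age helps") would be wrong: precisely the inconsistency described above.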

On the flip side, Shapley values provide insights into the "significance" of each variable influencing the outcome, measuring each contribution one by one. Think of it like evaluating a single football player on the field, comparing the contribution each one makes to the team. However, the potential for misinterpretation lies in relying solely on individual feature attributions without considering their interplay. While the computation of Shapley values does take feature interactions into account, this aspect is easy to miss when interpreting results. This becomes especially challenging on complex, strongly non-linear decision surfaces where the interaction between features is crucial: it is the combination of features x AND y that plays the major role, not the separate contributions of features x OR y. Yet the attribution "per feature" presents itself as additive and independent for each feature.

Both LIME and Shapley explanations face this challenge, demanding a more sophisticated approach with a profound understanding of the models’ processes.

One of the major challenges arises when users try to take actions based on these explanations to improve their results. For example, if individuals learn that factors like debt-to-income ratio or credit history (e.g., credit age) contributed to a denied loan, they might attempt to improve these specific aspects. However, the intricate interactions between these values are not immediately apparent to users (and at times to other stakeholders as well). If actions are taken based on a flawed and incomplete understanding, they can lead to undesired outcomes. This means that, viewed in a global, real-life context, such explanations are interactive rather than additive in nature.

II. Explainers and Gardening: What Can Go Wrong?

To illustrate the aforementioned arguments, let’s delve into a simplified analogy involving gardening. Consider a greenhouse where we nurture plants, regulating factors such as greenhouse climate (humidity or dryness) and water supply. To achieve success, a fundamental understanding of the relationships between these factors is crucial; otherwise, attempts to enhance both may result in unintended consequences. For instance, when the climate is dry and watering is low, the plant’s growth is at its minimum. Upon receiving a signal indicating the plant is drying out, individuals with a limited grasp of the interplay between these factors might instinctively try to address both issues by simultaneously increasing watering and humidity variables. However, this well-intentioned approach can lead to overwatering, causing the plant to rot. The same holds true in reverse: if the plant is already overwatered, attempting to rectify it by decreasing both factors may exacerbate the issue.

The challenge with explainers, particularly for black-box models, lies in analyzing these results (interpreted as signals) without full insight into the underlying factors, as exemplified by the overwatering and drought signals in relation to the water supply and humidity variables. Looking at the Shapley values alone may put you in a situation where you try to adjust all available variables simultaneously to increase the final score. This common misconception, in which variables are presented as additive without considering their interplay, is a significant challenge; a synthetic dataset, detailed in the Appendix, has been developed to demonstrate it. Such a limited perspective can result in misinterpretations and misguided actions based on these explanations, leading to undesirable outcomes.

However, please keep in mind that explainers are of tremendous value when used correctly, with their limitations in mind. The challenges discussed emphasize the importance of transparency and communication about the inner workings of these tools and the associated models, ensuring that users, deployers, and other stakeholders clearly understand their potential limitations.

This illustration offers a highly simplified glimpse into the complexity of the challenges at hand. In real-world applications, the scenarios are considerably more intricate, involving a multitude of factors, the interplay of which is not always evident. Consider a medical context where a diagnostic model assesses the risk of a rare condition. As it dissects individual factors, such as symptoms attributed to a disease, accompanying explainers may inadvertently oversimplify the situation, overlooking the intricate interactions among these factors and their underlying causes. In such instances, it becomes imperative for the overseeing doctor to comprehend the model’s operations, gaining insights into its internal processes to sidestep potential misinterpretations of the model’s decisions.

III. Balancing and Communication Act: Addressing Human-Centric Challenges in AI Explainability

The discussion about challenges associated with AI goes beyond academic environments and is entering the law in full force, amid widespread concern about the consequences. The call for transparent explanations in decision-making is gaining momentum, underscored by the principles laid out in Executive Order 14110 on Artificial Intelligence and the Blueprint for an AI Bill of Rights. This ethical imperative asserts that decisions affecting individuals should be accompanied by clear justifications, in line with the principles of procedural justice. Beyond the technical challenges discussed earlier, human-centric challenges arise from both technical complexity and the nuances of how people digest information. Crafting an explanation that is comprehensible, reliable, and beneficial for both developers and users remains a considerable challenge.

For instance, there's a belief that people naturally explain phenomena through contrastive statements, comparing what happened to what didn't happen. This human inclination may not align with a model's data space, which encompasses all possible variations of input data the model might encounter[3]. At the same time, later research pinpoints the cognitive complexity involved, noting that people prefer a contrastive statement when it follows a negative outcome (e.g., a denied loan), further raising questions about what should constitute an explanation[4]. This divergence in human and model thinking underscores the importance for stakeholders to consider these differences when making or accepting algorithmic-based decisions.

A critical challenge in this landscape is misattributed trust (closely related to automation bias): placing excessive trust in a result produced by a "smarter-than-human" machine, which leads to over-reliance. Linked with the technical challenges discussed earlier, translating methods like Shapley values into practical insights proves more intricate than commonly perceived. Research reveals a troubling trend wherein deployers, lacking accurate mental models, misuse interpretability tools like SHAP, making deployment decisions based on flawed understanding[5]. Such instances result in an overestimation of human understanding, a phenomenon known as the illusion of explanatory depth (IOED), leading to misinterpretations of explanations. The non-transparency and lack of interpretability of ML systems can contribute to such behavior, since stakeholders have no real opportunity to learn about the inner workings.

These challenges emphasize the necessity for a cautious approach, rooted in a profound understanding of human-in-the-loop responsibilities, to prevent overreliance and misinterpretation of AI-generated explanations. This caution becomes particularly crucial in areas like finance and medicine. Beyond practical considerations, these challenges pose a broader philosophical question — what truly constitutes a good explanation — and beckon exploration across disciplines to fathom how humans comprehend reality.

Expert Systems in AI: Balancing Simplicity and Complexity for Enhanced Explainability

Historically, in the field of artificial intelligence, expert systems stand out for their unique ability to emulate human expertise and reasoning through implementation of logical rules. These systems, structured around a comprehensive set of “if-then” rules, offer a clear parallel to human cognitive processes, making their decision-making pathways more transparent and interpretable. This aspect is crucial in the context of AI explainers, as it facilitates a more in-depth understanding of how AI systems reach specific conclusions, thereby enhancing the trustworthiness and reliability of these systems.
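The "if-then" structure can be captured in a few lines of code. The sketch below is a minimal forward-chaining engine in the spirit of classic expert systems; the medical rules and facts are hypothetical illustrations, not a real diagnostic system.

```python
# Each rule: (set of conditions, conclusion). Firing a rule adds its
# conclusion to the known facts, possibly enabling further rules.
RULES = [
    ({"has_fever", "has_cough"}, "flu_suspected"),
    ({"flu_suspected", "high_risk_patient"}, "refer_to_doctor"),
]

def infer(facts, rules):
    """Repeatedly fire any rule whose conditions are all known facts."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

derived = infer({"has_fever", "has_cough", "high_risk_patient"}, RULES)
print(derived)  # includes 'flu_suspected' and 'refer_to_doctor'
```

The appeal for explainability is that every conclusion comes with an explicit chain of fired rules; the limitation, discussed next, is what happens when the rule set grows or the evidence becomes uncertain.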

However, the simplicity of expert systems, which is a boon for interpretability, can also be a limitation. As these systems scale and the complexity of their rule sets increases, they can encounter challenges such as handling uncertainty and managing logical inconsistencies. The rigid structure of traditional logical rules often struggles to accommodate the nuances and variability inherent in real-world scenarios. This limitation becomes particularly evident when expert systems are expected to handle ambiguous or conflicting information, a common occurrence in dynamic, real-world environments.

Moreover, the scaling of expert systems is another area of concern[6]. As the number of rules and the complexity of relationships between them grow, maintaining the system’s performance and interpretability becomes increasingly challenging. The system might become a ‘black box,’ ironically counteracting the initial goal of transparency and understandability.

In response to these challenges, contemporary technological advancements have introduced probabilistic rules models as a solution. These models incorporate probabilistic rules, which interact in intricate ways governed by the principles of Markov logic[7], to yield a comprehensive outcome. This approach effectively balances the need for interpretability with the ability to handle complex, uncertain scenarios. By integrating the robustness of probabilistic reasoning with the clarity of logical rules, these models offer a promising avenue for enhancing the explainability of AI systems, ensuring that they remain both reliable and understandable even in the face of growing complexity and uncertainty.
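The core idea can be illustrated with a toy, Markov-logic-style model (a simplified sketch; real Markov logic networks ground weighted first-order formulas). Each rule carries a weight, rules can be violated at a cost, and a possible world's probability is proportional to the exponential of the total weight of the rules it satisfies. The rules and weights below are hypothetical.

```python
from itertools import product
from math import exp

# Weighted rules over two boolean variables. A rule is a (weight, formula)
# pair; higher total weight of satisfied rules -> more probable world.
rules = [
    (2.0, lambda w: (not w["smokes"]) or w["cancer_risk"]),  # smokes => risk
    (1.0, lambda w: w["smokes"]),                            # weak prior
]

worlds = [dict(zip(["smokes", "cancer_risk"], bits))
          for bits in product([False, True], repeat=2)]

def weight(world):
    return exp(sum(w for w, rule in rules if rule(world)))

Z = sum(weight(w) for w in worlds)  # partition function

def prob(condition):
    return sum(weight(w) for w in worlds if condition(w)) / Z

# Soft inference: the implication raises, but does not force, the conclusion.
print(prob(lambda w: w["cancer_risk"] and w["smokes"]))
```

Unlike a hard if-then rule, the weighted version tolerates exceptions: worlds that violate a rule are merely less probable, not impossible, which is how these models absorb the ambiguity that breaks rigid expert systems.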


[1] Google. (n.d.). Introduction to AI explanations for AI platform. Google Cloud. https://cloud.google.com/ai-platform/prediction/docs/ai-explanations/overview

[2] Chromik, M., Eiband, M., Buchner, F., Krüger, A., & Butz, A. (2021). I think I get your point, AI! The illusion of explanatory depth in explainable AI. 26th International Conference on Intelligent User Interfaces. https://doi.org/10.1145/3397481.3450644

[3] Miller, T. (2019). Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267, 1–38. https://doi.org/10.1016/j.artint.2018.07.007

[4] Ramon, Y., Vermeire, T., Martens, D., Evgeniou, T., & Toubia, O. (2021). Understanding preferences for explanations generated by XAI algorithms. SSRN Electronic Journal, 1–18. https://doi.org/10.2139/ssrn.3877426

[5] Kaur, H., Nori, H., Jenkins, S., Caruana, R., Wallach, H., & Wortman Vaughan, J. (2020). Interpreting interpretability: Understanding data scientists' use of interpretability tools for machine learning. Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3313831.3376219

[6] Schaefer, T. (2023, December 20). Transparency in Decision-Making: Advantages of a Rule-Based Approach Part 1 and Transparency in Decision-Making: Advantages of a Rule-Based Approach Part 2.

[7] Richardson, M., & Domingos, P. (2006). Markov logic networks. Machine Learning, 62, 107–136.


Appendix

For the purpose of simulating the greenhouse gardening scenario and calculating Shapley values, we generate a synthetic dataset. This dataset incorporates variables x1 (representing climate) and x2 (representing watering), with the resulting output y reflecting the growth of the plants (refer to Illustration 1 in the article for a visual representation).

Illustration 1. The visualization of the example’s synthetic dataset.

Understanding the intricate relationships between various factors is paramount to ensuring successful outcomes; ignoring these relationships can lead to counterproductive results. Consider a scenario where the climate is arid (x1 = -1) and the level of watering is simultaneously low (x2 = -1). In such cases, a plant's growth is severely hindered. When a signal suggests the plant is drying, those with limited knowledge of the dynamics between these elements might hastily increase both watering and humidity. This reaction, though well-intentioned, could lead to excessive watering and result in plant rotting (y = 0) by pushing the conditions along the diagonal (x1 = x2) over the saddle to the other unfavorable side (x1 = 1 and x2 = 1). Conversely, if a plant suffers from overwatering, reducing both climate and watering simultaneously might worsen its condition.

Our constructed dataset serves to demonstrate this phenomenon, showing the Shapley values for x1 (climate) and x2 (watering) as x1_impact and x2_impact.
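A self-contained sketch of this computation is given below (the article's exact dataset and SHAP settings may differ; this reconstruction assumes x1, x2 take values in {-1, 1}, growth y = 1 when climate and watering are balanced, i.e. x1 ≠ x2, and y = 0 on the diagonal). Shapley values are computed exactly, averaging out absent features over the dataset as background.

```python
from itertools import permutations

data = [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def growth(x1, x2):
    # Pure interaction: growth only when climate and watering balance out.
    return 1 if x1 != x2 else 0

def value(fixed, point):
    """Fix features in `fixed` at the point's values, average out the rest."""
    total = 0.0
    for bg in data:
        x1 = point[0] if 0 in fixed else bg[0]
        x2 = point[1] if 1 in fixed else bg[1]
        total += growth(x1, x2)
    return total / len(data)

def shapley(point):
    """Exact Shapley values via marginal contributions over all orderings."""
    phi = [0.0, 0.0]
    orders = list(permutations([0, 1]))
    for order in orders:
        fixed = set()
        for i in order:
            before = value(fixed, point)
            fixed.add(i)
            phi[i] += value(fixed, point) - before
    return [p / len(orders) for p in phi]

for point in data:
    x1_impact, x2_impact = shapley(point)
    print(point, x1_impact, x2_impact)
```

On every row, x1_impact and x2_impact come out identical (-0.25 on the diagonal, +0.25 off it), so the attributions alone cannot tell a user whether to adjust climate, watering, or neither: only the interaction of the two matters.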

Table 1. Shapley values for the example’s dataset.

Clearly, this example simplifies the complexity faced in real-world situations, but it illustrates how important it is to understand the interactions of variables in decision-making. Typically, numerous variables interact in ways that are not immediately apparent, posing greater challenges in understanding and managing these dynamics.