• Explainable AI techniques combine methods and processes designed to show the operating logic of an algorithm and give users clear explanations of how the AI makes its decisions.
• A myriad of techniques already exist, depending on the context, the target audience and the impact of the algorithm. And that’s without factoring in generative AI systems, whose development raises new challenges for explainability techniques.
Choice of suitable methods
There is no universal explainability method suited to all types of AI models and data, or to every field of application. The choice of a method depends on the context, such as the audience targeted by the explanation (AI designers, regulators, customers or end users, business experts) or the level of impact of the algorithm (explaining a driverless car accident does not carry the same weight as explaining a video recommendation). Legal and regulatory frameworks also differ across geographical regions. [Explainability of artificial intelligence systems: what are the requirements and limits?]
Technological solutions must be built on a case-by-case basis with users.
The operating environment of explainability also has to be taken into account: is it mandatory for certain critical applications, is certification required before deployment, or is its purpose simply to make the system easier to use? So does the nature of the data (text, images, tabular data, time series or graphs). The diversity of contexts therefore accounts for the wide variety of technical explainability solutions available (Ali et al.).
However, we can identify several general dimensions along which these techniques can be classified:
- Explainable-by-design models, built from the outset to be interpretable (such as linear models or decision trees; see the sketch below), versus a posteriori (or post-hoc) explainability, which is applied once the decision model has been trained.
- Local explanations, which account for a model’s decision on a particular instance, versus global explanations, which describe how the model works in general, across all instances.
- Static explanations, which are fixed and do not change with user interaction, versus interactive explanations, which let users explore the results and thus provide a more dynamic and personalised understanding of the model’s decisions.
These dimensions are not mutually exclusive. A posteriori methods can be global or local. They can be applied to any type of machine learning algorithm or be specific to a particular architecture.
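To make the first of these dimensions concrete, here is a minimal sketch of an explainable-by-design model: a shallow decision tree whose learned rules can be printed and read directly, with no post-hoc method required. The dataset is a standard scikit-learn example chosen purely for illustration.

```python
# A minimal explainable-by-design model: a shallow decision tree whose learned
# rules can be printed and read directly, with no post-hoc method required.
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()

# Keeping the tree shallow trades some accuracy for human-readable rules.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(data.data, data.target)

# The fitted model is its own global explanation: a handful of if/then rules.
print(export_text(tree, feature_names=list(data.feature_names)))
```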
Explanation produced and its various forms
An explanation produced can take various forms (Bodria et al.):
- Visual representations in the form of graphs or heatmaps that allow users to visualise which characteristics most influenced the prediction. This type of representation is particularly suitable for visualising areas of interest in images (concepts or sets of pixels). We can cite algorithms such as Grad-CAM (Gradient-weighted Class Activation Mapping) or saliency maps for this type of explanation.
- Numerical representations, such as importance scores indicating the impact of each variable on the prediction (obtained with algorithms like LIME or SHAP), or examples: prototypes that capture the characteristics of a dataset, and counterfactuals that show how changing one or more variables of the instance being explained could have changed the model’s decision. These types of explanations are well suited to tabular data.
- Textual representations in the form of verbal descriptions of the reasons behind a decision or logical rules that explain the overall behaviour of the model.
- A surrogate (substitution) model, simpler than the decision model and directly interpretable (a linear model or decision tree, for example), which approximates the complex model locally; a minimal sketch of this idea follows the list.
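To illustrate importance scores and local surrogate models on tabular data, the sketch below approximates a “black box” classifier around a single instance with a locally weighted linear model, in the spirit of LIME. It is a simplified illustration, not the LIME algorithm itself: the perturbation scheme, kernel and surrogate are deliberately basic.

```python
# A LIME-style local surrogate: fit a simple linear model on perturbed copies
# of one instance, weighted by their proximity to it, then read off the
# coefficients as local importance scores.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

data = load_breast_cancer()
X, y = data.data, data.target
black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def local_surrogate(instance, n_samples=2000, kernel_width=1.0, seed=0):
    rng = np.random.default_rng(seed)
    scale = X.std(axis=0)
    # Perturb the instance with noise scaled to each feature's spread.
    samples = instance + rng.normal(0.0, scale, size=(n_samples, X.shape[1]))
    # Query the black box on the perturbed samples.
    preds = black_box.predict_proba(samples)[:, 1]
    # Weight each sample by its proximity to the instance (RBF kernel).
    dists = np.linalg.norm((samples - instance) / scale, axis=1)
    weights = np.exp(-(dists ** 2) / (kernel_width ** 2))
    # The interpretable surrogate: a ridge regression fitted locally.
    surrogate = Ridge(alpha=1.0).fit(samples, preds, sample_weight=weights)
    return surrogate.coef_  # one local importance score per feature

scores = local_surrogate(X[0])
for i in np.argsort(np.abs(scores))[::-1][:5]:
    print(f"{data.feature_names[i]}: {scores[i]:+.4f}")
```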
The growing popularity and technological promise of generative AI (in particular language models) call for the development of specifically adapted explainability methods. In the case of language models, examples include importance scores that identify the words of a query on which the model based its response, or methods that identify the most influential training examples used during the model’s learning process. Nevertheless, at the time of writing, the explainability of these models is still in its infancy, as their computational complexity and the sheer amount of data involved bring new technical challenges.
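As a rough illustration of the first idea, the sketch below computes leave-one-word-out importance scores for a query. The `model_confidence` function is a toy placeholder standing in for a real language-model call that would return, for instance, the probability the model assigns to its answer.

```python
# Leave-one-word-out importance scores: measure how much the model's confidence
# drops when each word of the query is removed.
def model_confidence(query: str) -> float:
    # Toy placeholder for a real language-model call; it simply rewards the
    # presence of two keywords so that the example runs end to end.
    return 0.5 + 0.25 * ("refund" in query) + 0.25 * ("delayed" in query)

def word_importance(query: str) -> dict:
    words = query.split()
    baseline = model_confidence(query)
    # A large drop in confidence means the model relied heavily on that word.
    return {w: round(baseline - model_confidence(" ".join(words[:i] + words[i + 1:])), 3)
            for i, w in enumerate(words)}

print(word_importance("can I get a refund because my train was delayed"))
```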
Evaluation of explanations
The evaluation of explanations is also a challenge: there is no agreed definition of a “good explanation”. We can only try to list a set of quantitative or qualitative properties that we would like to obtain:
- Fidelity: the explanation faithfully reflects what the “black box” model predicts (see the sketch after this list for one way of measuring it).
- Robustness: slight variations in the characteristics of an instance or in the functioning of the model do not substantially alter the explanation.
- Realism: the explanation resembles instances from the decision model’s training data.
- Actionability: the user must be able to act on certain characteristics to change the decision.
- Finally, comprehensibility is a major challenge in how explanations are presented (their form and representation): the explanations generated must be easy to understand for the humans likely to need them.
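To give an idea of how fidelity might be quantified, the sketch below measures the agreement between a surrogate and the black box it explains on samples drawn around the instance of interest, for example by passing the black box’s probability function and the surrogate’s prediction function. It is only one possible metric among many, and the names are illustrative.

```python
# One possible fidelity metric: how closely a surrogate's predictions agree
# with the black box on samples drawn around the instance being explained.
import numpy as np

def local_fidelity(black_box_predict, surrogate_predict, instance, scale,
                   n_samples=1000, seed=0):
    rng = np.random.default_rng(seed)
    samples = instance + rng.normal(0.0, scale, size=(n_samples, len(instance)))
    bb = np.asarray(black_box_predict(samples))
    sg = np.asarray(surrogate_predict(samples))
    # R^2-style agreement: 1.0 means the surrogate perfectly mimics the black box.
    return 1.0 - np.sum((bb - sg) ** 2) / np.sum((bb - bb.mean()) ** 2)
```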
The challenge of causal explanations
One of the limitations of most current explainability methods is that they do not consider potential causal links between different variables. AI systems generally rely on learning correlations between variables, but are not capable of reasoning in terms of causes and effects. This can lead to misinterpretations or unrealistic explanations. To overcome this, a certain number of “causal” versions of existing explainability methods have recently been studied, such as causal SHAP (based on SHAP) or CALIME (based on LIME).
Explainability of optimisation models
The explainability of optimisation algorithms sits at the crossroads of AI and Operational Research (OR). OR algorithms are used to optimise decisions and processes in complex systems using mathematical models (routing a set of trains on a network, scheduling production tasks in a factory or optimising a supply chain). Let’s take the case of train routing. It is possible to clearly define the conditions of a feasible routing, such as no two trains occupying the same track at the same time, or delays limited to 10 minutes. The fact remains that the mathematical formulation of an OR problem is often opaque to the end user, especially when the model has thousands of variables and constraints. This is where an OR explainability approach becomes crucial.
One possible approach to making an OR tool more explainable is to incorporate counterfactual explanations. In this context, the decision-maker can question the tool about certain parts of the solution and ask for explanations, or even for changes. As an example, let’s imagine a decision-maker responsible for planning the daily routes of their delivery lorries. The solution returned by the planning tool may not be satisfactory, for example if one of the delivery drivers is assigned a much longer route than the others. The decision-maker can question the tool about the reasons behind this long route and, if necessary, ask what should be changed in the tool’s settings to shorten it (Lerouge et al.).
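To give a flavour of this kind of interaction, the toy sketch below assigns delivery routes to drivers with a small integer programme (using the PuLP library), then re-solves after adding a cap on each driver’s workload, mimicking the question “what should change so that no route is excessively long?”. The data and the formulation are illustrative and far simpler than the models discussed by Lerouge et al.

```python
# Toy what-if interaction with an optimisation model: assign delivery routes to
# drivers at minimum cost, then re-solve with a cap on each driver's workload.
from pulp import LpProblem, LpVariable, LpMinimize, lpSum, LpBinary, PULP_CBC_CMD

routes = {"R1": 50, "R2": 45, "R3": 20, "R4": 15}   # route -> length in km (illustrative)
cost_per_km = {"D1": 1.0, "D2": 1.5}                # D2 is pricier, so D1 gets overloaded
drivers = list(cost_per_km)

def plan(max_km_per_driver=None):
    prob = LpProblem("delivery_planning", LpMinimize)
    x = {(d, r): LpVariable(f"x_{d}_{r}", cat=LpBinary) for d in drivers for r in routes}
    # Each route is assigned to exactly one driver.
    for r in routes:
        prob += lpSum(x[d, r] for d in drivers) == 1
    # Objective: minimise the total cost of the plan.
    prob += lpSum(cost_per_km[d] * routes[r] * x[d, r] for d in drivers for r in routes)
    # The decision-maker's "what should change?" question becomes an extra constraint.
    if max_km_per_driver is not None:
        for d in drivers:
            prob += lpSum(routes[r] * x[d, r] for r in routes) <= max_km_per_driver
    prob.solve(PULP_CBC_CMD(msg=0))
    return {d: sorted(r for r in routes if x[d, r].value() > 0.5) for d in drivers}

print("Cheapest plan:        ", plan())                    # D1 ends up with every route
print("Plan with a 70 km cap:", plan(max_km_per_driver=70))
```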
Within Orange, optimisation algorithms are used in particular to plan certain marketing campaigns. They allow us to allocate relevant communications to our millions of customers every week. Making these allocation algorithms explainable and understandable to marketing teams is therefore a major challenge.
In general, analysing the contexts in which an AI system is used is fundamental to determining what kinds of explanations the potential recipients need and how they should be formatted.
Outlook
One possible approach to implementing explainability is to mobilise user-centric design, an iterative design process that emphasises user involvement at every stage of design and development, whether those users operate the system or are its end users [A User-Centred Approach to AI Explainability].
Read more:
https://hellofuture.orange.com/en/for-a-contextual-approach-to-explainability/
Christoph Molnar’s book Interpretable Machine Learning