Work on explainability and transparency has taken on considerable importance given the human and social risks posed by the use of machine learning techniques, in particular deep neural networks and, more recently, large language models.
What is explainability and why do we need it?
Among the many definitions in circulation, we adopt here the one given in standard ISO 22989 on AI terminology. This standard defines explainability as the property of an AI system to express important factors influencing the AI system's results in a way that humans can understand. The definition can be broadened to cover the rational elements that explain the behaviour of the system, beyond its results alone.
In the same standard, the transparency of an AI system is the property whereby appropriate information about the system is made available to relevant stakeholders. This can include its characteristics as well as elements of explanation, its limitations or its design choices.
Interpretability, for its part, refers to how readily a target audience can understand the behaviour of the system, with or without the use of explainability methods.
Several reasons have been advanced to justify the importance of these topics and their integration into responsible AI:
- AI-based systems can malfunction and produce erroneous results. Thus it is useful and even necessary to understand why, in order to improve or better define the scope of the system and allow AI results to be used more effectively.
- Users, or anyone else concerned, should be able to understand or obtain explanations of the results produced by these systems that affect them.
For example, the introduction of AI into companies marks the start of a continuous process of learning and adaptation, as AI has overturned how work is organised. The findings of LaborIA Explorer offer recommendations on empowering social and technological dialogue in favour of a so-called "enabling" integration of AI systems into the world of work. One of these recommendations is to make AI systems "explainable", so that decision-makers and users can understand how they work and have confidence in the results generated.
- Compliance and accountability, for example in the case of a malfunction leading to an accident, are impossible without an understanding of how the system operates.
- Algorithms and learning data may be tainted by social biases that need to be identified and eliminated. In this sense, explainability can play a part in understanding the origin of such biases without replacing specific bias management methods (see A Critical Survey on Fairness Benefits of Explainable AI).
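To illustrate this last point, here is a minimal sketch, on synthetic data, of how a standard feature-attribution method (permutation importance from scikit-learn) can surface a suspect dependence on a sensitive attribute. The feature names, data and model are purely illustrative assumptions, not taken from any system discussed above.

```python
# Minimal, hypothetical sketch: using permutation importance to surface a
# suspect dependence on a sensitive attribute. Data and features are invented.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(42)
n = 2000

# Synthetic data in which the label leaks the sensitive attribute.
income = rng.normal(50, 15, n)
sensitive = rng.integers(0, 2, n)          # e.g. membership of a protected group
label = ((income > 50) & (sensitive == 0)).astype(int)

X = np.column_stack([income, sensitive])
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, label)

# Permutation importance: how much does shuffling each feature hurt accuracy?
result = permutation_importance(model, X, label, n_repeats=10, random_state=0)
for name, imp in zip(["income", "sensitive"], result.importances_mean):
    print(f"{name}: {imp:.3f}")
# A high importance for "sensitive" flags a potential bias to investigate
# with dedicated fairness methods; explainability alone does not fix it.
```

A high importance score for the sensitive attribute is a signal to investigate further with dedicated bias-management methods, not a proof of discrimination or a remedy in itself.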
The limits and risks of explainability in terms of trust, manipulation and security
Though the aim is to instil trust and confidence in users, the relationship between trust and explainability is not systematic. Some studies (Kästner et al.) even tend to show the opposite: a loss of trust when explanations are provided to users, or phenomena of excessive trust that can be detrimental.
More precisely, two opposing trends have been observed. If an AI-based system offers predictions and explanations in line with the user's preconceived ideas, the user risks placing too much trust in those predictions. If the system produces predictions and explanations that run counter to the user's preconceived ideas, there is a risk that the user will distrust them. These studies show that explainability is not a systematic trust factor and can even give rise to mistrust. We therefore need to pay close attention to the context in which the system is used and to evaluate these risks in order to determine the level of explainability required and the objectives it should serve.
Moreover, although providing additional information on the operation of AI systems to make them more explainable has real advantages, it can also create new risks of distortion. As researchers Erwan le Merrer and Gilles Trédan have demonstrated, explanations can be manipulated. It is in fact very easy for a malicious entity to falsify the explanations of its decision-making algorithm, in the same way that a night club bouncer can claim he cannot let you in because you are not dressed right. The real reason can always be masked by another, seemingly plausible, explanation. All that is required is to build a second decision-making model that produces the same conclusions as the first but conceals the true reason behind one that gives the impression of a legitimate decision (for example, by omitting the real, discriminatory reason for being turned away from the night club). The aim of such falsification is to give the impression that the "black box" model behaves correctly, when this is not necessarily the case.
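The mechanism can be made concrete with a deliberately simplified, hypothetical sketch: a hidden decision rule that actually relies on a sensitive attribute, paired with a reported explanation that always invokes a plausible but false reason while remaining consistent with the observed decisions. The rules and field names below are invented for illustration and do not come from the work cited above.

```python
# Minimal, hypothetical sketch of a falsified explanation (the "bouncer" case).
# The decision rule and the reported reason are illustrative assumptions.

def true_decision(applicant: dict) -> bool:
    # The real (hidden) rule discriminates on a sensitive attribute.
    return applicant["group"] != "B"

def reported_explanation(applicant: dict) -> str:
    # The operator reports a plausible but false reason ("dress code")
    # that stays consistent with the same accept/reject outcome.
    if true_decision(applicant):
        return "admitted: meets the dress code"
    return "refused: does not meet the dress code"

applicant = {"group": "B", "dress_code_ok": True}
print(true_decision(applicant))         # False - actually refused because of 'group'
print(reported_explanation(applicant))  # "refused: does not meet the dress code"
```

The point of the sketch is that nothing in the reported explanation contradicts the observed decisions, which is precisely what makes this kind of falsification hard to detect from the outside.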
Although explainability helps to combat suspected injustice, it can also increase risks in terms of security or the exploitation of potential system vulnerabilities by users. This is the AI transparency paradox, which is proving to be a limitation to consider when designing AI systems. The publication of additional information can leave AI vulnerable to attack: by understanding the AI's "reasoning", attackers are better able to fool the algorithm.
Another concern is the protection of proprietary algorithms: researchers have recently demonstrated that entire models can be stolen simply by studying their explanations. An attacker who accesses the predictions and explanations of an algorithm through API requests can derive from them a training set sufficient to reconstruct a faithful and effective copy of the original decision-making model.
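This kind of extraction attack can be sketched, under simplifying assumptions, as a surrogate model trained on the outputs returned by the victim's API. For brevity the sketch below uses only the returned predictions; richer outputs such as confidence scores or feature attributions would make the copy even more faithful. The models, data and agreement measure are illustrative, not a reconstruction of the attacks described in the research cited above.

```python
# Minimal, hypothetical sketch of model extraction from query access.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Victim model (normally hidden behind an API).
X_private = rng.normal(size=(500, 4))
y_private = (X_private[:, 0] + 0.5 * X_private[:, 1] > 0).astype(int)
victim = LogisticRegression().fit(X_private, y_private)

# Attacker: query the API on synthetic inputs and record the outputs.
X_query = rng.normal(size=(2000, 4))
y_stolen = victim.predict(X_query)   # predictions returned by the API

# Train a surrogate on the queried labels: a functional copy of the victim.
surrogate = DecisionTreeClassifier(max_depth=5).fit(X_query, y_stolen)

# Measure how often the copy agrees with the original on fresh inputs.
X_test = rng.normal(size=(1000, 4))
agreement = (surrogate.predict(X_test) == victim.predict(X_test)).mean()
print(f"surrogate agrees with victim on {agreement:.0%} of test points")
```

Even this crude version typically reaches high agreement with the original model, which is why exposing detailed outputs and explanations through an API widens the attack surface.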
Finally, such disclosures can make companies more vulnerable to legal proceedings or regulatory action.
Legal obligations for explainability
From a regulatory standpoint, explainability is included in several statutory texts and can take different forms.
First of all, under the EU's General Data Protection Regulation, when an AI system uses personal data, the right to information and the principle of transparency apply. Under this regulation, the party responsible for collecting and, where applicable, annotating the data and using it to train an AI model must provide relevant information on the conditions under which this data is processed and, in particular, on its collection sources.
In the healthcare sector, the French bioethics law has introduced a reporting obligation for professionals who design and use artificial intelligence. As for algorithms used by public authorities in France, the law for a Digital Republic requires authorities to publicly list the algorithmic tools they use and to publish their rules.
The EU AI Act sets out transparency and information obligations towards users of high-risk systems, together with a new right to explanation for affected persons, as well as transparency obligations for AI systems that interact with humans, such as chatbots. Providers of general-purpose models are subject to a technical documentation obligation. Explainability may also play an important role in allocating liability between actors in the event of damage caused by AI.
To comply with AI regulations, organisations must also adapt to the local context in which their products are used.
Regulations differ depending on geographical area and sector of application. In the US, for example, banks are obliged to give a reason when refusing credit (Equal Credit Opportunity Act). In the recruitment sector, New York's law on Automated Employment Decision Tools (AEDT) promotes transparency and equality through a pre-deployment audit obligation. In China, a law has established a register of recommendation algorithms and requires explanations to be provided.
Conclusion
Although AI explainability allows the important factors influencing a system's results to be expressed in a comprehensible manner, to meet the needs of the general public, operators, users and auditors, certain limits in terms of trust and security should be taken into consideration.
In each case, close attention must be paid to the context in which the system is used, and these risks must be evaluated, in order to determine the necessary level of explainability and the objectives it is intended to achieve.
86: the number of the article in the EU AI Act that covers the new right to explainability.