• Model compression is proposed as a key solution for deploying neural networks on devices with limited hardware resources, such as AI-capable PCs or edge computing devices.
• The adoption of co-design frameworks for hardware and software architectures will play an essential role in the drive to optimise the fairness and performance of AI models. The integration of non-volatile memory (NVM) devices and noise reduction in neuromorphic systems are also promising avenues for future development.
Algorithmic biases are usually blamed for issues that call into question the integrity and reliability of artificial intelligence systems. However, we tend to forget that hardware may also contribute to such problems, and this is increasingly the case now that AI is being deployed on platforms with limited resources, such as edge computing devices, the AI PCs making inroads into the medical world, and a new generation of AI smartphones. In a recent study of the relationship between deployment platforms and the design of deep neural networks (DNNs), “Hardware design and the fairness of a neural network”, a team of researchers from the University of Notre Dame (Indiana, USA) demonstrated the extent to which hardware affects fairness in AI results. As Yiyu Shi, a professor of computer science and engineering at the University of Notre Dame and a co-author of the article, explains, this finding is all the more important now that “more and more users want to run models outside of the cloud for reasons of data confidentiality, which is particularly the case in the European Union.”
The risk is that hardware will affect the performance of models in a manner that is not consistent for different demographic groups
Limiting bias by compressing models
“We investigated the relationship between hardware and fairness by conducting a series of experiments using different hardware setups, particularly focusing on CiM [editor’s note: compute-in-memory] architectures.” In large models, the number of synaptic weights, which runs into the billions, can slow down processing and reduce the energy efficiency of systems. Manufacturers have therefore designed specific chips to cope with this workload, but they are not ideal. “The risk is that hardware non-idealities such as variations during the programming process will affect the performance of the model in a manner that is not consistent for different demographic groups and exaggerate fairness issues.” And that’s not all, because in theory this type of variation in results is impossible to control. To overcome this challenge, which could have huge implications in future scenarios where AI systems become ubiquitous, the research team evaluated several strategies to counterbalance the shortcomings of the hardware architectures studied. To avoid widely varying results, “model compression is one of the most effective approaches to enable the deployment of neural networks on edge and mobile devices with limited hardware resources,” points out Yiyu Shi. In his view, it is better to compress large language models so that they fit onto devices like smartphones than to install small models. Previous research by his group has found that the compression process can actually be used to mitigate some fairness issues.
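To give readers a concrete picture, the kind of compression Shi describes can be sketched in a few lines of NumPy. The snippet below is an illustrative toy combining two common techniques, magnitude pruning and 8-bit quantisation; it is not the method used in the study, and every name and parameter in it is invented for the example.

```python
import numpy as np

def compress_weights(w, prune_ratio=0.5):
    """Toy compression: magnitude pruning followed by 8-bit quantisation."""
    # Prune: zero out the smallest-magnitude weights.
    threshold = np.quantile(np.abs(w), prune_ratio)
    pruned = np.where(np.abs(w) < threshold, 0.0, w)
    # Quantise: map the surviving float weights to 8-bit integers plus one scale.
    max_abs = np.abs(pruned).max()
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = np.round(pruned / scale).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)  # a stand-in weight matrix
q, scale = compress_weights(w, prune_ratio=0.5)
sparsity = (q == 0).mean()
print(f"sparsity: {sparsity:.2f}, storage: int8 values + one float scale")
```

Half the weights are removed and the rest shrink from 32-bit floats to 8-bit integers, roughly an 8x reduction before any sparse encoding, which is what makes billion-weight models fit on resource-limited devices.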
Rethinking hardware and software architectures
It is also crucial for manufacturers to develop architectures that take device variability [editor’s note: differences and inconsistencies in performance between memory types] into account to improve both the accuracy and fairness of models. “Software and hardware must both be taken into account in the design of effective deep neural networks, and this approach can also facilitate the optimisation of CiM accelerators and DNN architecture.” With this in mind, the programming of the non-volatile memory (NVM) devices used by CiM architectures should be addressed at the design stage, with a view to improving DNN fairness and limiting device variability and the associated noise.
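The effect the researchers describe, where identical weight noise degrades accuracy unevenly across groups, can be illustrated with a toy simulation. The sketch below is our own construction, not the study’s setup: it builds two synthetic groups whose inputs sit at different distances from a linear decision boundary, then adds Gaussian “programming” noise to the weights. All names and parameter values are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
w = rng.normal(size=d)
w /= np.linalg.norm(w)  # ideal (noise-free) classifier weights

def make_group(n, margin):
    """Synthetic group whose points sit at least `margin` from the boundary."""
    x = rng.normal(size=(n, d))
    x += margin * np.sign(x @ w)[:, None] * w  # push points away from boundary
    y = (x @ w > 0).astype(int)                # ideal weights classify perfectly
    return x, y

x_a, y_a = make_group(5000, margin=0.2)  # group A: inputs near the boundary
x_b, y_b = make_group(5000, margin=2.0)  # group B: inputs far from the boundary

def acc_under_noise(x, y, sigma, trials=50):
    """Mean accuracy when Gaussian programming noise perturbs the weights."""
    accs = []
    for _ in range(trials):
        w_prog = w + rng.normal(scale=sigma, size=d)  # one noisy "programming"
        accs.append(((x @ w_prog > 0).astype(int) == y).mean())
    return float(np.mean(accs))

sigma = 0.3
print("group A accuracy under noise:", acc_under_noise(x_a, y_a, sigma))
print("group B accuracy under noise:", acc_under_noise(x_b, y_b, sigma))
```

Both groups are classified perfectly by the ideal weights, yet under the same noise the near-boundary group loses noticeably more accuracy: the hardware non-ideality, not the algorithm, creates the disparity.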
In the future, transformer-based AI models will continue to run on conventional hardware architectures. “However, if we can advance towards SNNs (spiking neural networks) that are designed for neuromorphic computing, we will then have an architecture that is much better suited to AI models.” As it stands, optimising neural network structures under hardware constraints while still ensuring fairness remains a challenging process, complicated by the supplementary objective of combating bias. It also highlights the need for new design frameworks based on reinforcement learning and evolutionary algorithms.
Sources:
Guo, Y., Yan, Z., Yu, X. et al. Hardware design and the fairness of a neural network. Nat Electron 7, 714–723 (2024). https://doi.org/10.1038/s41928-024-01213-0
Read more:
Jia, Z., Chen, J., Xu, X. et al. The importance of resource awareness in artificial intelligence for healthcare. Nat Mach Intell 5, 687–698 (2023). https://doi.org/10.1038/s42256-023-00670-0