● Having developed methods to measure the carbon footprint of machine-learning tools, Selvan aims to improve AI models to limit their impact on climate.
● Redesigned processing hardware, as well as optimized software and model training, can contribute to the drive to build high-performance AI tools with lower energy needs.
Is artificial intelligence (AI) adding to the problem of climate change, or is it helping to solve it?
Raghavendra Selvan. As scientists, we aim to make greater use of high-performance tools, but without having them contribute to climate change. In our research, AI models enable us to tackle a wide variety of problems. For example, I am working with a number of climatologists on projects that employ AI for complex tasks such as monitoring the development of insect populations with optical sensors, in order to establish accurate indicators of changes in biodiversity. To this end, we have created machine-learning models capable of the unsupervised detection of species groups, that is to say, detection without any human supervision. Studies of this kind, when analysed in the context of climate change, allow us to be much more precise in our scientific evaluation of its impact. And since we present AI as a tool for answering environmental challenges, for example to improve climate modelling or optimize energy networks, it is also very important to ensure that it is energy efficient.
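As an illustration of the general idea (not the group's actual pipeline), unsupervised grouping of sensor detections can be sketched in a few lines of Python; the feature vectors and cluster count below are hypothetical stand-ins:

```python
# Minimal sketch of unsupervised grouping of insect detections.
# The features stand in for embeddings extracted from optical-sensor
# recordings; shapes and values are illustrative only.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical: 500 detections, each described by a 16-dimensional feature vector.
features = rng.normal(size=(500, 16))

# Group detections into candidate species clusters without any labels.
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(features)

# Cluster sizes can then be tracked over time as a biodiversity indicator.
print(np.bincount(cluster_ids))
```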
The amount of processing power needed to build new machine-learning models is doubling every six months
Why does AI have a much greater environmental impact than previous computer tools?
In the past, when we created computer translation tools, they were based on an understanding of the grammar and syntax of language pairs. Nowadays, we take existing data from both languages and ask the tool to deduce its own rules, which requires vast amounts of data and considerable processing power. And the amount of processing power needed to build today's machine-learning models is increasing exponentially: it is doubling every six months.
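For a sense of what that doubling rate implies, here is a quick back-of-the-envelope calculation (the time horizons are purely illustrative):

```python
# Growth implied by "compute doubles every six months".
doubling_period_months = 6

for years in (1, 2, 5):
    growth = 2 ** (12 * years / doubling_period_months)
    print(f"After {years} year(s): x{growth:,.0f} more compute")
# 1 year -> x4, 2 years -> x16, 5 years -> x1,024
```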
What consumes all of this energy?
High-end graphics processors, or GPUs, which have traditionally featured in gaming computers, can multiply matrices very quickly, but this requires a proportional increase in energy consumption: we have to train these models on server farms with hundreds or even thousands of GPUs. To give you an example, it took 188,000 kWh to train GPT-3. That's the equivalent of driving 700,000 kilometres in an electric car. And that was just the energy required to train the model, not counting what is needed to use it. With hundreds of millions of people using the service, we can expect a huge increase in consumption, and it is impossible to make predictions on this point. In reality, the cost in terms of energy is probably much higher, because models have to be trained and retrained several times before they are fine-tuned. OpenAI has confirmed, for example, that the number of calculations it needs to perform doubles every three and a half months.
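The arithmetic behind that comparison is straightforward; the implied per-kilometre figure below simply divides the two numbers quoted in the interview:

```python
# Rough arithmetic behind the electric-car comparison.
training_energy_kwh = 188_000      # reported energy to train GPT-3
ev_distance_km = 700_000           # stated electric-car equivalent

implied_consumption = training_energy_kwh / ev_distance_km
print(f"Implied EV consumption: {implied_consumption:.2f} kWh/km")  # ~0.27 kWh/km
```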
How can the energy efficiency of machine-learning models be improved?
There are methods to scale down the size of machine-learning models. For example, we know that GPT-3 has 175 billion parameters, but in reality this number can be reduced. We should also revamp large models to take into account the way they are used, and adjust their design to optimize energy use.
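One common way to reduce parameter counts is magnitude pruning. The sketch below uses a toy PyTorch network rather than GPT-3, and the 50% pruning ratio is chosen purely for illustration:

```python
# Minimal sketch of magnitude pruning as one way to shrink a trained model.
# Real compression pipelines combine pruning with fine-tuning, distillation
# or quantization; this toy network only shows the mechanics.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Zero out the 50% of weights with the smallest magnitude in each Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # make the pruning permanent

total = sum(p.numel() for p in model.parameters())
zeros = sum((p == 0).sum().item() for p in model.parameters())
print(f"{zeros / total:.0%} of parameters are now zero")
```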
In what we call neural architecture search, which is the study and design of artificial neural networks, there are trade-offs. Today our goal is to get very good results on certain tasks, like translation: if we want results that are 100% accurate, we know that we will need very large language models, but if we settle for 99% accuracy, then we can really reduce their complexity. So we need to strike a compromise between performance and complexity.
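That compromise can be stated as a simple selection rule: among candidate architectures, pick the smallest one that meets the accuracy you are willing to accept. The parameter counts and accuracies below are made-up placeholders, not benchmark results:

```python
# Illustrative performance/complexity trade-off: choose the smallest model
# that stays within a chosen accuracy budget. In practice these numbers
# would come from evaluating trained candidate architectures.
candidates = [
    {"name": "xl",    "params": 1_000_000_000, "accuracy": 1.00},
    {"name": "large", "params":   100_000_000, "accuracy": 0.995},
    {"name": "base",  "params":    10_000_000, "accuracy": 0.99},
    {"name": "small", "params":     1_000_000, "accuracy": 0.95},
]

target_accuracy = 0.99
eligible = [c for c in candidates if c["accuracy"] >= target_accuracy]
choice = min(eligible, key=lambda c: c["params"])
print(f"Chosen model: {choice['name']} ({choice['params']:,} parameters)")
# Accepting 99% instead of 100% here cuts parameters by two orders of magnitude.
```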
Is this energy consumption measurable?
I am part of the team that developed Carbon Tracker, a Python tool that allows us to monitor and predict the carbon footprint of training and developing deep-learning models. The tool, which has been downloaded 65,000 times since its launch three years ago, came about in response to a real need for concrete figures on the impact of AI. It allows us to answer hypothetical questions like "how much energy would be required to run this model over ten years in this or that iteration?". Today, for reasons of competition, some companies don't want to disclose how their language models are trained. However, the scientific and IT community can go to work on accessible open-source models like LLaMA and Stable Diffusion to ensure they are optimized and trained on machines that consume less energy.
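For reference, the basic usage pattern of the open-source carbontracker package looks roughly like this; the training step is a placeholder:

```python
# Rough usage sketch of the carbontracker package (pip install carbontracker).
from carbontracker.tracker import CarbonTracker

max_epochs = 10
tracker = CarbonTracker(epochs=max_epochs)

for epoch in range(max_epochs):
    tracker.epoch_start()
    # ... one epoch of model training goes here ...
    tracker.epoch_end()

# Ensure a final report is written even if training stops early.
tracker.stop()
```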
As for hardware…?
The manner in which hardware is deployed can also be improved: in practice, calculations that use 32 bits can often be done with only 8 bits, so instead of four GPUs you can obtain the same results with just one, which has a major impact on the amount of energy needed. Data centres, which generate large amounts of heat, also require a lot of energy for cooling. In university centres, for example, one watt is lost to cooling for every watt used in processing. So there is plenty of scope for optimizing the use of infrastructure. In any case, private companies have a de facto incentive to reduce costs, which impels them to monitor power efficiency, not least because of the cost of energy…
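As a rough sketch of the 32-bit to 8-bit idea, PyTorch's post-training dynamic quantization stores the weights of linear layers as 8-bit integers; the toy model below is illustrative only, and actual savings depend on the model and hardware:

```python
# Minimal sketch of post-training dynamic quantization in PyTorch:
# Linear-layer weights are stored as 8-bit integers instead of 32-bit floats.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(quantized(x).shape)  # same interface, smaller weights in memory
```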