● The complex task of evaluating the digital sector's carbon footprint has become a specific focus for academic research.
● The vast amounts of power demanded by the statistical learning methods behind artificial intelligence have made this issue all the more urgent.
Internet users find it hard to imagine that interacting with ChatGPT or watching videos recommended by YouTube generates very real greenhouse gas emissions. For researchers, on the other hand, the carbon footprint of computers and other digital devices is very much a hot topic in our era of global warming. They are keen to remind us that fossil fuels (coal, oil, and gas) are burned to generate the electricity on the grids where we charge our batteries and plug in our machines. Nor should we forget all the network and Internet infrastructure, most notably the data centres that store our data and applications and draw vast amounts of power worldwide.
In France, the issue is the subject of a national Digital and Environment Programme launched by the French Institute for Research in Computer Science and Automation in 2022. And the concerns that it has raised are shared around the world, not least because of the staggering scale of digital emissions that we are predicted to generate in the future. Earlier this year, Soumya Sudhakar, Vivienne Sze and Sertac Karaman of the Massachusetts Institute of Technology (United States) presented the alarming results of a model simulating the potential emissions of onboard data processing in electric autonomous vehicles, which make extensive use of sensors and artificial intelligence (AI). They notably concluded that the computing required by a global fleet of one billion autonomous vehicles would have a carbon footprint at least as big as the one currently generated by all of the world's data centres.
An issue made all the more urgent by the rise of AI
A study published in mid-February 2023 modelled the emissions generated by machine learning between 2012 (a breakthrough year in the field) and 2021. The two authors, a specialist researcher working for the company Hugging Face and a postdoctoral student at the Quebec Artificial Intelligence Institute, selected 95 ML algorithms mentioned in 77 scientific articles, drawn from five fields: image classification, object detection, machine translation, conversational agents or chatbots, and named-entity recognition (an aspect of natural language processing that consists of classifying words into categories: people, places, companies, dates, quantities, addresses, etc.).
It’s really hard to gather all the necessary information to perform detailed carbon footprint estimates.
The idea was not to evaluate the exact quantity of carbon dioxide linked to each of these, but rather to outline the main trends. “It’s really hard to gather all the necessary information to perform detailed carbon footprint estimates,” points out Sasha Luccioni of Hugging Face. “AI papers tend not to disclose the amount of computing power used, nor where training was carried out.”
Lower levels of performance do not necessarily imply lower emissions
The project focused on the training phase of the learning models, which requires a great deal of computing power. The first finding was that 73 out of 95 models were trained using electricity that was mainly generated from coal, natural gas and oil. By way of illustration, models powered by energy sourced from coal generated an average of 512g of CO2 equivalent per kilowatt-hour, as opposed to 100.6g for those that were mainly powered by hydroelectricity (emissions of several greenhouse gases were converted into a single CO2-equivalent figure). Secondly, in this context it is important to note that higher electricity consumption did not necessarily imply a larger carbon footprint, given the low emissions of models running on hydroelectricity. Another finding was that, when comparing two models powered by fossil fuels, lower performance did not necessarily go hand in hand with a lower carbon footprint.
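The arithmetic behind these figures can be sketched simply: a training run's footprint is its energy consumption multiplied by the carbon intensity of the grid that powered it. The intensity values below are the averages cited in the study; the 1 GWh training-energy figure is purely a hypothetical illustration, not a number from the article.

```python
# Sketch: carbon footprint of a training run = energy used (kWh)
# multiplied by the grid's carbon intensity (g CO2 equivalent per kWh).
# Intensity averages are those cited in the study; the energy figure
# used in the example is an assumption for illustration only.

GRID_INTENSITY_G_PER_KWH = {
    "coal": 512.0,    # average cited for mainly coal-powered grids
    "hydro": 100.6,   # average cited for mainly hydro-powered grids
}

def training_footprint_tonnes(energy_kwh: float, grid: str) -> float:
    """Convert training energy in kWh to tonnes of CO2 equivalent."""
    grams = energy_kwh * GRID_INTENSITY_G_PER_KWH[grid]
    return grams / 1_000_000  # grams -> tonnes

# Hypothetical 1 GWh training run on each grid type:
energy_kwh = 1_000_000
print(training_footprint_tonnes(energy_kwh, "coal"))   # 512.0 tonnes
print(training_footprint_tonnes(energy_kwh, "hydro"))  # ~100.6 tonnes
```

The same energy consumption thus yields a footprint roughly five times larger on a coal-heavy grid than on a hydroelectric one, which is why consumption alone is a poor proxy for emissions.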
The carbon footprint of machine translation algorithms has been declining since 2019
However, the researchers did not observe “a systematic increase of carbon emissions for individual tasks.” Footprints generated by image classification models and chatbots continued to grow, but those for machine translation algorithms have been declining since 2019.
Nevertheless, the fact that there was an overall increase was undeniable. Learning models generated an average of 487 tonnes of CO2 equivalent in 2015-2016. By 2020-2022, this figure, which covers training alone, had reached 2020 tonnes. Deployment also had a major impact. A single ChatGPT request can certainly be fulfilled at minimal cost in terms of energy, but the millions of requests directed every day to a constantly growing number of chatbots are much more problematic. "That is what I am working on now," points out Sasha Luccioni. "However, it remains a complex task, given that the manner in which models are deployed, the hardware used, and scaling, etc. all have a big influence on the energy required and carbon emitted."
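The scaling problem Luccioni describes can be illustrated with back-of-the-envelope arithmetic: a tiny per-request energy cost, multiplied by millions of daily requests, adds up quickly. Every figure below is an assumption chosen for illustration; none comes from the study.

```python
# Sketch: why cheap individual requests still add up at scale.
# All three inputs are illustrative assumptions, not study figures.

PER_REQUEST_KWH = 0.003        # assumed energy per chatbot request
REQUESTS_PER_DAY = 10_000_000  # assumed daily request volume
GRID_G_CO2_PER_KWH = 300.0     # assumed average grid carbon intensity

daily_kwh = PER_REQUEST_KWH * REQUESTS_PER_DAY          # total daily energy
daily_tonnes = daily_kwh * GRID_G_CO2_PER_KWH / 1_000_000  # grams -> tonnes
yearly_tonnes = daily_tonnes * 365

print(round(daily_tonnes, 2), "tonnes CO2e per day")
print(round(yearly_tonnes, 1), "tonnes CO2e per year")
```

Under these assumptions, inference alone would emit thousands of tonnes of CO2 equivalent per year, on the same order as the training footprints discussed above, and it grows linearly with request volume.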