Gathering and preparing data for machine learning can be challenging for organisations. What’s more, the development of certain artificial intelligence models consumes a lot of energy. Engineers and scientists are exploring new techniques that make it possible to design robust models with little data, but also with less energy. Welcome to frugal AI.
“It’s not who has the best algorithm that wins, it’s who has the most data.” This aphorism from artificial intelligence (AI) researcher Jean-Claude Heudin shows just how essential data is in machine learning. Deep learning algorithms in particular require huge amounts of annotated data when they are trained in supervised mode.
However, producing data is a long and costly process that sometimes requires rare expertise in specialised areas. In practice, AI engineers and data scientists often have to make do with a small dataset.
How, then, can robust machine learning models be built from little data while also taking their environmental impact into account? Four main paths are being explored.
Transfer learning: making new with old
A quick look at data repositories such as the UCI Machine Learning Repository, VisualData or Google Dataset Search shows that many labelled datasets have been made freely available by public bodies, universities or businesses.
In the area of 3D object detection, Google recently published its Objectron dataset, a collection of 15,000 video clips and millions of images of everyday objects. Each object is captured from different angles and annotated with a bounding box that describes its position, orientation and dimensions. The dataset comes with pretrained models.
Some datasets have become references, such as GenBank, the molecular database of the United States National Institutes of Health (NIH), which brings together all publicly available annotated DNA sequences.
These resources make transfer learning possible, a technique inspired by the cognitive process by which human beings apply previously acquired knowledge to new situations.
With this technique, an AI system can transfer the knowledge it acquired on a source task to a different but related target task, and thus adapt to a large number of use cases.
For example, what an algorithm has learnt in order to recognise cats, or to work out whether a film review is positive or negative, can be reused to distinguish dogs or to classify product reviews, respectively.
This approach is especially popular in deep learning, where pre-trained models serve as starting points for computer vision or natural language processing (NLP) tasks, which are particularly complex and time-consuming to train from scratch.
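The principle can be illustrated with a deliberately tiny sketch. It is only a toy analogue, not how deep learning frameworks implement the technique: a linear model stands in for the pre-trained network, its slope plays the role of the frozen, transferred layers, and only the offset is re-fitted on the target task. All function names and data below are hypothetical.

```python
# Toy analogue of transfer learning on a linear model y = w * x + b.
# Every name and number here is a hypothetical illustration.

def fit_linear(points):
    """Ordinary least squares for y = w * x + b (closed form)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


def fine_tune_offset(w_frozen, points):
    """Keep the transferred slope frozen and re-fit only the offset."""
    return sum(y - w_frozen * x for x, y in points) / len(points)


# Source task: plenty of labelled data, generated from y = 2x + 1.
source_data = [(float(x), 2.0 * x + 1.0) for x in range(10)]
w_source, b_source = fit_linear(source_data)

# Target task: related (same slope) but with a different offset (y = 2x + 5),
# and only two labelled examples available.
target_data = [(1.0, 7.0), (3.0, 11.0)]
b_target = fine_tune_offset(w_source, target_data)


def predict_target(x):
    """Prediction on the target task using the transferred slope."""
    return w_source * x + b_target


print(predict_target(10.0))
```

With only two target examples, fitting both parameters from scratch would leave no margin for noise. Reusing the slope learnt on the source task means a single parameter remains to be estimated, which is the intuition behind fine-tuning a pre-trained model on a small dataset.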
Curious algorithms and continuous learning
Active learning is used in cases where data is available but where labelling it is expensive.
This form of semi-supervised learning is based on the hypothesis that an algorithm performs better if it is “curious”. It introduces an oracle (a specialist doctor, for example) into the learning process.
Here, it is the algorithm that formulates the queries: it chooses which data the oracle should label, the principle being to pick the most relevant queries so as to maximise information gain. Active learning is widely used in NLP, which requires a lot of labelled data, little of which is freely available.
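The query loop described above can be sketched in a few lines. This minimal example uses uncertainty sampling, one common query strategy among several, on a one-dimensional pool of samples. The oracle, the pool and the boundary value are all hypothetical, and the sketch assumes the two extreme samples belong to opposite classes.

```python
# Uncertainty sampling on a 1-D pool; every name and value is hypothetical.

TRUE_BOUNDARY = 0.615  # known only to the oracle, never read by the learner


def oracle(x):
    """Stand-in for the human expert who labels a queried sample."""
    return 1 if x >= TRUE_BOUNDARY else 0


def fit_threshold(labelled):
    """Place the boundary midway between the closest opposite labels."""
    highest_negative = max(x for x, y in labelled if y == 0)
    lowest_positive = min(x for x, y in labelled if y == 1)
    return (highest_negative + lowest_positive) / 2


def active_learning(pool, ask, budget):
    """Query at most `budget` labels, always picking the most uncertain sample."""
    pool = sorted(pool)
    # Seed with the two extremes, assumed to belong to opposite classes.
    labelled = [(pool[0], ask(pool[0])), (pool[-1], ask(pool[-1]))]
    unlabelled = pool[1:-1]
    for _ in range(budget):
        if not unlabelled:
            break
        threshold = fit_threshold(labelled)
        # The most uncertain sample is the one closest to the current boundary.
        query = min(unlabelled, key=lambda x: abs(x - threshold))
        unlabelled.remove(query)
        labelled.append((query, ask(query)))
    return fit_threshold(labelled), len(labelled)


pool = [i / 100 for i in range(101)]
threshold, labels_used = active_learning(pool, oracle, budget=10)
print(f"boundary ~ {threshold:.3f} using {labels_used} labels out of {len(pool)}")
```

In this toy setting the learner locates the boundary after asking the oracle for only a dozen labels, whereas labelling the whole pool would have required 101. Real problems are noisier and higher-dimensional, so the savings are rarely this clean.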
As for incremental learning, it consists in continuously training an algorithm on data as and when it is received, each item being visible only once (this is known as a data stream). Unlike “offline” algorithms, where the model is built from a dataset that is available during the learning phase and then deployed on new data, such a system continues to learn once it is in production, integrating new knowledge at each increment.
This dynamic approach can be used to solve problems linked to data volume and availability, making it possible to compensate for limited physical resources – such as insufficient memory – which can slow down the learning process.
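As an illustration, here is a minimal sketch of an incremental learner: a nearest-centroid classifier that keeps one running mean per class and updates it each time a sample arrives. It is one simple possibility among many, and the class name, method names and data are hypothetical.

```python
# Incremental nearest-centroid classifier; names and data are hypothetical.

class StreamingCentroids:
    """Keeps one running mean per class; each sample is seen exactly once."""

    def __init__(self):
        self.count = {}
        self.mean = {}

    def learn_one(self, x, label):
        n = self.count.get(label, 0) + 1
        previous = self.mean.get(label, 0.0)
        # Running-mean update: no past sample needs to be stored.
        self.mean[label] = previous + (x - previous) / n
        self.count[label] = n

    def predict_one(self, x):
        # Assign the class whose centroid is closest to x.
        return min(self.mean, key=lambda label: abs(x - self.mean[label]))


model = StreamingCentroids()
stream = [(1.0, "low"), (9.0, "high"), (2.0, "low"),
          (10.0, "high"), (3.0, "low"), (11.0, "high")]
for x, label in stream:
    model.learn_one(x, label)  # the sample can be discarded afterwards

print(model.mean)
print(model.predict_one(4.0), model.predict_one(7.0))

# Once "in production", a new labelled observation simply shifts the centroid.
model.learn_one(6.0, "low")
print(model.mean)
```

Memory use depends on the number of classes, not on the number of samples seen, which is what makes this family of methods suitable when the full dataset cannot be held in memory.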
Environmental efficiency, the second component of frugal AI
Frugal AI comprises a second component: energy efficiency, another significant challenge posed by the widespread use of machine learning. Indeed, some researchers are now attempting to reduce the electricity consumption of AI systems, in particular that of artificial neural networks, which require phenomenal computing power to process data.
Initiatives have emerged in response, such as the Low-Power Computer Vision Challenge, a yearly competition that aims to improve the energy efficiency of computer vision.
In 2019, researchers from the Allen Institute for AI called for a more efficient and more inclusive AI, opening up the research field of Green AI, which treats environmental efficiency as a criterion for assessing a system's performance on a par with accuracy. This stands in contrast to Red AI, which seeks highly accurate results through massive computing power. The aim is an AI whose development, training and running costs would be low enough to “enable any inspired undergraduate with a laptop to write high-quality research papers”.
Engraving brain functioning onto electronic circuits
Naturally, a less data-intensive algorithm also tends to be less energy-intensive, but the quest for frugality goes even further.
Among the paths being explored is neuromorphic computing, which takes its inspiration from the structure and functioning of the human brain, itself remarkably energy-efficient, in order to completely rethink the physical architecture that supports deep learning.
Neuromorphic chips, which mimic biological neurons and synapses, are thus the subject of numerous research programmes, led notably by electronics and computing industry giants such as IBM, Intel and Qualcomm. Unlike in traditional chips, the computing and memory units sit close to one another, which reduces data transfer and therefore energy consumption as well as latency.
IBM’s TrueNorth chip combines 1 million individually programmable neurons and 256 million individually programmable synapses, spread over 4,096 parallel and distributed cores that are interconnected in an on-chip mesh network. The chip is said to consume one thousand times less power than a conventional processor of similar size.