• A research project undertaken by the start-up Osmo, with support from Google, has recently broken new ground using a machine learning approach based on graph neural networks (GNNs).
• The new approach has led to the creation of a predictive model that can map molecular structure to odour descriptors.
Today’s artificial intelligence systems have accomplished a lot, notably learning how to recognise objects, faces, sounds, voices, and tactile signals. However, the digital processing of olfactory information remains in its infancy. “Cameras and microphones are cheap and pervasive, so it’s easy to collect new data for vision and hearing. Collecting data for scent either involves using complex and expensive instruments or slow and arduous training processes for human scent testers,” explains Alex Wiltschko, a neuroscientist and the founder of Osmo Labs. A start-up launched in January 2023, Osmo Labs was spun off from a research project conducted by Google Brain (Google’s AI division, which has since merged with DeepMind to form Google DeepMind), which sought to apply neural network technology to olfactory data.
Data in the form of nodes and links
In early September 2023, researchers from Osmo published an article in the journal Science detailing a model that maps molecular structure to odour perception using graph neural networks (GNNs). GNNs are particularly well suited to work on olfaction because they represent data as nodes and links, much as a molecule is made up of atoms and bonds. “Graph neural networks (GNNs) are fairly new models and in recent years they have gathered a lot of attention from researchers, not just in olfaction, but also in chemistry and biochemistry,” explains Matej Hladis, a doctoral student at the Nice Institute of Chemistry (which is jointly managed by the CNRS and Côte d’Azur University) and the co-author of another ground-breaking research paper on this topic published in February 2023. “Previously, we had to encode this graph structure to a set of numbers, usually by calculating physico-chemical properties of the molecule, but it was cumbersome, and some information was lost in the process.”
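To make the "nodes and links" idea concrete, here is a minimal sketch in plain Python. It is not the Osmo model: the molecule (ethanol with hydrogens omitted), the one-hot atom features, and the unweighted sum used for message passing are all assumptions chosen for illustration. A real GNN would apply learned weights at each step.

```python
# Hypothetical toy molecule: ethanol (CH3-CH2-OH), hydrogens omitted.
atoms = ["C", "C", "O"]        # nodes
bonds = [(0, 1), (1, 2)]       # links (undirected)

ELEMENTS = ["C", "O"]          # assumed feature vocabulary for this toy case


def one_hot(symbol):
    """Encode an element symbol as a one-hot feature vector."""
    return [1.0 if symbol == e else 0.0 for e in ELEMENTS]


def neighbours(n_atoms, bonds):
    """Build an adjacency list from the bond list."""
    adj = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def message_pass(features, adj):
    """One round: each atom adds its neighbours' vectors to its own."""
    updated = []
    for i, own in enumerate(features):
        total = list(own)
        for j in adj[i]:
            total = [t + f for t, f in zip(total, features[j])]
        updated.append(total)
    return updated


def readout(features):
    """Sum the atom vectors into a single molecule-level vector."""
    return [sum(col) for col in zip(*features)]


features = [one_hot(a) for a in atoms]
adj = neighbours(len(atoms), bonds)
hidden = message_pass(features, adj)
molecule_vector = readout(hidden)
```

After one round, each atom's vector already reflects its immediate chemical neighbourhood, which is the kind of information that hand-computed physico-chemical descriptors could lose.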
A dataset of approximately 5,000 molecules
The model was developed by scientists from Osmo and from the Philadelphia-based Monell Chemical Senses Center, which specialises in research on taste, smell, and mucous membrane sensitivity. Its starting point was an industry dataset that included molecular structures and descriptive terms for approximately 5,000 known odorants. Of these, 80% were used for training purposes, with the remaining 20% reserved for subsequent testing to see if the model would correctly link molecules to their descriptions. Thereafter, the model’s performance was compared to the results obtained by a panel of human participants.
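The 80/20 division described above can be sketched as follows. This is an illustration only: the article does not say how the split was made, so the random shuffle, the fixed seed, and the placeholder identifiers `mol_0` … `mol_4999` are assumptions.

```python
import random


def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of the items and split it into train and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical identifiers standing in for roughly 5,000 odorants.
molecule_ids = [f"mol_{i}" for i in range(5000)]
train_ids, test_ids = split_dataset(molecule_ids)
```

Keeping the test molecules entirely out of training is what allows the later comparison to measure generalisation and not memorisation.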
“We trained a cohort of subjects to describe their perception of odorants using the rate-all-that-apply method (RATA) and a 55-word odour lexicon,” explain the researchers in their article. “During training sessions, each term in the lexicon was paired with visual and odour references.” The cohort had already been vetted in a pre-test phase to ensure that the 15 subjects finally selected were able to detect 20 common odorants.
Once they had mastered the lexicon, which included terms like ‘tropical’, ‘animal’, ‘acidic’, ‘spicy’, ‘mint’, ‘green’, ‘sulphurous’, and ‘roasted’, the 15 subjects were given 400 odorants for which they had to provide descriptions and intensity ratings on a scale from 1 to 5. The AI model was evaluated on the same odorants, none of which it had encountered during training, to see how well it predicted the correct descriptors for each molecule (in the end, 323 of these tests were retained for analysis).
Potential to revolutionise the classification of odorants
Predictions generated by the model were closer to the mean provided by the panel of human testers than those provided by any individual human participant. The team also compared the model’s performance to responses from the median panellist in each test and found that it more accurately represented the group average 53% of the time. For the research team, these promising findings will now pave the way for new approaches to neuroscientific and biochemical research on olfaction, which may ultimately revolutionise frameworks for the classification of odours and our understanding of the brain functions required to recognise them.
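A comparison against the median panellist could be computed along the following lines. This is a hedged sketch, not the procedure from the Science article: the use of Pearson correlation, the choice to score each panellist against the mean of the other panellists, and all the ratings shown are assumptions made for illustration (0 here means a descriptor was not selected).

```python
from math import sqrt
from statistics import mean, median


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def column_means(rows):
    """Average the ratings descriptor by descriptor."""
    return [mean(col) for col in zip(*rows)]


def model_beats_median_panellist(model, panel):
    """True if the model tracks the panel mean better than the median human."""
    model_r = pearson(model, column_means(panel))
    human_rs = []
    for i, ratings in enumerate(panel):
        others = panel[:i] + panel[i + 1:]   # leave this panellist out
        human_rs.append(pearson(ratings, column_means(others)))
    return model_r > median(human_rs)


def win_rate(cases):
    """Fraction of odorants on which the model beats the median panellist."""
    wins = sum(model_beats_median_panellist(m, p) for m, p in cases)
    return wins / len(cases)


# Hypothetical ratings for two odorants and three descriptors each.
cases = [
    ([3, 3, 0.5], [[5, 0, 0], [0, 5, 0], [3, 3, 0]]),  # panel disagrees
    ([0, 1, 4], [[4, 1, 0], [5, 1, 0], [4, 2, 0]]),    # model is off
]
rate = win_rate(cases)
```

With this made-up data the model wins on the first odorant and loses on the second, so `rate` is 0.5. The 53% figure reported in the article corresponds to this kind of per-odorant win rate over the retained tests.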