• A team of scientists from the University of Birmingham is pioneering an innovative approach involving explainable machine learning to study these complex dynamics.
• Their research, which highlights hitherto unforeseen environmental impacts, will pave the way for better management of complex ecosystems.
Freshwater ecosystems, and lakes in particular, are under increasing pressure from human activity, including land-use change and the use of chemical pollutants. “These disturbances have a severe impact on aquatic biodiversity. However, pinpointing the precise mechanisms underlying the changes they cause remains a complex challenge,” explains Dr Niamh Eastwood, a researcher at the Centre for Environmental Research and Justice (CERJ) at the University of Birmingham in the UK. To investigate these dynamics, Eastwood and a team of researchers adopted an innovative machine learning approach that examined environmental DNA (eDNA) in samples collected by the Environment Agency of England. The goal of their study was to identify and monitor drivers of biodiversity, with a view to developing a model that can predict future impacts and inform policymakers.
A novel approach combining eDNA and AI
Traditionally, aquatic biodiversity monitoring has relied on direct observation of samples under a microscope, which means that only species that leave visible traces can be detected. Analysing eDNA makes it possible to quantify not just the presence or absence of species, but also the relative abundance of entire communities that were beyond the scope of previous studies. “This approach offers a more complete and objective view of biodiversity, identifying thousands of species from water and biofilm samples,” points out Luisa Orsini, professor of evolutionary systems biology. However, analysing such large quantities of data requires powerful new tools.
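To make the distinction between presence or absence and relative abundance concrete, here is a minimal Python sketch. It assumes that eDNA sequencing yields a read count per taxon in each sample; the taxon names, the counts, and the function names `relative_abundance` and `presence_absence` are made up for illustration and do not come from the study.

```python
def relative_abundance(read_counts):
    """Convert per-taxon read counts into fractions of the sample total."""
    total = sum(read_counts.values())
    if total == 0:
        # An empty sample has no meaningful proportions.
        return {taxon: 0.0 for taxon in read_counts}
    return {taxon: count / total for taxon, count in read_counts.items()}


def presence_absence(read_counts, min_reads=1):
    """Flag a taxon as present when it reaches a minimum number of reads."""
    return {taxon: count >= min_reads for taxon, count in read_counts.items()}


# Hypothetical read counts for one water sample (illustrative values only).
sample = {"Daphnia": 120, "Chironomidae": 60, "Cyanobacteria": 20, "Rotifera": 0}

abundance = relative_abundance(sample)
presence = presence_absence(sample)

for taxon in sample:
    print(f"{taxon}: present={presence[taxon]}, share={abundance[taxon]:.2f}")
```

Presence or absence only records whether a taxon was detected, whereas the relative abundance also shows how much of the sample it accounts for.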
In collaboration with Jiarui Zhou, an assistant professor in environmental bioinformatics at the Centre for Environmental Research and Justice of the University of Birmingham, the team developed a multimodal machine learning pipeline that can integrate different data types, such as biological, chemical, and physical parameters. “Unlike conventional AI models, which are often described as black boxes, our model is designed to be explainable, so as to enable researchers to understand the real drivers of biodiversity,” explains Zhou.
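The idea of combining several data types and then asking which factors matter most can be illustrated with a deliberately simple Python sketch. This is not the team's pipeline: it swaps in a plain Pearson correlation as the most basic kind of transparent ranking, and the lake names, variable names, values, and helper functions (`merge_modalities`, `rank_drivers`) are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant variable carries no signal
    return cov / math.sqrt(var_x * var_y)


def merge_modalities(*tables):
    """Join per-lake records from several data types into one feature table."""
    merged = {}
    for table in tables:
        for lake, features in table.items():
            merged.setdefault(lake, {}).update(features)
    return merged


def rank_drivers(features, richness):
    """Rank variables by the strength of their correlation with richness."""
    lakes = sorted(richness)
    names = sorted({name for lake in lakes for name in features[lake]})
    ys = [richness[lake] for lake in lakes]
    scores = {}
    for name in names:
        xs = [features[lake][name] for lake in lakes]
        scores[name] = pearson(xs, ys)
    # Strongest association first, regardless of sign.
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical measurements for four lakes (illustrative values only).
chemical = {
    "lake_a": {"insecticide_ug_l": 0.1},
    "lake_b": {"insecticide_ug_l": 0.4},
    "lake_c": {"insecticide_ug_l": 0.9},
    "lake_d": {"insecticide_ug_l": 0.2},
}
physical = {
    "lake_a": {"temperature_c": 12.0},
    "lake_b": {"temperature_c": 14.0},
    "lake_c": {"temperature_c": 13.0},
    "lake_d": {"temperature_c": 12.5},
}
# Hypothetical species richness per lake, as might be derived from eDNA.
richness = {"lake_a": 310, "lake_b": 240, "lake_c": 150, "lake_d": 290}

features = merge_modalities(chemical, physical)
ranking = rank_drivers(features, richness)

for name, score in ranking:
    print(f"{name}: r = {score:+.2f}")
```

A ranking of this kind is easy to read, since each score relates one named variable to the outcome, but it captures only linear, one-variable-at-a-time associations and says nothing about causation. An explainable machine learning model of the kind described here aims to keep that readability while handling far more variables.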
Multimodal AI approach highlights unforeseen consequences
“The study identified 43 environmental factors associated with declining biodiversity in lakes. Among the most striking discoveries were the unintended impacts of plant protection products, such as insecticides and fungicides,” points out Niamh Eastwood. Although they are designed to target specific pests, these substances also affect non-target biological groups, with unforeseen consequences. Luisa Orsini explains: “For example, some chemicals, now banned in the UK and the EU, have shown persistent negative impacts on aquatic communities, underlining the need for stricter regulations and better risk assessments.” Another remarkable aspect of this research is that it is confirmed by historical observation: the researchers found correlations between data provided by their model and political decisions that have already been taken on the basis of previous environmental sampling. For example, certain chemical products identified as harmful in the study have already been outlawed, which validates the accuracy of the data-driven approach adopted by the University of Birmingham team.
Predicting future impacts
One of the long-term goals of this research is to develop digital twins of lake ecosystems with the capacity to model the introduction of new pollutants and changes in land use and predict their impact on biodiversity. Forecasts of this kind should enable regulators to intervene to prevent irreversible damage before it occurs. For example, if they are provided in advance with information on the effects of a novel pesticide, they will be much better able to take action to limit its use and minimize associated environmental risks.
Although their study focused entirely on lakes, the researchers emphasize that their approach can readily be adapted to model other ecosystems such as oceans and forests. The key to its success lies in the collection of uniform, high-quality data, as well as effective collaboration between biologists, computer scientists and policymakers. “By integrating data from different regions and time periods, it will be possible to build even more robust and generalisable models,” concludes Niamh Eastwood.


