• There are many tools to help consumers distinguish between healthy and unhealthy foods, but to date they have not been good at identifying foods that are heavily processed.
• Now a research team has presented a new algorithm trained on nutritional databases, FoodProX, which can automatically determine the level of processing in any food product.
Obesity, type 2 diabetes, high blood pressure, heart disease, cancer: the adverse effects of a poor diet on health are well known and very often linked to a high intake of ultra-processed foods. However, efforts to educate the public and decision makers about these products do not always succeed in providing clear information. Since 2011, the US Department of Agriculture’s Center for Nutrition Policy and Promotion has published its MyPlate guide, while the government in Brazil has focused on the Nova system developed by the University of São Paulo, which classifies foods into four categories depending on their level of processing. Although it is well respected, the widespread adoption of the Nova system has so far been hampered by its reliance on manual labelling of products. Now, in a bid to address this disadvantage, a group of American physics and medical researchers has developed a machine learning solution, FoodProX, which they recently presented in the journal Nature Communications.
Four food categories
FoodProX is based on the same classification as Nova, which the American researchers argue does not include sufficient indexed data. The Nova system divides food into four categories (Nova 1 to 4): natural or minimally processed (cut, ground, dried, etc.) foods; processed culinary ingredients (oil, salt, lump sugar, butter); processed foods, meaning products that combine foods from the first two categories and products that have been processed to extend their shelf life (canned foods, cheese, fruit in syrup); and finally, ultra-processed foods (cakes, industrial bread, sauces, pizzas, breakfast cereals, snacks, hot dogs, etc.). As the research paper explains, the last of these are “industrial formulations typically of five or more ingredients including substances not commonly used in culinary preparations, such as additives whose purpose is to imitate sensory qualities of fresh food.” For its part, FoodProX has been trained on nutritional measures for 2484 foods listed in the Food and Nutrient Database for Dietary Studies, a repository maintained by the US Department of Agriculture, which compiles public survey data on Americans’ diets. Each of the foods in the database is described by between 65 and 102 nutrients.
A probability rating of 0 to 1 for each food category
Based on its training, FoodProX can analyse the list of nutrients for any food product and predict its level of processing. More precisely, the classifier generates a probability between 0 and 1 for each of the four food categories in the Nova system, then assigns the product to the category with the highest probability. For example, FoodProX predicts that raw onion has a 0.97 probability of belonging to the first “natural foods” category, and a 0.01 to 0.03 probability of belonging to each of the others. Thus, raw onion is deemed to belong to the first category, while battered and fried onion rings, which receive a rating of 0.99 for the fourth category, are deemed to be ultra-processed.
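The final assignment step described above amounts to picking the category with the highest probability. The short Python sketch below illustrates only that step, not the trained model itself. The function name `assign_nova_class` and the dictionary layout are assumptions made for illustration; the 0.97 and 0.99 values come from the article, while the remaining probabilities are hypothetical fillers chosen so that each set sums to 1.

```python
from typing import Dict, Tuple

# Short labels for the four Nova categories described in the article.
NOVA_LABELS = {
    1: "natural or minimally processed",
    2: "processed culinary ingredient",
    3: "processed",
    4: "ultra-processed",
}


def assign_nova_class(probabilities: Dict[int, float]) -> Tuple[int, str]:
    """Return the Nova category with the highest probability.

    `probabilities` maps each Nova category (1 to 4) to a value between
    0 and 1. This helper is hypothetical and is not part of FoodProX.
    """
    if set(probabilities) != set(NOVA_LABELS):
        raise ValueError("expected one probability for each Nova category 1-4")
    best = max(probabilities, key=probabilities.get)
    return best, NOVA_LABELS[best]


# 0.97 is the figure quoted for raw onion; the other values are illustrative.
raw_onion = {1: 0.97, 2: 0.01, 3: 0.01, 4: 0.01}

# 0.99 is the figure quoted for onion rings; the other values are illustrative.
onion_rings = {1: 0.0, 2: 0.0, 3: 0.01, 4: 0.99}

for name, probs in [("raw onion", raw_onion), ("onion rings", onion_rings)]:
    category, label = assign_nova_class(probs)
    print(f"{name}: Nova {category} ({label})")
```

Keeping the four probabilities alongside the final label can be useful, since a product whose top score is only slightly above the runner-up is a less clear-cut case than raw onion or onion rings.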
In the later stages of its development, FoodProX was further tested on the nutritional composition of 100-gram portions of foods listed in another database, Branded Food Products, before being tuned once again on a Food and Drug Administration corpus, which details the nutrients in 12 grams of each food product. Thereafter, FoodProX was used to classify all of the food products that had yet to be included in the Nova system. It found that over 73.35% of food eaten by Americans is ultra-processed and less than 20% is made up of natural or minimally processed products.
A tool for consumers and decision makers
“First, as an individual you can use this tool to identify foods you already eat that are highly processed and search for less processed alternatives,” explains Albert-László Barabási, a co-author of the study and a data science specialist at the Central European University in Budapest (Hungary) and Harvard Medical School. With this in mind, the research team has created the TrueFood website to enable consumers to check the status of a large number of food products sold in the US, which are listed by brand and by retailer. On a more political level, TrueFood can also serve as a decision-making tool for public authorities and organizations responsible for public health. “On the level of government, these tools can be used to identify heavily processed foods, and to facilitate action to reduce their consumption using a wider range of instruments, from educational campaigns to tax policy.”