Grigory Antipov embarked on his studies in his native Russia. At Moscow University he first studied applied mathematics and computer science, before looking for information about fields that might interest industry. That was in 2012. Back then, there was a lot of talk about data mining and machine learning, two emerging fields that looked extremely promising. Grigory specialised with a research Master’s, and it was after those two years, spent in France and Spain, that the future PhD student, determined to pursue a career in research, came across Orange. “When I completed my Master’s in late 2014,” says Grigory, “the research world was about to be submerged in a huge wave of deep learning, a set of machine learning techniques that was revolutionising machine analysis of images and sounds. Orange suggested a dissertation topic in this field, on a particularly interesting subject, as part of a team of experts with big reputations in the field. That was just the opportunity I needed.”
Faces and deep learning
Grigory Antipov has been working as a researcher at Orange for almost three years now. His dissertation (supervised by Orange’s research engineer Moez Baccouche and Prof. Jean-Luc Dugelay of the Eurecom engineering school) focuses on two problems: the semantic analysis of images to recognise the gender and age of an individual from their photo, and how to age a face in a photo or make it look younger, which among other things allows the machine to recognise the same person from two different images. “Existing face recognition engines find it hard to tell if two faces are those of the same person, especially when there is a big age difference between the two images. The ageing of photographs has existed for many years, but until now the algorithms have simply applied a few filters to an initial photo. In our research, the big challenge is to generate a photo from scratch.” The first results were conclusive. The team won an apparent-age recognition challenge launched by ChaLearn, a non-profit that organises international machine learning competitions. Grigory’s research has even been cited in the prestigious MIT Technology Review, a popular science publication from the Massachusetts Institute of Technology.
Two competing and allied networks
Grigory and his supervisors work with a library of thousands of “natural” images, mainly of celebrities. The first stage involves training the machines using an approach based on Generative Adversarial Networks (GANs). A GAN is a pair of neural networks: in this case a “generator” and a “discriminator”. On one side, the “generator” network tries to draw a random human face corresponding to a requested age. On the other side, the “discriminator” network is shown a mix of natural images and synthetic images created by the generator, and tries to tell them apart. The two networks learn at the same time, but with opposing objectives: the generator tries to fool the discriminator by drawing increasingly realistic faces corresponding to the ages requested, whereas the discriminator learns to spot the differences between the natural and synthetic images of the given age, thereby making the task harder for the generator network. The two networks are therefore competing (hence the name “adversarial”), but neither can make progress without the other.
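The opposing objectives described above can be written down in a few lines. What follows is only a minimal sketch, assuming the standard binary cross-entropy formulation of GAN training with the common "non-saturating" generator loss; the article does not say which exact losses the team used, and the function names are hypothetical. Each score is the discriminator's estimated probability that a face shown for a given age is a natural image.

```python
import math

EPS = 1e-12  # avoids log(0) when a score is exactly 0 or 1


def discriminator_loss(scores_on_natural, scores_on_synthetic):
    """Low when natural faces score near 1 and synthetic faces score near 0."""
    natural_term = -sum(math.log(p + EPS) for p in scores_on_natural)
    synthetic_term = -sum(math.log(1.0 - p + EPS) for p in scores_on_synthetic)
    return (natural_term / len(scores_on_natural)
            + synthetic_term / len(scores_on_synthetic))


def generator_loss(scores_on_synthetic):
    """Low when the discriminator mistakes synthetic faces for natural ones."""
    total = -sum(math.log(p + EPS) for p in scores_on_synthetic)
    return total / len(scores_on_synthetic)
```

In training, the two losses would be minimised in alternation: whenever the generator manages to push the scores of its synthetic faces upwards, its own loss falls and the discriminator's rises, which is the competition the paragraph above describes.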
When the learning is complete, the generator network is able to create random human faces corresponding to the ages requested. It can, therefore, make a given person younger or older, or change their gender. To do this, the generator network is given the person’s description using a specific type of encoding. This encoding contains the person’s key information (shape of the face, of the nose, etc.). In his dissertation, Grigory suggests an original method for finding that specific encoding based only on the target person’s photo. As soon as the encoding is identified, it is easy to generate photos of the person at all ages.
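To make the idea of "finding the encoding from a photo" concrete, here is a toy sketch. It is not Grigory's method, which the article does not detail: it assumes a simple reconstruction-error search by gradient descent, and it replaces the trained generator with a hypothetical linear function so the example stays self-contained. All names and numbers (`W`, `AGE_DIRECTION`, `generate`, `find_encoding`) are illustrative assumptions.

```python
# Hypothetical stand-in for a trained, age-conditioned generator:
# a 2-number encoding is mapped to 3 "pixels", and age shifts them.
W = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
AGE_DIRECTION = [0.1, -0.2, 0.05]


def generate(encoding, age):
    """Draw the toy 'face' for a given identity encoding and age."""
    return [sum(w * e for w, e in zip(row, encoding)) + age * a
            for row, a in zip(W, AGE_DIRECTION)]


def find_encoding(photo, age, steps=500, learning_rate=0.1):
    """Search for the encoding whose generated face best matches the photo."""
    encoding = [0.0, 0.0]
    for _ in range(steps):
        residual = [g - p for g, p in zip(generate(encoding, age), photo)]
        # Gradient of 0.5 * squared error with respect to the encoding.
        gradient = [sum(W[i][j] * residual[i] for i in range(len(W)))
                    for j in range(len(encoding))]
        encoding = [e - learning_rate * g for e, g in zip(encoding, gradient)]
    return encoding


# Usage: recover the encoding from one photo, then re-draw at another age.
photo = generate([0.7, -0.3], age=2.0)
encoding = find_encoding(photo, age=2.0)
older_face = generate(encoding, age=6.0)
```

The point of the sketch is the two-step structure: the identity encoding is recovered once from a single photo, and only the requested age changes afterwards.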
Any real-world applications?
This research field interests academics and industrialists alike. When it comes to consumer applications, the technology will facilitate parental controls, for example, or help to offer automatic photo organisation systems. As a result, it will be possible to find all the photos of a relative in a flash, from childhood snaps right up to adulthood. Looking further ahead, the ability to age photos will also be interesting in the case of child kidnappings, to create a photo-fit of what the child might look like several years after their disappearance. For advertisers, age recognition is a major challenge, since it will enable them to qualify their audience and adjust broadcast content accordingly. The technology could also be used for advertising screens in public places. “It’s worth noting that the attraction of the system is not to identify people, who remain anonymous, but to obtain information about their age in order to personalise services. More generally, age recognition will also help to perform statistical estimates of a crowd to gather demographic data.”
Faster than research
The daily routine of a PhD student like Grigory comprises days of tests, in-depth reading, and attending conferences. There’s no shortage of exciting new developments in a very fast-moving field crowded with laboratories: “Scientific competition has really gathered pace, and these days, the traditional pace of research is no longer suited to rapid technological change. In the deep learning field, new articles are being published every week. By the time an article has been peer-reviewed and approved by conference committees, new data need to be factored in, which sometimes completely changes the picture. So it’s become essential to circulate pre-publication material on the web. And conferences are no longer places where you discover what your peers are up to, but somewhere to go when you’re already well-informed, to discuss, share, and move your work forward of course, but above all to advance research!”
That’s because, in the scramble for knowledge, researchers the world over also make progress by testing their ideas on each other in a delicate balance between competition and cooperation – the same sort, perhaps, as you see in two adversarial neural networks?