Speed, volume, pitch, tone… Empath analyses the physical properties of the voice to identify emotions such as joy, anger or sadness.
With the rise of chatbots and personal assistants like Siri (Apple) or Djingo (Orange), with or without a connected speaker, the voice recognition market has grown rapidly in recent years. Most of these applications focus on linguistic analysis: they parse the lexicon and grammar of the sentences spoken by the user. When it comes to communication, however, words are not everything. Meaning, especially emotional meaning, is also conveyed by the way we pronounce words. You do not have to be a seasoned actor to understand that the same sentence will not convey the same message when whispered hesitantly or shouted loudly.
In the wake of the tsunami
It is this aspect of conversation that the Japanese start-up Empath has chosen to explore. It all began with the earthquake and tsunami that ravaged north-eastern Japan in March 2011. Empath’s Strategy Director, Hazumu Yamazaki, explains: “While working for a group specialising in medical technologies, our founder Takaaki Shimoji discovered that there were many tools for analysing data on the physical condition of the victims of the disaster, but nothing to assess their mental state.” This is how the idea of the Empath project was born, and its first product arrived on the market in 2014.
“Empath is an Emotion AI”, as Hazumu Yamazaki puts it. “By analysing the physical properties of the voice, such as speed, volume, pitch and tone, rather than the language itself, our solution identifies emotions such as joy, anger, calm or sadness in real time.” The potential applications for this technology are many. Its ability to enrich and refine our interactions with robots is of interest to specialists in artificial intelligence (AI), health and rescue services, and commercial call centres alike. The Empath software development kit (SDK) and API have already been adopted by more than 700 customers in 50 countries. And that’s just the start!
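To give a sense of what analysing “physical properties of the voice” can mean in practice, here is a minimal, purely illustrative sketch. It does not use the Empath SDK or API (whose interface is not described in this article); instead it relies on the open-source librosa library to extract coarse prosodic features of the kind mentioned above (volume, pitch, speaking rate) and maps them to an emotion label with deliberately naive, made-up thresholds. The file name, function names and threshold values are all hypothetical.

```python
# Illustrative sketch only: NOT the Empath SDK/API. It extracts coarse prosodic
# features (volume, pitch, speaking rate) with librosa and applies a toy,
# rule-based mapping to an emotion label. Thresholds are arbitrary placeholders.
import numpy as np
import librosa

def acoustic_features(path: str) -> dict:
    """Extract coarse prosodic features from a mono audio file."""
    y, sr = librosa.load(path, sr=16000)

    # Volume: root-mean-square energy per frame, averaged over the clip.
    rms = librosa.feature.rms(y=y)[0]

    # Pitch: fundamental frequency estimated with the pYIN algorithm
    # (NaN for unvoiced frames, hence nanmean below).
    f0, _, _ = librosa.pyin(y, fmin=65.0, fmax=400.0, sr=sr)

    # Speed: a rough proxy for speaking rate, onsets per second.
    onsets = librosa.onset.onset_detect(y=y, sr=sr, units="time")
    duration = len(y) / sr
    return {
        "mean_rms": float(np.mean(rms)),
        "mean_f0": float(np.nanmean(f0)) if np.any(~np.isnan(f0)) else 0.0,
        "onset_rate": len(onsets) / duration if duration > 0 else 0.0,
    }

def toy_emotion_label(feats: dict) -> str:
    """Placeholder rules; a real system would train a classifier on labelled speech."""
    if feats["mean_rms"] > 0.1 and feats["onset_rate"] > 3.0:
        return "anger" if feats["mean_f0"] > 220 else "joy"
    if feats["mean_rms"] < 0.02:
        return "sadness"
    return "calm"

# Hypothetical usage:
# print(toy_emotion_label(acoustic_features("sample.wav")))
```

A production system such as Empath’s would of course replace the hand-written rules with a model trained on large amounts of labelled speech, but the pipeline shape (audio in, acoustic features out, emotion label predicted) is the same idea the quote describes.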
Enabling more nuanced behaviour in robots
Today, the start-up is looking to move up a gear, both technologically and commercially. Empath recently joined Orange Fab Asia, an acceleration programme that will ease its entry into the European market, and France in particular. Hazumu Yamazaki sees the collaboration with Orange expanding in the future: “Voice assistants like Amazon’s Alexa, Microsoft’s Cortana and Google Assistant dominate the market, while a new form of e-commerce, ‘voice commerce’, is about to explode in the United States. I think that Orange’s voice assistant, Djingo, could differentiate itself substantially by incorporating Empath. Understanding the user’s emotions would allow Djingo to behave in a more human-friendly way. Furthermore, our emotional AI could also contribute to the development of voice commerce: in telemarketing, Empath has already demonstrated its ability to increase sales by up to 20%.”
Empath continues to develop its software by focusing on machine learning. “Our strength”, concludes Hazumu Yamazaki, “is that we have the largest user portfolio in the field of emotion recognition from speech. If we are to grow internationally, we must also build a global R&D team to advance our Emotional AI. For this, our small team of six needs to expand with experts in AI and affective computing.”