Khiops: Orange’s machine learning solution for identifying SoHos, small informal companies in Africa and the Middle East
It is difficult for operators to offer micro-enterprises services tailored to their needs. Yet the requirements of these unregistered businesses and their use of digital technologies are distinctive and often intensive, which makes them worth targeting with dedicated offers and services.
“Orange sees the informal sector as a promising area of growth, given the strong entrepreneurial spirit in emerging markets. This sector is mostly made up of very small companies that naturally rely on consumer products. This project is part of the search for a win-win formula for the operator and SoHo entrepreneurs through classic telephone service provision, which would lay the groundwork for services such as security, cloud access and methods of payment, etc.”, says Ismaïl Rebai, the Head of Analytics at eLob.
To tackle the problem of identifying these informal companies, a machine learning solution was developed by Orange subsidiaries in Africa and the Middle East in cooperation with the “Emerg Data” research project and the “SoHo In Retail Acquisition” (SIRA) programme. Its name: Khiops.
In marketing terms, these small companies, also known as SoHos (small offices, home offices), are a specific target. This is a relatively overlooked segment in need of targeted offers, as a large number of customers are individual entrepreneurs or telecommuters. The basic idea behind Khiops is to gather usage data on these companies in order to draw up their behaviour profile and incorporate these potential B2B customers into our existing client base. This would lay the groundwork for an active campaign to identify and target such companies.
The operations management approach to identifying prospective B2B customers comprises several stages, each involving various stakeholders:
- First, the Business Intelligence (BI) and Information Technology (IT; DSI in French) teams of Orange’s subsidiaries work with the marketing team to select the relevant data:
  - choosing a number of standard (non-professional) and known professional customers from within the client base;
  - collecting usage data.
- The data science team conducts an analysis to create a model for identifying professional customers.
- The model can then be used to attribute scores to the entire customer base so as to identify customers who behave like professionals.
- The data science team sends the marketing team an interpretation of the model, together with the list of high-scoring customers (i.e. the customers who are most likely to be professionals).
- The marketing team can then decide whether the model should be used, in which case it designs a campaign with a specific script for determining whether the potential customer is indeed a professional and what offer should be made.
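The stages above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Khiops itself: it swaps in a tiny naive Bayes scorer that binarises each feature at the median of the labelled sample. The feature names (`calls_per_day`, `distinct_contacts`), the customer identifiers and all values are invented for the example.

```python
import math
from statistics import median

def fit(samples):
    """Train on labelled customers: a list of (features, is_professional) pairs."""
    names = sorted(samples[0][0])
    # Binarise each feature at the median of the labelled sample.
    cuts = {n: median(f[n] for f, _ in samples) for n in names}
    n_pro = sum(1 for _, y in samples if y)
    n_std = len(samples) - n_pro
    # Smoothed prior log-odds of "professional" versus "standard".
    model = {"cuts": cuts, "prior": math.log((n_pro + 1) / (n_std + 1)), "llr": {}}
    for n in names:
        for high in (False, True):
            pro = sum(1 for f, y in samples if y and (f[n] > cuts[n]) == high)
            std = sum(1 for f, y in samples if not y and (f[n] > cuts[n]) == high)
            # Laplace-smoothed log-likelihood ratio for this feature value.
            model["llr"][(n, high)] = (
                math.log((pro + 1) / (n_pro + 2)) - math.log((std + 1) / (n_std + 2))
            )
    return model

def score(model, features):
    """Return a probability-like score in (0, 1); higher means more professional-like."""
    z = model["prior"] + sum(
        model["llr"][(n, features[n] > cut)] for n, cut in model["cuts"].items()
    )
    return 1 / (1 + math.exp(-z))

# Stage 1: labelled sample selected by BI, IT and marketing (hypothetical values).
labelled = [
    ({"calls_per_day": 40, "distinct_contacts": 120}, True),
    ({"calls_per_day": 35, "distinct_contacts": 90}, True),
    ({"calls_per_day": 5, "distinct_contacts": 15}, False),
    ({"calls_per_day": 8, "distinct_contacts": 20}, False),
]
# The entire customer base to be scored (hypothetical values).
customer_base = {
    "cust_A": {"calls_per_day": 38, "distinct_contacts": 100},
    "cust_B": {"calls_per_day": 4, "distinct_contacts": 10},
    "cust_C": {"calls_per_day": 30, "distinct_contacts": 18},
}

model = fit(labelled)                                                # stage 2: build the model
scores = {cid: score(model, f) for cid, f in customer_base.items()}  # stage 3: score everyone
shortlist = sorted(scores, key=scores.get, reverse=True)[:2]         # stage 4: hand over top scorers
print(shortlist)
```

The shortlist is what the marketing team would receive alongside the model interpretation; here the per-feature log-likelihood ratios in `model["llr"]` play that interpretive role.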
The success of this process depends on several factors: the quality of the input data (relevance of the samples, availability of detailed usage data and availability of a large enough group of SoHos), the performance of the learning algorithm and the design of the campaign script. To evaluate the success of the entire process, three indicators are measured: the positive detection rate (the share of prospective customers who are actually professionals), the proportion of successful calls and the increase in sales.
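As a rough illustration, the three indicators could be computed from campaign call logs as follows. The record fields (`confirmed_pro`, `offer_accepted`) and the precise definitions are assumptions made for this sketch: a call counts as successful when the offer is accepted, and the sales increase is taken relative to a pre-campaign baseline.

```python
def campaign_metrics(calls, sales_before, sales_after):
    """Compute the three campaign indicators.

    calls: list of dicts with boolean keys 'confirmed_pro' and 'offer_accepted'
           (hypothetical field names), one dict per contacted prospect.
    """
    if not calls or sales_before == 0:
        raise ValueError("need at least one call and a non-zero sales baseline")
    contacted = len(calls)
    confirmed = sum(c["confirmed_pro"] for c in calls)
    accepted = sum(c["offer_accepted"] for c in calls)
    return {
        # Share of contacted prospects who turned out to be professionals.
        "positive_detection_rate": confirmed / contacted,
        # Share of calls that ended with an accepted offer (assumed definition).
        "successful_call_rate": accepted / contacted,
        # Relative change in sales against the pre-campaign baseline.
        "sales_increase": (sales_after - sales_before) / sales_before,
    }

# Invented example: four prospects called, three confirmed, one sale.
calls = [
    {"confirmed_pro": True, "offer_accepted": True},
    {"confirmed_pro": True, "offer_accepted": False},
    {"confirmed_pro": True, "offer_accepted": False},
    {"confirmed_pro": False, "offer_accepted": False},
]
metrics = campaign_metrics(calls, sales_before=200.0, sales_after=230.0)
print(metrics)
```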
Telecommunications data and data from Orange Money transactions are used in the data mining phase to identify prospective B2B customers within the customer base. Detailed call records, or call reports, log the customer’s activity in terms of voice calls, SMS messages, data sessions and top-ups of pay-as-you-go accounts. Raw Orange Money data includes detailed transactions of all types (cash-in, cash-out, peer-to-peer transactions, merchant payments, etc.).
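One way to picture this input is as raw records grouped per customer, without aggregating them first. The sketch below is only an assumed layout: the class names, field names and units are hypothetical and do not reflect Orange's actual schemas.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class CallRecord:
    customer_id: str
    kind: str      # "voice", "sms", "data" or "top_up"
    amount: float  # seconds, messages, megabytes or currency, depending on kind

@dataclass
class MoneyTransaction:
    customer_id: str
    kind: str      # "cash_in", "cash_out", "p2p" or "merchant_payment"
    amount: float

@dataclass
class CustomerUsage:
    calls: list = field(default_factory=list)
    transactions: list = field(default_factory=list)

def group_by_customer(call_records, transactions):
    """Attach every raw record to its customer, keeping the detail intact."""
    usage = defaultdict(CustomerUsage)
    for record in call_records:
        usage[record.customer_id].calls.append(record)
    for tx in transactions:
        usage[tx.customer_id].transactions.append(tx)
    return dict(usage)

# Invented records for two customers.
call_records = [
    CallRecord("cust_A", "voice", 180.0),
    CallRecord("cust_A", "sms", 1.0),
    CallRecord("cust_B", "top_up", 500.0),
]
transactions = [
    MoneyTransaction("cust_A", "merchant_payment", 2500.0),
]
usage = group_by_customer(call_records, transactions)
print(len(usage["cust_A"].calls), len(usage["cust_A"].transactions))
```

Keeping the records in this one-to-many form, rather than collapsing them into monthly totals, leaves the choice of useful summaries to the learning step.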
“This means that the solution is capable of generating data sets quickly. Each country can supply its own particular data set: detailed monthly data, or rather aggregated data, on the use of telecommunications and banking services”, explains Romain Trinquart, Head of the Emerg Data research project.
The goal is to produce two types of results for each country. The first provides a measure of performance and an outline of the main discriminating variables. The second assigns scores to all customers in the database, once the results have been explained to the data mining and/or marketing team in the country.
Finally, the teams at Orange Research plan to expand the use of scores from individuals already in the customer base to those appearing as contacts, either in call reports or in electronic payment transactions, to make marketing campaigns more effective. As part of this programme, the Africa and Middle East marketing departments and eLob (Enterprise Line of Business) will continue their efforts to spread this approach to other countries.
Khiops’ four strong points
- The solution requires no parameter tuning and delivers strong predictive performance while remaining robust to noisy data.
- It has the capacity to process detailed raw data, such as call reports, rather than aggregate values.
- It allows the model to be interpreted and amended.
- It handles large sets of data with ease.