Khiops: the solution for identifying SoHos

How can we reach small companies without a formal status? SoHos (Small offices, Home offices) are companies that are not registered anywhere, so there is no formal channel for reaching them with targeted offers. Orange has developed a machine learning approach to identify them sooner and more effectively.

Khiops: Orange’s machine learning solution for identifying SoHos, small informal companies in Africa and the Middle East

It is difficult for operators to offer micro-enterprises services tailored to their needs. Yet these unregistered businesses have specific requirements and use digital technologies in distinctive and often intensive ways, which makes them worth targeting with dedicated offers and services.

“Orange sees the informal sector as a promising area of growth, given the strong entrepreneurial spirit in emerging markets. This sector is mostly made up of very small companies that naturally rely on consumer products. This project is part of the search for a win-win formula for the operator and SoHo entrepreneurs through classic telephone service provision, which would lay the groundwork for services such as security, cloud access and methods of payment, etc.”, says Ismaïl Rebai, the Head of Analytics at eLob.

To tackle the problem of identifying these informal companies, a machine learning solution was developed by Orange subsidiaries in Africa and the Middle East in cooperation with the “Emerg Data” research project and the “SoHo In Retail Acquisition” (SIRA) programme. Its name: Khiops.

Identifying SoHos

In marketing terms, these small companies, also known as SoHos (Small offices, Home offices) are a specific target. This is a relatively overlooked segment in need of targeted offers, as a large number of customers are individual entrepreneurs or telecommuters. The basic idea behind Khiops is to gather usage data on these companies in order to draw up their behaviour profile and incorporate these potential B2B customers into our existing client base. This would lay the groundwork for an active campaign to identify and target such companies.

Global methodology

The operations management approach to identifying prospective B2B customers comprises several stages, each involving various stakeholders:

  • First, the Business Intelligence (BI) and Information Technology (IT, or DSI in French) teams of Orange’s subsidiaries work with the marketing team to select the relevant data:
    • Choosing a sample of standard (non-professional) customers and a sample of known professional customers from within the client base.
    • Collecting usage data for these customers.
  • The data science team conducts an analysis to create a model for identifying professional customers.
  • The model can then be used to attribute scores to the entire customer base so as to identify customers who behave like professionals.
  • The data science team sends the marketing team an interpretation of the model together with the list of high-scoring customers (i.e. the customers who are most likely to be professionals).
  • The marketing team can then decide whether the model should be used, in which case it designs a campaign with a specific script for determining whether the potential customer is indeed a professional and what offer should be made.
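
The train-then-score steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two binary usage flags, the labelled sample and the customer identifiers are invented, and a plain naive Bayes classifier stands in for the learner. It is not Khiops's own algorithm or API.

```python
import math

# Hypothetical labelled sample: binary usage flags -> known status (1 = professional).
LABELLED = [
    ({"daytime_heavy": 1, "many_contacts": 1}, 1),
    ({"daytime_heavy": 1, "many_contacts": 0}, 1),
    ({"daytime_heavy": 0, "many_contacts": 1}, 1),
    ({"daytime_heavy": 0, "many_contacts": 0}, 0),
    ({"daytime_heavy": 0, "many_contacts": 0}, 0),
    ({"daytime_heavy": 1, "many_contacts": 0}, 0),
]


def train(sample):
    """Estimate class priors and per-feature likelihoods (Laplace smoothing)."""
    features = sorted(sample[0][0])
    counts = {0: 0, 1: 0}
    ones = {c: {f: 0 for f in features} for c in (0, 1)}
    for row, label in sample:
        counts[label] += 1
        for f in features:
            ones[label][f] += row[f]
    model = {"features": features, "prior": {}, "p_one": {}}
    for c in (0, 1):
        model["prior"][c] = (counts[c] + 1) / (len(sample) + 2)
        model["p_one"][c] = {
            f: (ones[c][f] + 1) / (counts[c] + 2) for f in features
        }
    return model


def score(model, row):
    """Return the estimated probability that a customer is a professional."""
    log_p = {}
    for c in (0, 1):
        total = math.log(model["prior"][c])
        for f in model["features"]:
            p = model["p_one"][c][f]
            total += math.log(p if row[f] else 1 - p)
        log_p[c] = total
    top = max(log_p.values())  # subtract the max for numerical stability
    weights = {c: math.exp(v - top) for c, v in log_p.items()}
    return weights[1] / (weights[0] + weights[1])


# Hypothetical customer base to be scored with the trained model.
CUSTOMER_BASE = {
    "cust_1": {"daytime_heavy": 1, "many_contacts": 1},
    "cust_2": {"daytime_heavy": 0, "many_contacts": 0},
    "cust_3": {"daytime_heavy": 1, "many_contacts": 0},
}

model = train(LABELLED)
scores = {cid: score(model, row) for cid, row in CUSTOMER_BASE.items()}
# Highest scores first: these are the prospects handed to the marketing team.
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy setting, the ranked list plays the role of the high-scoring customers passed on to marketing, and the per-feature likelihoods in `model["p_one"]` give a rough idea of which variables discriminate between the two groups.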

The success of this process depends on several factors: the quality of the input data (the relevance of the samples, the availability of detailed usage data and the availability of a large enough group of SoHos), the performance of the learning algorithm and the design of the campaign script. To evaluate the success of the entire process, three indicators are measured: the positive detection rate (the share of prospective customers who are actually professionals), the proportion of successful calls and the increase in sales.
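
As a rough illustration, the three indicators could be computed from campaign call outcomes as follows. The field names, the choice of denominators (all contacted prospects) and the figures are assumptions made for this sketch, not the definitions used by the project.

```python
def campaign_metrics(calls, baseline_sales, campaign_sales):
    """Compute three illustrative campaign indicators.

    calls: list of dicts with boolean keys 'is_professional' and
           'accepted_offer' (hypothetical field names).
    baseline_sales / campaign_sales: sales before and during the campaign.
    """
    if not calls:
        raise ValueError("at least one call outcome is required")
    if baseline_sales <= 0:
        raise ValueError("baseline_sales must be positive")
    contacted = len(calls)
    professionals = sum(1 for c in calls if c["is_professional"])
    accepted = sum(1 for c in calls if c["accepted_offer"])
    return {
        # Share of contacted prospects confirmed as professionals.
        "positive_detection_rate": professionals / contacted,
        # Share of calls that ended with an accepted offer.
        "successful_call_rate": accepted / contacted,
        # Relative increase in sales against the baseline.
        "sales_increase": (campaign_sales - baseline_sales) / baseline_sales,
    }


# Invented outcomes for five calls made to high-scoring customers.
CALLS = [
    {"is_professional": True, "accepted_offer": True},
    {"is_professional": True, "accepted_offer": False},
    {"is_professional": True, "accepted_offer": True},
    {"is_professional": False, "accepted_offer": False},
    {"is_professional": False, "accepted_offer": False},
]

metrics = campaign_metrics(CALLS, baseline_sales=200.0, campaign_sales=230.0)
```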

Data mining

In the data mining phase, telecommunications data and data from Orange Money transactions are used to identify prospective B2B customers within the customer base. Detailed call records, also known as call reports, summarise the customer’s activity in terms of voice calls, SMS messages, data sessions and top-ups of pay-as-you-go accounts. Raw Orange Money data includes detailed transactions of all types (cash-in, cash-out, peer-to-peer transactions, merchant payments, etc.).
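
To make the shape of this data concrete, here is a minimal sketch that turns a handful of call records into one usage profile per customer. The record layout, the feature names and the 8:00 to 18:00 "business hours" window are assumptions chosen for illustration; real call reports carry many more fields, and a tool that works on raw records would not need this aggregation step.

```python
from collections import defaultdict

# Hypothetical call-record rows:
# (customer_id, event_type, hour_of_day, duration_seconds)
RECORDS = [
    ("A", "voice", 10, 120),
    ("A", "voice", 14, 300),
    ("A", "sms", 11, 0),
    ("B", "voice", 21, 60),
    ("B", "topup", 20, 0),
]


def empty_profile():
    """Return a fresh per-customer counter set."""
    return {
        "events": 0,
        "voice_calls": 0,
        "voice_seconds": 0,
        "sms": 0,
        "topups": 0,
        "business_hour_events": 0,
    }


def usage_profiles(records):
    """Aggregate raw records into one feature dictionary per customer."""
    profiles = defaultdict(empty_profile)
    for customer, kind, hour, duration in records:
        profile = profiles[customer]
        profile["events"] += 1
        if kind == "voice":
            profile["voice_calls"] += 1
            profile["voice_seconds"] += duration
        elif kind == "sms":
            profile["sms"] += 1
        elif kind == "topup":
            profile["topups"] += 1
        # Assumed business-hours window: 08:00 inclusive to 18:00 exclusive.
        if 8 <= hour < 18:
            profile["business_hour_events"] += 1
    return dict(profiles)


profiles = usage_profiles(RECORDS)
```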

“This means that the solution is capable of generating data sets quickly. Each country can supply its own particular data set: detailed monthly data, or rather aggregated data, on the use of telecommunications and banking services”, explains Romain Trinquart, Head of the Emerg Data research project.

Score assignment

The goal is to produce two types of results for each country. The first provides a measure of the model’s performance and an outline of the main discriminating variables. The second assigns a score to every customer in the database, once the results have been explained to the country’s data mining and/or marketing team.

Finally, the teams at Orange Research plan to expand the use of scores from individuals already in the customer base to those appearing as contacts, either in call reports or in electronic payment transactions, to make marketing campaigns more effective. As part of this programme, the Africa and Middle East marketing departments and eLob (Enterprise Line of Business) will continue their efforts to spread this approach to other countries.

Khiops’ 4 strong points

  • The solution has no parameters to set and produces first-rate results in terms of performance and resistance to data noise.
  • It has the capacity to process detailed raw data, such as call reports, rather than aggregate values.
  • It allows the model to be interpreted and amended.
  • It handles large sets of data with ease.

