NORIA: Network anomaly detection using knowledge graphs

Friday 29th of March 2024 - Updated on Wednesday 3rd of April 2024

• The supervision of network infrastructures is a major challenge for a telecommunications operator of the scale of Orange. Historically, Orange has always conducted cutting-edge research to offer innovative solutions in this highly complex area, continually seeking to improve the reliability, efficiency and security of its networks.
• The NORIA project is part of this ongoing quest for innovation, combining artificial intelligence technologies such as the Semantic Web and deep learning to offer a more intelligent, reactive and predictive vision of network management.
• NORIA is a multi-domain incident correlation approach that reduces the cognitive load of network infrastructure administrators. It produces indicators that facilitate incident resolution.
• Knowledge graphs help to acquire an overview of large-scale network infrastructures made up of heterogeneous resources and services generating data (alarms, technical logs, etc.) in rich and varied formats.

This post gives an overview of a possible future for network infrastructure monitoring tools, namely network management systems (NMS) and security information and event management (SIEM) tools. Boosted by artificial intelligence technologies such as the Semantic Web and neuro-symbolic tools, NORIA (machine learning, Ontology and Reasoning for the Identification of Anomalies) makes it possible to correlate incidents across multiple technical domains (e.g. optical support networks, IP networks and Web services) while reducing the cognitive load of network operators.

Developed over more than three years by Orange research teams in the fields of cybersecurity, networks and AI, this work has led to numerous scientific publications that reflect internationally recognised expertise. NORIA is now undergoing technology transfer, with a product being developed to meet the operational needs of Orange’s in-house network infrastructure operators. Our researchers have also made several open-source contributions, including:

• The NORIA-O ontology: https://github.com/Orange-OpenSource/noria-ontology
• A component for populating the knowledge graph from data streams, called SMASSIF-RML: https://github.com/Orange-OpenSource/SMASSIF-RML
• A semantic bus manager called ssb-consum-up: https://github.com/Orange-OpenSource/ssb-consum-up

Symbolic AI: the maestro of network supervision

Operating large-scale network infrastructures (fixed and mobile telephony, Internet access provision, national and international data exchanges) generally involves managing complex situations and anomalies, such as cascading failures or cybersecurity attacks, which have in common that they affect several platforms, services and network layers simultaneously. Detecting and diagnosing anomalies in this type of context can be difficult and time-consuming, resulting in long recovery times and the involvement of a large number of collaborators. To manage these situations efficiently, while meeting service quality and security objectives, supervision teams need a global view of the infrastructure and of the interactions between the various resources and services that make it up. Obtaining this overall view is complex and requires three challenges to be met:

• How do you build up an overview of a heterogeneous set of resources and services generating data (alarms, technical logs, etc.) in rich and varied formats? To be effective in the face of this heterogeneity (multiple layers, services and network manufacturers/hardware), the teams in charge of network supervision need knowledge of many technical areas, including networks over which they do not necessarily have control and which may cause cascading breakdowns or suffer their consequences.
• How do you understand a situation hidden in colossal volumes of events? It is difficult to identify and understand a complex situation in such a context because the supervision operator faces cognitive overload. In addition, there are as many supervision tools as there are technological platforms to supervise, which slows down incident resolution even more. To properly understand (diagnose) and react (correct), operators need a representation of how the systems work that adapts to each context.
• What is the right method for identifying malicious activity or anomalies that are complex or spread over time? This type of pattern cannot be detected using simple methods (e.g. logical rules, correlation, etc.), and failure to detect it can cause downtime and breakdowns lasting several hours or even days.

To meet these challenges, the NORIA solution draws on a global and integrated vision of the network life cycle, enriched by indicators generated by artificial intelligence (AI) algorithms in order to deal more effectively with complex situations. This data reconciliation is achieved using a massive knowledge graph structured by an ontology, called NORIA-O, modelling the temporal, structural, procedural and dynamic aspects of networks (see https://hellofuture.orange.com/fr/le-sens-du-sens-les-ontologies-ce-nest-pas-que-de-la-philosophie/ for a more detailed presentation of the concepts of knowledge graph and ontology). Approaches based on deep learning, inference or graph querying are used to highlight abnormal situations requiring the operator’s attention.

Unleashing the power of the knowledge graph

In the journey from the data produced by the network to the generation of relevant indicators for the operator, the first step is to represent the network infrastructure using a knowledge graph. This involves modelling and recording all the data needed to operate the networks in a format that can be interpreted simply and unambiguously, whatever the source of the data or the background of the employee consulting it.

To achieve this, NORIA provides a reconciled view of data from multiple platforms, services and network layers. This massive knowledge graph is structured by the NORIA-O ontology (illustrated below with the main concepts of the ontology in yellow), which models the temporal and dynamic aspects of networks (i.e. the activity of the network, including events and incident tickets), the structural aspects (i.e. the topology of the infrastructure with its physical and virtual resources and network links), the procedural aspects (i.e. the procedures that can be carried out on the network, such as remediation operations) and the functional aspects (i.e. the services and applications supported by the infrastructure):

This ontology, available in open source (https://w3id.org/noria/), provides a unified and standardized way of modelling knowledge about network infrastructures. The notion of semantic elevation is applied here to move from a universe in which a concept can be expressed in different ways (depending on data formats, equipment, etc.) to a universe in which knowledge is represented at a conceptual level, abstracting from any syntactic materialization. Furthermore, knowledge graph modelling provides a structured format that can be understood by human operators using the knowledge graph and interpreted by automatic analysis algorithms.
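As a rough illustration of semantic elevation, the sketch below maps two differently formatted "interface down" events onto the same conceptual description. It is a minimal sketch in plain Python, not NORIA's implementation: the `https://example.org/net#` namespace, the property names and the two source formats are assumptions made for the example and are not taken from NORIA-O.

```python
EX = "https://example.org/net#"  # placeholder namespace, not the NORIA-O IRI

# Each source names the same phenomenon differently (assumed sample values).
EVENT_TYPE_BY_SOURCE = {
    ("syslog", "link down"): "InterfaceDown",
    ("snmp", "linkDown"): "InterfaceDown",
}

def elevate(source, event_id, host, raw_type):
    """Return a set of (subject, predicate, object) triples for one raw event."""
    event = f"{EX}event/{event_id}"
    resource = f"{EX}resource/{host.lower()}"  # normalise host spelling
    concept = EX + EVENT_TYPE_BY_SOURCE[(source, raw_type)]
    return {
        (event, "rdf:type", EX + "EventRecord"),
        (event, EX + "type", concept),
        (event, EX + "originatingResource", resource),
    }

def describe(triples):
    """Drop the event identifier to compare what two events mean."""
    return {(p, o) for _, p, o in triples}

syslog_triples = elevate("syslog", "e1", "SRV_TST_1", "link down")
snmp_triples = elevate("snmp", "e2", "srv_tst_1", "linkDown")

# Both events now carry the same conceptual description.
same_meaning = describe(syslog_triples) == describe(snmp_triples)
```

Once elevated, the two events differ only by their identifier, which is what allows a single query or algorithm to process both sources.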

Other strengths of NORIA-O include the use of Semantic Web technologies (including Linked Open Data models and vocabularies) and the contribution of more than 150 network experts from the Orange Group to its construction. NORIA-O thus emerges from a shared and consensual vision of network infrastructures, which favours its reuse in numerous use cases and within numerous entities (both within the Orange Group and externally). The ontology is also in line with international recommendations on telecommunications, through its alignment with TM Forum and W3C standards. This alignment supports the model's potential for wider adoption.

From data to knowledge: integrating and mapping network data

Once the network infrastructure has been modelled with the ontology, the second stage consists of ingesting and transforming the data held in the information system in order to dynamically construct the knowledge graph, which then becomes a digital twin of the infrastructure. To achieve this, the NORIA team has developed an automatic data integration pipeline based entirely on open-source technologies, some of which our research teams also contribute to, for example SMASSIF-RML (https://github.com/Orange-OpenSource/SMASSIF-RML) and ssb-consum-up (https://github.com/Orange-OpenSource/ssb-consum-up). To go from data sources to salient information provided to operators, NORIA relies on a robust pipeline made up of four stages, illustrated in the figure below.

The first stage, data collection, involves interfacing with data repositories or data streams in order to extract the raw material needed to build the knowledge graph: resources and topology, logs and alarms, incident tickets, scheduled work, organisational information, etc.

In the second stage, this data is annotated: it is labelled with concepts and properties from the ontology in order to map each data source (and each data format) onto the graph. To do this, the RML language (https://rml.io/) is used to write transformation rules that take us from a heterogeneous world to one that is reconciled and standardized in RDF, one of the Semantic Web standards. After this stage, the knowledge graph is a realistic representation of the various aspects of the supervised network infrastructure.

The next two stages aim to exploit this structure to best serve the needs of operators. The inference stage consolidates the knowledge graph via validation processes using W3C standards such as SHACL (constraint checking, https://www.w3.org/TR/shacl/) and enriches it with new facts generated by inference or detection processes. The different approaches used are detailed in the next section of this post.

Finally, in the last stage, the operator queries the knowledge graph, either directly using the SPARQL query language (https://www.w3.org/TR/sparql11-query/) or via the NORIA UI interface for a more intelligible and concise rendering of the information. This graphical interface is a toolbox enabling the operator to efficiently search for weak signals in the large masses of data present in the knowledge graph and then carry out an investigation to pinpoint the cause(s) of the problem.
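To give an idea of what constraint checking in the inference stage looks for, here is a minimal sketch in plain Python. It mimics the spirit of a SHACL cardinality constraint (`sh:minCount` / `sh:maxCount`) but does not use SHACL itself. The `ex:` names, the sample triples and the rule "every alarm refers to exactly one resource" are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sample graph as (subject, predicate, object) triples.
TRIPLES = {
    ("ex:alarm_1", "rdf:type", "ex:Alarm"),
    ("ex:alarm_1", "ex:onResource", "ex:srv_tst_1"),
    ("ex:alarm_2", "rdf:type", "ex:Alarm"),  # no resource attached
}

def find_violations(triples, target_class, predicate, min_count=1, max_count=1):
    """Return instances of target_class whose number of `predicate`
    values falls outside [min_count, max_count]."""
    targets = {s for s, p, o in triples if p == "rdf:type" and o == target_class}
    counts = defaultdict(int)
    for s, p, _ in triples:
        if s in targets and p == predicate:
            counts[s] += 1
    return sorted(s for s in targets if not (min_count <= counts[s] <= max_count))

# Assumed rule: every alarm must refer to exactly one resource.
violations = find_violations(TRIPLES, "ex:Alarm", "ex:onResource")
```

In a real deployment such rules would be expressed declaratively as SHACL shapes and evaluated by a SHACL engine rather than hand-written code.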

4,000,000 facts about the Group’s network infrastructures in the NORIA-KG knowledge graph

Neuro-symbolic AI to the rescue of operators

Having built a knowledge graph mirroring the network infrastructure, the next step is to analyze the knowledge in this graph to help network experts in their day-to-day work. An anomaly such as a breakdown or malicious activity can originate from several different sources. It is therefore necessary to combine several analysis approaches to help diagnose the cause of the anomaly and choose the appropriate solution to remedy it. The figure below illustrates the three analysis modes currently offered by the NORIA solution.

The aim of the “model-based design” approach is to translate the experts’ knowledge of particular situations into a query that can be used to interrogate the graph. More specifically, it involves encoding business rules into SPARQL queries aimed at identifying certain patterns in the knowledge graph. In the example shown in the figure, model-based design is used to identify the particular pattern of an application “app_tst” currently supported by two resources “srv_tst_1” and “srv_tst_2”, one of which is faulty (“InterfaceDown” alarm associated with the “srv_tst_1” node). This pattern is highlighted in the graphical interface (via the “AtRisk50%” alarm) so that the operator takes this significant information into account. This type of approach can also be used to detect abnormal modifications to user rights or the absence of traffic on a normally active network interface.
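The “app_tst” example can be sketched as a small pattern search. In NORIA such rules are encoded as SPARQL queries; the version below is a simplified plain-Python equivalent over a list of triples. The property names (`supportedBy`, `onResource`, `type`) are hypothetical, and reading “AtRisk50%” as the share of faulty supporting resources is our assumption, not something stated by the NORIA documentation.

```python
# Hypothetical triples describing the example situation.
TRIPLES = [
    ("app_tst", "supportedBy", "srv_tst_1"),
    ("app_tst", "supportedBy", "srv_tst_2"),
    ("alarm_1", "type", "InterfaceDown"),
    ("alarm_1", "onResource", "srv_tst_1"),
]

def at_risk_applications(triples):
    """Flag applications for which some supporting resources are faulty."""
    supported_by = {}
    for s, p, o in triples:
        if p == "supportedBy":
            supported_by.setdefault(s, set()).add(o)

    # Resources that carry at least one InterfaceDown alarm.
    down_alarms = {s for s, p, o in triples if p == "type" and o == "InterfaceDown"}
    faulty = {o for s, p, o in triples if p == "onResource" and s in down_alarms}

    flags = {}
    for app, resources in supported_by.items():
        down = resources & faulty
        if down:
            # Assumed meaning of the label: percentage of faulty resources.
            flags[app] = f"AtRisk{100 * len(down) // len(resources)}%"
    return flags

flags = at_risk_applications(TRIPLES)
```

Encoding the rule as a query over the graph, rather than inside each monitoring tool, means the same rule applies whatever the origin of the alarm.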

The “process mining” approach then aims to give meaning to a sequence of events. With the help of experts or through machine learning, NORIA has built up a database of sequence diagrams that can be searched for in the graph in order to associate a meaning with a sequence of events or, even better, to understand the causes of the phenomenon. In this example, process mining enables us to trace the probable cause of the unavailability of resource “srv_tst_1”: the “TimeOut” alarm on router “rt_tsi_1”.
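The idea of searching for a known sequence in a stream of events can be illustrated as follows. This is a deliberately simplified sketch: a pattern is reduced to an ordered list of event types matched as a subsequence, whereas real process-mining models are richer. The timestamps, the unrelated “LoginFailure” event and the pattern name `upstream_router_failure` are invented for the example.

```python
# Hypothetical library of known sequences (name -> ordered event types).
PATTERNS = {
    "upstream_router_failure": ["TimeOut", "InterfaceDown"],
}

# Events as (timestamp, resource, event_type); values are illustrative.
EVENTS = [
    (3, "srv_tst_1", "InterfaceDown"),
    (1, "rt_tsi_1", "TimeOut"),
    (2, "srv_tst_2", "LoginFailure"),  # unrelated noise
]

def match_pattern(events, pattern):
    """Return the events matching `pattern` as an ordered subsequence, or None."""
    matched, idx = [], 0
    for event in sorted(events, key=lambda e: e[0]):  # chronological order
        if idx < len(pattern) and event[2] == pattern[idx]:
            matched.append(event)
            idx += 1
    return matched if idx == len(pattern) else None

match = match_pattern(EVENTS, PATTERNS["upstream_router_failure"])
# Under this simple model, the first matched event is the probable cause.
probable_cause = match[0] if match else None
```

Here the earliest matched event, the “TimeOut” on “rt_tsi_1”, is reported as the probable cause of the later “InterfaceDown” on “srv_tst_1”.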

Finally, the last approach, known as “statistical learning”, consists of comparing sub-graphs by means of a mathematical operation that embeds the nodes and their relationships in a vector space. This makes it possible to carry out such comparisons efficiently while taking into account the semantics of the knowledge graph. More concretely, this type of operation can be used, for example, to compare a new situation facing the operator (which constitutes a sub-graph) with similar past situations (also sub-graphs) for which a team was able to propose satisfactory solutions.
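The comparison of situations in a vector space can be sketched as follows. The three-dimensional vectors are hand-written placeholders, whereas a real system would learn embeddings from the knowledge graph. Averaging node vectors to represent a sub-graph and ranking by cosine similarity are assumptions chosen for simplicity, and the ticket names are hypothetical.

```python
import math

# Placeholder embeddings; in practice these would be learned from the graph.
NODE_VECTORS = {
    "InterfaceDown": [1.0, 0.0, 0.0],
    "TimeOut": [0.8, 0.2, 0.0],
    "LoginFailure": [0.0, 0.0, 1.0],
    "RightsChanged": [0.0, 0.2, 0.9],
}

# Past situations, each reduced to the nodes of its sub-graph (hypothetical).
PAST_SITUATIONS = {
    "ticket_A": ["InterfaceDown", "TimeOut"],
    "ticket_B": ["LoginFailure", "RightsChanged"],
}

def embed(nodes):
    """Represent a sub-graph as the mean of its node vectors."""
    vectors = [NODE_VECTORS[n] for n in nodes]
    return [sum(component) / len(vectors) for component in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is null)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def most_similar(new_nodes, past):
    """Return the name of the past situation closest to the new one."""
    target = embed(new_nodes)
    return max(past, key=lambda name: cosine(target, embed(past[name])))

best = most_similar(["TimeOut"], PAST_SITUATIONS)
```

The operator could then be pointed to the resolution recorded for the closest past situation, here the hypothetical `ticket_A`.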

Demonstrating NORIA’s capabilities

To make the capabilities of the NORIA approach available to operators, our teams have developed an interface called NORIA UI, which aims to relieve operators of the cognitive overload resulting from the complexity of situations and the volume and heterogeneity of events generated by network infrastructures. This interface prefigures the next generation of supervision systems designed to meet the needs of network incident managers and cybersecurity analysts. NORIA UI takes the form of a graphical dashboard offering a dynamic, cross-referenced view of incidents, network resources, services and events.

A demonstration of the tool’s features was given at the Orange Open Tech Days and recorded on video.

Several features are illustrated, including:

• The ability of the solution to produce an overview of a situation described in logs from heterogeneous sources, thanks to the reconciliation performed by the data ingestion pipeline and the knowledge graph.
• Advanced analysis of the activities taking place on the networks using approaches that take advantage of the structure of the knowledge graph (interrogation through queries), process modelling approaches and machine learning techniques.
• The tool’s collaborative and visualization features. To this end, the interface offers a notebook mechanism enabling operators to carry out their investigations more efficiently by storing the entities relevant to the case under investigation in a workspace. They can then share this notebook with other operators via export and sharing mechanisms.

Conclusion

NORIA is paving the way for a future generation of supervision tools based on knowledge graphs and machine learning. To achieve this objective, NORIA relies on a strong and rich ecosystem. First of all, NORIA encourages the adoption of Semantic Web standards and relies on international standards such as those of the W3C and the TM Forum. In addition, the NORIA project is a consumer of open-source technologies but also, and above all, a contributor. Finally, NORIA benefits from scientific collaboration with the EURECOM laboratory (https://www.eurecom.fr/) on the subjects of knowledge representation and techniques for detecting anomalies in knowledge graphs.

NORIA’s future is firmly focused on turning the results of this research into a product that can be used by network operators in the Orange Group. With more than fifteen data sources already integrated into its multi-faceted knowledge graph, NORIA can offer innovative solutions to several entities, including the Orange France teams and the Orange Cyberdefense teams for Cyber Threat Intelligence use cases.

• [ARES 2023] Lionel Tailhardat, Raphaël Troncy and Yoan Chabot. Leveraging Knowledge Graphs For Classifying Incident Situations in ICT Systems. In 4th International Workshop on Graph-based Approaches for CyberSecurity (GRASEC), 18th International Conference on Availability, Reliability and Security (ARES’23), Benevento, Italy, August 29-September 1st, 2023.
• [ESWC 2023] Lionel Tailhardat, Raphaël Troncy and Yoan Chabot. Designing NORIA: a Knowledge Graph-based Platform for Anomaly Detection and Incident Management in ICT Systems. In 4th International Workshop on Knowledge Graph Construction (KGCW’23), Extended Semantic Web Conference (ESWC’23), Crete, May 28-June 1st, 2023.

Lexicon

knowledge graphs

Representation of a domain of knowledge using a graph in which the nodes are concepts and the arcs are relationships between these concepts.

EURECOM

An engineering school and a research centre in the digital sciences, organised as a Groupement d’Intérêt Economique (GIE), bringing together international academic and industrial partners.