● Unlike ChatGPT, LightOn guarantees data security and digital sovereignty by deploying its services on-site and not in the cloud.
● For Laurent Daudet, the generative AI specializations now emerging in human resources, marketing, sales and other functions are no longer the sole preserve of IT departments.
LightOn has announced a dedicated generative AI tool for businesses. How did this project come about?
Initially, the four of us were working on a research project combining expertise in acoustics and photonics, trying to understand how machine learning could be used to improve computational imaging. We ended up inventing a novel calculation method and our own photonic processor. We then built our first language model, christened PAGnol, which we ran on the Jean Zay supercomputer. Around the same time, Jean Zay received a request from Hugging Face, which took the lead in the development of BLOOM. It is a magnificent research project, although it is not designed to go into production in companies.
When OpenAI released GPT-3, which can solve tasks it has not previously been trained on, we realized that large language models were going to have a major impact on every text-based profession. So we changed course in the summer of 2020 and developed our own model, which has 40 billion parameters. The project was not supposed to become a product, but companies were quick to ask us to integrate it.
AI experts are emerging in different customer business units, whether they be in human resources or procurement
What business tasks can be fully automated?
The requests are quite diverse, because companies are seeking productivity gains in a wide range of areas: marketing, human resources, sales and even R&D. For example, large language models have a better grasp of the context of an interaction than traditional marketing systems. They work like chatbots, but with context: in customer service, for instance, they can classify a user's request on the basis of previous interactions.
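To make this concrete, here is a minimal sketch of what context-aware request classification can look like. It is purely illustrative, not LightOn's implementation: the categories, the prompt wording and the `complete` callable (standing in for whatever language-model endpoint is used) are all assumptions.

```python
# Illustrative sketch: classify a customer-service request by giving the
# language model the previous interactions as context.

REQUEST_CATEGORIES = ["billing", "delivery", "technical issue", "account", "other"]

def build_classification_prompt(history: list[str], new_message: str) -> str:
    """Assemble a prompt that exposes prior exchanges so the model can use context."""
    context = "\n".join(f"- {turn}" for turn in history)
    return (
        "Previous interactions with this customer:\n"
        f"{context}\n\n"
        f"New message: {new_message}\n\n"
        f"Classify the new message into one of: {', '.join(REQUEST_CATEGORIES)}.\n"
        "Answer with the category name only."
    )

def classify_request(complete, history: list[str], new_message: str) -> str:
    # `complete` is a placeholder for any text-completion function, not a specific vendor API.
    return complete(build_classification_prompt(history, new_message)).strip().lower()
```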
It is interesting to see AI experts emerging in different business units among our customers, whether in human resources or procurement. It is clear that AI is no longer the sole preserve of chief information officers (CIOs). We also act as a consultant to IT services companies, which need to stay up to date on these subjects, but our business model is still based on software licensing.
What are the advantages of generative AI tools when compared to the tools that companies have been using up until now?
Tools of this type work even with data that is not structured. When they fall short, we can show them a few examples and they will pick up the patterns, whereas with traditional deep learning we had to provide huge volumes of examples. Sometimes the model has to be retrained slightly to optimize text prediction. Because this training is performed on unannotated data, we can now process vast volumes of data.
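As a rough illustration of why this retraining needs no annotation, here is a minimal sketch of continued next-token-prediction training on raw company text, using the Hugging Face Trainer. The base model name, file path and hyperparameters are placeholder assumptions, not LightOn's setup.

```python
# Continued pretraining on unannotated text: the "labels" are simply the next
# tokens of the input, so raw documents are enough.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")     # placeholder base model
tokenizer.pad_token = tokenizer.eos_token             # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

raw = load_dataset("text", data_files={"train": "company_corpus.txt"})  # hypothetical corpus
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="adapted-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # causal LM objective
)
trainer.train()
```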
To achieve this type of result, our model was trained on Common Crawl [editor's note: an index of more than five billion web pages], representing a corpus of 500 billion words. We then refined this corpus by removing anything that was not text, all advertising content, all violent or adult content and any duplicates, because exposing the model to the same data more than once introduces biases.
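The sketch below shows the spirit of that cleaning step: keep documents that look like prose, filter unwanted content, and drop exact duplicates so the model never sees the same passage twice. Real pipelines use far more sophisticated quality filters and near-duplicate detection; the heuristics and blocked terms here are illustrative assumptions only.

```python
# Illustrative corpus-cleaning sketch (not LightOn's pipeline).
import hashlib

BLOCKED_TERMS = {"casino bonus", "xxx"}  # stand-in for a real content filter

def is_mostly_text(doc: str, threshold: float = 0.8) -> bool:
    """Crude heuristic: a high share of alphabetic/space characters suggests prose."""
    if not doc:
        return False
    alpha = sum(ch.isalpha() or ch.isspace() for ch in doc)
    return alpha / len(doc) >= threshold

def clean_corpus(docs):
    seen_hashes = set()
    for doc in docs:
        if not is_mostly_text(doc):
            continue                      # drop pages that are not mainly text
        if any(term in doc.lower() for term in BLOCKED_TERMS):
            continue                      # drop unwanted content
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue                      # exact duplicate: avoid over-weighting it
        seen_hashes.add(digest)
        yield doc
```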
What differentiates LightOn from ChatGPT?
On the one hand, we help companies keep control of their costs: businesses using ChatGPT have no idea what they will have to pay at the end of the month, whereas we market our services in the form of a software licence. On the other hand, our service comes with a guarantee that data will remain private. Competing tools do not give clients control over the security and governance of their data, which is critical in sectors like banking, insurance, technology and, of course, healthcare. Our Paradigm architecture is deployed on-site, on customer infrastructure, unlike OpenAI, which is a cloud-based solution.