How do self-driving cars perceive their immediate environment and respond to it faster than humans? This capability is made possible by edge AI, a technology that allows machines to make decisions locally, without relying on the cloud. Edge AI rests on a simple design principle: locate AI model training and inference as close as possible to the data sources, combining AI with edge computing at the periphery of the network.
Instead of sending huge flows of data to remote datacenters, AI models are trained and deployed on devices close to end users, whether smartphones, onboard automotive systems, or servers at customer sites. This proximity has many advantages: it cuts latency, guarantees resilience in degraded environments, and reinforces privacy by keeping sensitive data local, all while enabling real-time decisions. Beyond these operational benefits, edge AI has a major impact on power use. Avoiding data transfers to the cloud limits reliance on power-hungry infrastructure, and running AI models on optimized local architectures reduces the carbon footprint of systems while maintaining competitive performance.
Inference performed directly on devices
In self-driving cars, onboard systems must analyze data from multiple sensors, detect obstacles, classify objects, and compute trajectories within a few milliseconds. A round trip to the cloud can't safely meet this need for fast processing. With edge AI, inference, the step in which a trained model makes predictions from new data, is performed onboard the vehicle or in a nearby datacenter. This minimizes latency and preserves data integrity while reducing network traffic, which in turn cuts energy use.
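As a minimal sketch of what on-device inference can look like, the Python snippet below loads a compiled model with TorchScript and runs a prediction entirely locally, timing the result. The model file name, input shape, and random "frame" are hypothetical placeholders standing in for a real perception model and live sensor data.

```python
import time
import torch

# Load a compiled TorchScript model from local storage; no network
# round-trip is needed at inference time. "obstacle_detector.pt" is a
# hypothetical file name used for illustration.
model = torch.jit.load("obstacle_detector.pt")
model.eval()

# A synthetic camera frame standing in for live sensor data
# (batch of 1, 3 color channels, 224x224 pixels).
frame = torch.rand(1, 3, 224, 224)

with torch.no_grad():
    start = time.perf_counter()
    prediction = model(frame)  # inference runs entirely on-device
    latency_ms = (time.perf_counter() - start) * 1000

print(f"Local inference took {latency_ms:.1f} ms")
```

Because the model and the data never leave the device, the latency measured here is bounded by local compute rather than by network conditions.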
To perform these tasks, edge AI relies on machine learning techniques optimized for local constraints, as sketched below. One such optimization is pruning, which removes low-impact connections from a neural network to shrink the model with little loss of accuracy. Another widely used method is quantization, which lowers the numerical precision of a model's parameters, for example from 32-bit floating point to 8-bit integers, so it fits on lightweight, power-efficient hardware.
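To make these two optimizations concrete, here is an illustrative sketch using PyTorch's built-in utilities: unstructured magnitude pruning via `torch.nn.utils.prune`, followed by post-training dynamic quantization via `torch.ao.quantization.quantize_dynamic`. The toy network and the 50% pruning ratio are arbitrary choices for the example, not a recipe from the article.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small toy network standing in for a model headed to an edge device.
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

# Pruning: zero out the 50% of weights with the smallest L1 magnitude
# in each Linear layer, then make the sparsity permanent.
for module in model:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # bake the mask into the weights

# Quantization: convert Linear weights from 32-bit floats to 8-bit
# integers after training, shrinking the model for lightweight hardware.
quantized_model = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# The optimized model still produces predictions from new data.
output = quantized_model(torch.rand(1, 128))
print(output.shape)  # torch.Size([1, 10])
```

In practice the two techniques are complementary: pruning reduces the number of effective parameters, while quantization reduces the cost of storing and computing with each one.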