
AI: From Perception to Physical Reality

27th January 2025

Submitted by:

Jordan

Generative AI is demonstrating its potential to revolutionise industrial robotics, paving the way for more intelligent, adaptable and autonomous machines. The journey from Perception AI to Physical AI, powered by Generative AI, is chronicled by Hannover Messe in the February 2025 issue of ISMR and marks a pivotal moment in robotics and automation.

====

A modern automotive assembly line is a marvel of engineering. Suddenly, it grinds to a halt. What happened? A misaligned car door caught the pre-programmed robots off guard. It is a simple anomaly, but one typical of today's factories, and it highlights the current limitation of industrial robots: they excel in controlled environments but fall short in unexpected situations.

This is where Artificial Intelligence, especially Generative AI, comes in. It promises a new era of intelligent automation, potentially creating a new breed of intelligent, adaptable and autonomous industrial robots.

For decades, industrial robots have improved manufacturing efficiency. But so far, they have been limited to pre-determined actions in highly controlled environments. They are sophisticated tools that lack the intelligence to truly understand and respond to the complexities of the real world. In a recent keynote, Jensen Huang (CEO of leading AI chipmaker NVIDIA) outlined this transformative vision: AI evolving from mere perception to embodying physical action, a concept he calls "Embodied AI" or "Physical AI".

Perception AI in industrial robotics

Perception AI, an earlier form of AI (sometimes intertwined with ‘classical AI’), entailed giving machines the ability to sense and interpret their environment. It gave robots their ‘eyes’ and ‘ears’. For industrial applications, it manifested in several ways, as detailed below.

  • Computer vision: As perhaps the most prominent example, camera-equipped robots could "see" and interpret images. This revolutionised quality control, with robots inspecting products faster and with more precision than humans.
  • Sensors for object recognition and localisation: Robots started using proximity sensors, laser scanners and RFID readers to identify and locate objects, crucial for pick-and-place and sorting tasks.
  • Basic navigation for AGVs: Automated Guided Vehicles (AGVs) used sensors to navigate factory floors, automating material transport.
  • Machine learning for predictive maintenance: Sensors embedded in machinery could feed data (vibration, temperature, sound) to machine learning algorithms, detecting patterns indicating impending equipment failure. This predictive maintenance allowed proactive repairs, minimising downtime.
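The predictive-maintenance idea above can be sketched in a few lines. This is a minimal, hypothetical illustration using a simple statistical rule (a reading far from the recent rolling average is flagged); a production system would use a trained machine-learning model and real sensor streams rather than the made-up vibration values shown here.

```python
import statistics

def detect_anomalies(readings, window=5, threshold=3.0):
    """Flag readings that deviate strongly from the recent average.

    A reading is anomalous when it lies more than `threshold` standard
    deviations from the mean of the preceding `window` readings.
    """
    anomalies = []
    for i in range(window, len(readings)):
        recent = readings[i - window:i]
        mean = statistics.mean(recent)
        stdev = statistics.stdev(recent) or 1e-9  # guard against zero spread
        if abs(readings[i] - mean) / stdev > threshold:
            anomalies.append(i)
    return anomalies

# Steady vibration followed by a sudden spike (hypothetical sensor values)
data = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.1, 5.0, 1.0]
print(detect_anomalies(data))  # the spike at index 8 is flagged
```

Even this crude rule captures the essence of predictive maintenance: spotting the pattern that precedes failure early enough to schedule a proactive repair.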

Perception AI brought significant benefits but also had its limitations. Systems were often designed for very specific tasks and struggled to generalise. Training also required massive amounts of labelled data, which was expensive and time-consuming to acquire. Robots could recognise patterns, but they lacked a deeper understanding of context. They could identify a "screw" but didn't understand its function. Unexpected variations, such as a change in lighting, could easily throw the system off track.

Perception AI was a crucial first step, providing foundational capabilities, but it lacked the flexibility, adaptability and true understanding needed to unlock automation's full potential.

Generative AI (where we are now)

While Perception AI gave robots basic senses, Generative AI is poised to elevate them to a whole new level of intelligence. This technology marks a fundamental shift from simply recognising patterns to understanding, creating and even reasoning. Unlike traditional AI, focused on analysis and prediction, Generative AI can create new content, analyse intricate patterns and make decisions based on a more nuanced understanding of context.

Concretely, Large Language Models (LLMs), like OpenAI's GPT models, have captured the public's imagination, showcasing Generative AI's power to understand and generate human-like text. LLMs are a crucial stepping stone, demonstrating AI's potential to understand complex information and generate appropriate responses – essential for intelligent robots.

Importantly, Generative AI's core principles are extending beyond language to other modalities critical for robotics. Researchers are developing models that can create and manipulate images, videos, 3D models and even robot trajectories.

We are already seeing early industrial applications of Generative AI, for example:

  • Generating synthetic training data: Generative AI can create synthetic data that is as effective as (or more effective than) real-world data for training, especially for perception tasks.
  • Optimising designs: Generative AI can explore a vast design space, generating numerous variations of a product or process layout and evaluating their performance.
  • Simulating complex scenarios: Generative AI allows the creation of realistic virtual environments for testing and training robots in various scenarios, including dangerous or expensive ones.
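The synthetic-data idea in the first bullet can be illustrated with a toy sketch. The example below simply perturbs a handful of real part measurements with random noise to expand a small training set; the part names, dimensions and noise level are all hypothetical, and a real system would use a learned generative model rather than this hand-coded jitter.

```python
import random

def synthesise_samples(real_samples, n_new, noise=0.05, seed=42):
    """Expand a small set of real measurements with synthetic variants.

    Each synthetic sample is a real one perturbed by Gaussian noise,
    mimicking the natural variation a robot's vision system would see.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(real_samples)
        synthetic.append([x + rng.gauss(0, noise * abs(x)) for x in base])
    return synthetic

# Three measured parts (hypothetical length/width/hole-diameter in mm)
real = [[120.0, 45.0, 8.0], [119.8, 45.2, 8.1], [120.1, 44.9, 7.9]]
augmented = real + synthesise_samples(real, n_new=20)
print(len(augmented))  # 23 training samples grown from 3 real parts
```

The payoff is the same one the bullet describes: far less expensive, time-consuming labelled data needs to be collected from the physical line.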

While still in its early stages, particularly in industrial settings, Generative AI is demonstrating its potential to revolutionise robotics, paving the way for more intelligent, adaptable and autonomous machines. This sets the stage for Agentic and Physical AI.

Agentic and Physical AI (where we are going)

Building on Generative AI, we are now entering an era where the digital and physical worlds blur, giving rise to Agentic and Physical AI – the next frontier in intelligent robotics.

Agentic AI is the crucial step beyond Generative AI. AI systems evolve from passive content generators to active AI agents that can make decisions, plan and pursue goals. Imagine a robot that understands its environment and can formulate a plan to achieve an objective, adapt to changes and learn from experiences. This involves:

  • Reasoning and planning: Generative AI models will be further developed to enable robots to reason about complex situations, analyse tasks, break them down, determine optimal action sequences and anticipate problems.
  • Reinforcement Learning (RL) enhanced by GenAI: Generative models can create more realistic and dynamic training environments for RL agents, allowing them to learn complex behaviours and generalise better to real-world scenarios.
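The "reasoning and planning" bullet above can be made concrete with a minimal task-decomposition sketch. The task names and the hand-written decomposition table below are hypothetical; in an agentic system, a generative model would propose the decompositions instead of a fixed lookup table.

```python
def plan(goal, methods, primitives):
    """Recursively decompose a goal into a flat sequence of primitive actions.

    `methods` maps a compound task to its ordered subtasks; `primitives`
    is the set of directly executable actions.
    """
    if goal in primitives:
        return [goal]
    steps = []
    for subtask in methods[goal]:
        steps.extend(plan(subtask, methods, primitives))
    return steps

# Hypothetical task library for a door-fitting robot
methods = {
    "fit_door": ["fetch_door", "align_door", "fasten_door"],
    "fasten_door": ["insert_bolts", "torque_bolts"],
}
primitives = {"fetch_door", "align_door", "insert_bolts", "torque_bolts"}
print(plan("fit_door", methods, primitives))
# ['fetch_door', 'align_door', 'insert_bolts', 'torque_bolts']
```

This captures the core loop of agentic behaviour: analyse a task, break it down and determine an action sequence, with adaptation coming from regenerating the plan when the environment changes.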

Physical AI represents the ultimate goal: the seamless integration of perception, agentic reasoning and physical embodiment. It is about intelligently acting in the physical world, not just reacting to it. Generative and Agentic AI provide the cognitive foundation, the ‘brain’, empowering robots to operate autonomously and effectively in real-world settings.

Several key players are driving this progress, developing the necessary hardware and software ecosystems for training and deploying Physical AI. Examples include NVIDIA's "Project GR00T" which aims to create a general-purpose foundation model for humanoid robots. Its Omniverse Isaac platform allows for simulated training of these robot ‘brains’ and facilitates a connection to their physical counterparts, enabling a closed loop learning system. Similarly, Tesla's Optimus humanoid robot demonstrates significant advancements in dexterity and control capabilities, powered by AI trained on real-world data.

These efforts, along with those of other robotics and AI companies, are collectively pushing the boundaries of what is possible, each contributing unique approaches and technologies to the development of truly embodied AI. This is rapidly accelerating the arrival of a future where Physical AI becomes a realistic scenario.

Transformative impact of Physical AI

Physical AI would unlock a new era of industrial automation. Here are just some examples:

  • Truly collaborative robots (Cobots): Robots working alongside humans as true collaborators, understanding human intentions, anticipating needs and adapting behaviour accordingly.
  • Autonomous factories: Factories of the future could be largely autonomous, with Physical AI-powered robots handling tasks from assembly and inspection to logistics and maintenance, with minimal human intervention.
  • Adaptive manufacturing: Robots quickly reconfiguring themselves to produce different products or handle variations in materials, leading to more flexible and agile manufacturing.
  • Self-optimising systems: Physical AI robots continuously monitoring their performance, identifying areas for improvement, and optimising actions over time.

Challenges and actions

To read the rest of this article in the February 2025 issue of ISMR, see https://joom.ag/AbMd/p22

"Integrate to innovate"

TIMTOS, a major global smart manufacturing and machine tool exhibition in Taipei (Taiwan), will take

Submitted by:

Sara Waddington

Meeting market challenges

ISMR sat down with Voortman’s Head of Commerce, Rutger Voortman, to discover how Voortman’s philosophy

Submitted by:

Sara Waddington

Rapid process change

Spanish manufacturer, Viñolas Metall, is gearing up for Industry 5.0 with new welding solutions from

Submitted by:

Sara Waddington

The robotics race

We highlight market forecasts, the latest trends and the speed of adoption for industrial robotics

Submitted by:

Sara Waddington

Facing the future: AI in focus

ISMR flew to Germany to discover the latest artificial intelligence (AI) developments from sheet metal

Submitted by:

Sara Waddington

Making waste work

Building sustainability into waste management processes will help metal workshops to survive in challenging times

Submitted by:

Sara Waddington

A sustainable future

In the May 2023 issue of International Sheet Metal Review magazine, we analyse trends, drivers

Submitted by:

Sara Waddington

First 'Made in Germany' MINI

The new, fully electric MINI model generation continues to grow. In 2023, production of the

Submitted by:

Sara Waddington

Geopolitical and economic shocks

The COVID-19 pandemic and Russia-Ukraine war profoundly reorganised global structures and relationships in 2022, and

Submitted by:

Sara Waddington

Closing the loop

Gunnar Groebler, Chairman of the Executive Board of Salzgitter AG, and KHS Managing Director, Kai

Submitted by:

Sara Waddington

A strong focus

In the November 2024 issue of ISMR, Matthias Huber, CEO, RAS Reinhardt Maschinenbau GmbH, outlines

Submitted by:

Sara Waddington

Don’t miss EuroBLECH 2024

Don’t miss the 27th International Sheet Metal Working Technology Exhibition, EuroBLECH 2024, from 22-25 October

Submitted by:

Sara Waddington

An eye on innovation

An eye on innovation

This April’s MACH 2024 trade show in the UK featured live

Submitted by:

Sara Waddington

"Integrate to innovate"

TIMTOS, a major global smart manufacturing and machine tool exhibition in Taipei (Taiwan), will take

Submitted by:

Sara Waddington

A high-level view

In the December 2023/January 2024 issue of ISMR, we highlight trends, drivers, opportunities and challenges

Submitted by:

Sara Waddington

The Gender Gap

To celebrate International Women’s Day on 8 March 2023, the April issue of ISMR features

Submitted by:

Sara Waddington

Geopolitical and economic shocks

The COVID-19 pandemic and Russia-Ukraine war profoundly reorganised global structures and relationships in 2022, and

Submitted by:

Sara Waddington

Hydraulic Press

Pressing for action

We highlight an alphabetical selection of the latest new global stamping, tooling and pressing innovations

Submitted by:

Jordan

A design-driven process

Additive manufacturing revolves around the creation of an object by generating one layer at a

Submitted by:

Sara Waddington

Forming the future

Deep drawing is a manufacturing process in which sheet metal is progressively formed into a

Submitted by:

Sara Waddington

Pressing for action

ISMR highlights an alphabetical selection of the latest new global stamping, tooling and pressing innovations

Submitted by:

Sara Waddington