Artificial intelligence (AI) may have unlocked massive potential, but it has a fundamental problem: once deployed, AI models stop learning. While children continuously adapt and learn from their surroundings, modern AI models are effectively frozen in time, requiring armies of engineers to retrain them when conditions change. In a new paper, AI researchers offer a radical solution based on how humans and animals learn naturally.
The findings are from the paper, Why AI systems don’t learn and what to do about it: Lessons on autonomous learning from cognitive science (March 17, 2026), by Emmanuel Dupoux, Yann LeCun, and Jitendra Malik, affiliated with FAIR at Meta, NYU, UC Berkeley, and École des Hautes Études en Sciences Sociales respectively.
Consider how toddlers learn: they actively explore, switch seamlessly between observing others and trying things themselves, and decide for themselves what deserves their attention. If something does not work, they adjust immediately. By contrast, modern AI systems are incapable of this. Instead, they depend on what the researchers describe as ‘MLOps’ – massive pipelines in which human experts collect data, design training regimes, and rebuild models from scratch when they fail in new scenarios.
According to the paper, this creates significant limitations. AI systems trained on internet data behave unpredictably when confronted with real-world situations that differ sharply from their training data. They cannot adapt to changing environments or learn from their own mistakes. Notably, in these models all learning occurs offline, before deployment, and is handled entirely by humans.
Two key learning modes
The paper identifies two fundamental learning modes that need to work together. System A, learning from observation, covers how humans build internal models of the world by watching and predicting – everything from infants learning to recognise faces to modern self-supervised learning in AI. GPT-style text prediction and vision models trained on images fall into this category. Its main strength is that it scales well and discovers useful patterns; its weakness is that it is disconnected from action and cannot distinguish correlation from causation.
Meanwhile, System B, learning from action, covers how we learn by trial and error – reinforcement learning and goal-directed behaviour. Think of a child learning to walk through repeated attempts. Its strength is that it is grounded in real consequences and can discover genuinely new solutions; its weakness is that it is extremely sample-inefficient, requiring massive amounts of interaction.
In biology, these systems work together constantly. System A – the visual system, for example – learns compressed representations that make System B’s motor planning tractable, while one’s actions generate informative data that improves one’s perceptual models. Current AI systems treat these as separate domains with rigid, hand-designed integration.
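The two modes can be illustrated with a toy sketch. Everything here – the class names, the toy transition-counting model, and the bandit-style action learner – is an illustrative assumption, not code from the paper; it only shows the contrast between learning by predicting observations and learning by acting on rewards.

```python
import random

class SystemA:
    """Learning from observation: predict the next observation."""
    def __init__(self):
        self.counts = {}  # transition statistics: obs -> {next_obs: count}

    def observe(self, obs, next_obs):
        self.counts.setdefault(obs, {})
        self.counts[obs][next_obs] = self.counts[obs].get(next_obs, 0) + 1

    def predict(self, obs):
        nexts = self.counts.get(obs)
        if not nexts:
            return None
        return max(nexts, key=nexts.get)  # most frequent successor

class SystemB:
    """Learning from action: trial and error guided by reward."""
    def __init__(self, actions):
        self.values = {a: 0.0 for a in actions}

    def act(self, epsilon=0.1):
        if random.random() < epsilon:
            return random.choice(list(self.values))   # explore
        return max(self.values, key=self.values.get)  # exploit

    def update(self, action, reward, lr=0.5):
        self.values[action] += lr * (reward - self.values[action])

# System A passively watches a simple repeating sequence.
a = SystemA()
for obs, nxt in [("day", "night"), ("night", "day"), ("day", "night")]:
    a.observe(obs, nxt)
print(a.predict("day"))  # -> night

# System B learns by acting: action "b" yields reward 1, "a" yields 0.
b = SystemB(["a", "b"])
for act in ["a", "b", "a", "b"]:  # forced exploratory trials
    b.update(act, 1.0 if act == "b" else 0.0)
print(b.act(epsilon=0.0))  # -> b
```

Note how System A never acts – it only compresses what it sees – while System B needs repeated interaction before its value estimates favour the rewarding action, mirroring the sample-inefficiency the paper describes.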
Meta control
The researchers propose adding a System M (meta-control), an organiser that manages learning dynamically. System M monitors internal signals such as prediction errors, uncertainty, and task performance, and makes meta-decisions: what data should I pay attention to? Should I explore or exploit? Should I learn from observation or from action right now?
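A minimal sketch of what such meta-decisions might look like in code. The function name, signal names, and thresholds below are all hypothetical assumptions for illustration; the paper describes the idea, not this implementation.

```python
def meta_decide(prediction_error, uncertainty, task_performance):
    """Map internal learning signals to meta-decisions (illustrative only)."""
    decisions = {}
    # High uncertainty -> gather more information by exploring.
    decisions["policy"] = "explore" if uncertainty > 0.5 else "exploit"
    # Large prediction errors -> the world model is wrong; watch and
    # update it (System A) before acting on it (System B).
    decisions["mode"] = "observe" if prediction_error > 0.3 else "act"
    # Poor task performance -> prioritise data from the failing task.
    decisions["focus"] = "hard_examples" if task_performance < 0.8 else "new_tasks"
    return decisions

print(meta_decide(prediction_error=0.6, uncertainty=0.9, task_performance=0.4))
# -> {'policy': 'explore', 'mode': 'observe', 'focus': 'hard_examples'}
```

The point is only that meta-control sits above the learners: it does not learn the task itself but routes attention, data, and learning mode based on how learning is going.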
In humans and animals, this kind of control occurs naturally. Babies focus on faces and voices, allowing them to learn quickly. Children tend to explore when they are unsure and practise when they are confident. Even in sleep, their brains process and strengthen what they have learnt. System M would bring this ability to AI, handling tasks that humans currently perform – choosing useful data, adjusting learning steps, and switching between methods of learning. Instead of fixed training processes, the AI system would adapt on its own based on what it has learnt.
How to build an AI that learns
In the paper, the researchers propose building autonomous learning systems with a two-timescale approach that is inspired by biology – a developmental timescale and an evolutionary timescale. On a developmental timescale, an AI agent learns during its lifetime, updating Systems A and B through interaction with environments, all organised by a fixed System M.
On the evolutionary timescale, System M itself is optimised across millions of simulated lifetimes. The fitness function here rewards agents that learn quickly and robustly across diverse and unpredictable environments.
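The two timescales can be sketched as a pair of nested loops. In the toy version below, the inner "lifetime" is a bandit-like task, the only meta-parameter being evolved is an exploration rate, and fitness is total reward across two environments – all simplifying assumptions of mine, far from the paper's scale, but with the same bilevel shape: an outer loop selecting the learning machinery, an inner loop doing the learning.

```python
import random

def lifetime_score(exploration_rate, env_bias, steps=200):
    """Inner (developmental) loop: one agent's lifetime under fixed meta-parameters."""
    rng = random.Random(0)  # fixed seed so each lifetime is reproducible
    values = [0.0, 0.0]
    total = 0.0
    for _ in range(steps):
        if rng.random() < exploration_rate:
            action = rng.randrange(2)                     # explore
        else:
            action = 0 if values[0] >= values[1] else 1   # exploit
        reward = 1.0 if action == env_bias else 0.0
        values[action] += 0.1 * (reward - values[action])
        total += reward
    return total

def evolve(generations=20, population=10):
    """Outer (evolutionary) loop: select meta-parameters across many lifetimes."""
    rng = random.Random(42)
    pool = [rng.random() for _ in range(population)]
    scored = pool
    for _ in range(generations):
        # Fitness rewards agents that learn well across diverse environments.
        scored = sorted(pool, key=lambda e: -(lifetime_score(e, 0) + lifetime_score(e, 1)))
        survivors = scored[: population // 2]
        # Mutate survivors to refill the population.
        pool = survivors + [min(1.0, max(0.0, s + rng.gauss(0, 0.05))) for s in survivors]
    return scored[0]

best = evolve()
print(round(best, 2))  # an intermediate exploration rate survives selection
```

Note that a zero exploration rate scores perfectly in one environment but fails completely in the other, so selection favours agents that keep exploring – a tiny echo of the fitness function described above, which rewards learning quickly and robustly across unpredictable environments.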
According to the researchers, this would require running massive numbers of simulated agents through entire learning lifecycles, which would be computationally demanding but could be transformative. Simply put, just as evolution shaped human learning instincts over millions of years, evolutionary algorithms could discover effective meta-control policies.
Why does this matter?
This matters because current AI fails when deployed outside controlled environments, as it cannot adapt. Autonomous learning would allow robots that improve from experience, AI systems that handle unexpected situations, and models that learn continuously, as humans do.
According to the researchers, the challenges are considerable. The approach would need faster-than-real-time simulators with realistic physics and social dynamics, new evaluation methods that test learning ability itself, and solutions to bilevel optimisation at an unprecedented scale.
However, there are also ethical concerns: AI systems that learn and adapt on their own could behave in unexpected ways, raising questions about safety and alignment with human values. The researchers note these risks but argue that studying autonomous learning is key not just to building better AI but also to better understanding human intelligence.