This AI continuously learns from new experiences – without forgetting its past

Our brain is constantly learning. The new sandwich shop is awesome. The gas station? You’d better avoid that in the future.

Memories like these physically wire connections in the brain region that supports new learning. During sleep, the previous day’s memories are moved to other parts of the brain for long-term storage, freeing up brain cells for new experiences the next day. In other words, the brain can continuously record our daily lives without losing access to memories of what happened before.

AI, on the other hand, not so much. GPT-4 and other large language and multimodal models that have taken the world by storm are based on deep learning, a family of algorithms that roughly mimic the brain. The problem? “Deep learning systems with standard algorithms slowly lose the ability to learn,” said Dr. Shibhansh Dohare of the University of Alberta, whose team recently published the work in Nature.

The reason lies in how they are built and trained. Deep learning relies on multilayered networks of interconnected artificial neurons. When you feed the algorithms data – vast amounts of online material such as blogs, news articles, and YouTube and Reddit comments – the strength of these connections changes, so that the AI eventually “learns” patterns in the data and uses those patterns to produce eloquent answers.

But these systems are essentially brains frozen in time. Tackling a new task sometimes requires a whole new round of training, undoing everything previously learned and costing millions of dollars. For ChatGPT and other AI tools, this means they become increasingly outdated over time.

This week, Dohare and his colleagues found a solution to this problem. The key is to selectively reset some artificial neurons after a task, but without significantly changing the entire network – similar to what happens in the brain when we sleep.

When tested with a continuous visual learning task – such as distinguishing cats from houses, or stop signs from school buses – deep learning algorithms with selective resets easily maintained high accuracy across over 5,000 different tasks. Standard algorithms, on the other hand, quickly deteriorated, and their success rate eventually dropped to about that of a coin toss.

The strategy, called continual backpropagation, is “one of the first of a large and rapidly growing set of methods” to solve the problem of continuous learning, wrote Dr. Clare Lyle and Dr. Razvan Pascanu of Google DeepMind, who were not involved in the study.

Machine Spirit

Deep learning is one of the most popular methods for training AI. These brain-inspired algorithms consist of layers of artificial neurons that connect to form artificial neural networks.

As an algorithm learns, some connections become stronger while others become weaker. This process, called plasticity, mimics how the brain learns and optimizes artificial neural networks so they can provide the best answer to a problem.
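To make this concrete, here is a minimal sketch, in Python with NumPy, of what such a network looks like under the hood: layers of simple units whose connection weights are the only things that change during learning. The sizes and random values are illustrative choices of my own, not taken from the study.

```python
import numpy as np

# A minimal picture of an artificial neural network: layers of "neurons"
# whose connection strengths (weights) are what actually get learned.
# All sizes and values here are illustrative only.
rng = np.random.default_rng(0)
layers = [
    rng.normal(scale=0.5, size=(16, 32)),  # input layer -> hidden layer
    rng.normal(scale=0.5, size=(32, 2)),   # hidden layer -> output layer
]

def forward(x, layers):
    """Pass an input through the network, layer by layer."""
    activation = x
    for W in layers[:-1]:
        activation = np.maximum(0.0, activation @ W)  # ReLU "firing"
    return activation @ layers[-1]                    # raw output scores

x = rng.normal(size=(1, 16))      # a made-up input (e.g. image features)
print(forward(x, layers))         # two scores, one per possible answer

# "Plasticity" is simply the freedom to keep changing the numbers inside
# `layers`; training nudges them so the scores match the right answers.
```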

But deep learning algorithms are not as flexible as the brain. Once trained, their connection weights are largely fixed. Learning a new task reconfigures the weights in the existing network – and in doing so, the AI “forgets” previous experiences. In typical applications such as image recognition or speech processing, this is usually not a problem (with the caveat that the models cannot spontaneously adapt to new data). But when training and using more complex algorithms – for example, those that learn like humans and react to their environment – it is highly problematic.

To use a classic example from the gaming world: “A neural network can be trained to get a perfect score on the video game Pong. But if the same network is then trained to play Space Invaders, its performance on Pong will deteriorate considerably,” wrote Lyle and Pascanu.
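A toy version of that effect is easy to reproduce. The sketch below is a generic illustration, not the study’s code: it trains a tiny classifier on one made-up task, then on a second, conflicting one, and shows accuracy on the first task collapsing.

```python
import numpy as np

# Two made-up tasks stand in for Pong and Space Invaders: each labels
# points by the sign of feature 0, but the second task flips the labels.
rng = np.random.default_rng(0)

def make_task(sign):
    """Sample labelled examples for one toy task."""
    X = rng.normal(size=(200, 5))
    y = (sign * X[:, :1] > 0).astype(float)
    return X, y

def train(W, X, y, steps=300, lr=0.1):
    """Plain gradient-descent training of a logistic classifier."""
    for _ in range(steps):
        pred = 1.0 / (1.0 + np.exp(-(X @ W)))
        W -= lr * X.T @ (pred - y) / len(X)
    return W

def accuracy(W, X, y):
    return float(np.mean(((X @ W) > 0) == y))

task_a, task_b = make_task(+1), make_task(-1)
W = np.zeros((5, 1))
W = train(W, *task_a)
print("Task A after training on A:", accuracy(W, *task_a))  # high
W = train(W, *task_b)                                        # now learn task B
print("Task A after training on B:", accuracy(W, *task_a))  # collapses
```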

The problem, aptly called catastrophic forgetting, has troubled computer scientists for years. One simple fix is to start over: retrain the AI from scratch on a new task using a combination of old and new data. That restores the AI’s capabilities, but this nuclear option first erases everything the model had learned. And while the strategy is viable for smaller AI models, it is not practical for massive ones, such as those behind large language models.

Secure it

The new study adds to a fundamental mechanism of deep learning, a process called backpropagation. Simply put, backpropagation provides feedback to the artificial neural network. Depending on how close the output is to the correct answer, backpropagation optimizes the algorithm’s internal connections until it has learned the task at hand. With continuous learning, however, neural networks quickly lose their plasticity and are unable to learn anymore.
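For readers who want to see the mechanics, here is a bare-bones backpropagation loop on toy data. It is a standard textbook sketch, not the authors’ implementation; the network size, learning rate, and data are all placeholders.

```python
import numpy as np

# A bare-bones backpropagation loop for a one-hidden-layer network.
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(4, 16))   # input -> hidden weights
W2 = rng.normal(scale=0.5, size=(16, 1))   # hidden -> output weights
lr = 0.05

X = rng.normal(size=(32, 4))               # toy inputs
y = (X[:, :1] > 0).astype(float)           # toy targets in {0, 1}

for step in range(500):
    # Forward pass: compute the network's current answer.
    h = np.maximum(0.0, X @ W1)
    pred = 1.0 / (1.0 + np.exp(-(h @ W2)))       # sigmoid output
    # Backward pass: send the error back through the network to work
    # out how each connection should change.
    err = pred - y                               # error signal at the output
    grad_W2 = h.T @ err / len(X)
    grad_W1 = X.T @ ((err @ W2.T) * (h > 0)) / len(X)
    # Update the connection strengths a little in the right direction.
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

print(float(np.mean((pred > 0.5) == y)))         # accuracy on the toy data
```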

Here, the team took a first step toward solving the problem, using a theory from 1959 with the evocative name “Selfridge’s Pandemonium.” The theory describes how we continuously process visual information and has greatly influenced AI for image recognition and other areas.

Using ImageNet, a classic archive of millions of images for AI training, the team found that standard deep learning models gradually lose their plasticity when faced with thousands of consecutive tasks that are ridiculously easy for humans – for example, distinguishing cats from houses or stop signs from school buses.

By this measure, any drop in performance means the AI is gradually losing its ability to learn. On early tasks, the deep learning algorithms were accurate up to 88 percent of the time. But by task 2,000, they had lost plasticity and performance had dropped to near or below baseline.

The updated algorithm performed significantly better.

It still uses backpropagation, but with a twist. During learning, a small portion of the artificial neurons are reset in each cycle. To avoid disrupting the network as a whole, only the least-used artificial neurons are reinitialized. The upgrade enabled the algorithm to handle up to 5,000 different image recognition tasks with an accuracy of over 90 percent.
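The gist of the selective-reset step can be sketched in a few lines. The utility score and reset fraction below are simplified stand-ins for the measures used in the paper, and the function names (`training_cycle`, `selective_reset`) are mine.

```python
import numpy as np

# A minimal sketch of the selective-reset idea behind continual
# backpropagation: after each training cycle, re-initialize a tiny
# fraction of the least-used hidden neurons instead of the whole network.
rng = np.random.default_rng(0)
HIDDEN, RESET_FRACTION = 64, 0.01

W_in = rng.normal(scale=0.5, size=(10, HIDDEN))
W_out = rng.normal(scale=0.5, size=(HIDDEN, 1))
utility = np.zeros(HIDDEN)     # running estimate of how useful each unit is

def training_cycle(X):
    """One placeholder pass: compute activations and update utilities."""
    global utility
    h = np.maximum(0.0, X @ W_in)
    # ... ordinary backpropagation updates to W_in and W_out go here ...
    contribution = np.abs(h).mean(axis=0) * np.abs(W_out).ravel()
    utility = 0.99 * utility + 0.01 * contribution   # slow running average

def selective_reset():
    """Re-initialize the least-used neurons, leaving the rest untouched."""
    n_reset = max(1, int(RESET_FRACTION * HIDDEN))
    losers = np.argsort(utility)[:n_reset]           # lowest-utility units
    W_in[:, losers] = rng.normal(scale=0.5, size=(W_in.shape[0], n_reset))
    W_out[losers, :] = 0.0       # fresh units start with no influence
    utility[losers] = 0.0        # and a clean utility slate

for cycle in range(1000):
    training_cycle(rng.normal(size=(32, 10)))
    selective_reset()
```

Because only a handful of low-utility units are touched each cycle, the rest of the network keeps whatever it has already learned while fresh capacity is continually freed up for new tasks.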

In another proof of concept, the team used the algorithm to guide a simulated ant-like robot across different terrain and see how quickly it could learn and adapt using feedback.

With continual backpropagation, the simulated creature easily navigated a video game road with variable friction – like walking on sand, asphalt, and rocks. The robot controlled by the new algorithm kept going for at least 50 million steps. Robots using standard algorithms failed much sooner, with performance dropping to zero roughly 30 percent earlier.

The study is the latest to address the plasticity problem of deep learning.

A previous study found that so-called dormant neurons – those that no longer respond to signals from their network – make AI more rigid, and that reconfiguring them during training improves performance. But that’s not the whole story, Lyle and Pascanu wrote. Networks that can no longer learn may also suffer from interactions that destabilize the way the AI learns. Scientists are still only scratching the surface of the phenomenon.

In practice, the idea is to “keep up with the times” when it comes to AI, says Dohare. Continuous learning doesn’t just mean distinguishing cats from houses. It could also help self-driving cars better navigate new roads in changing weather or lighting conditions – especially in regions with microenvironments where fog can quickly turn into bright sunlight.

Solving the problem “presents an exciting opportunity” that could lead to AI that retains old knowledge while learning new information and adapts flexibly to an ever-changing world like humans do. “These capabilities are critical to developing truly adaptive AI systems that can continue to train indefinitely, respond to changes in the world, and learn new skills and abilities,” Lyle and Pascanu wrote.

Photo credit: Jaredd Craig / Unsplash
