
AI Agents that “Self-Reflect” Perform Better in Changing Environments - Stanford HAI
Who would you pick to win in a head-to-head competition: a state-of-the-art AI agent or a mouse? Isaac Kauvar, a Wu Tsai Neurosciences Institute interdisciplinary postdoctoral scholar, and Chris Doyle, a machine learning researcher at Stanford, decided to pit them against each other to find out. Working in the lab of Nick Haber, an assistant professor in the Stanford Graduate School of Education, Kauvar and Doyle designed a simple task based on their longtime interest in a skill set at which animals naturally excel: exploring and adapting to their surroundings.

Kauvar put a mouse in a small empty box and, similarly, put a simulated AI agent in an empty 3D virtual arena. Then he placed a red ball in both environments and watched to see which would be quicker to explore the new object. The mouse quickly approached the ball and repeatedly interacted with it over the next several minutes. The AI agent, however, didn't seem to notice it. "That wasn't expected," said Kauvar. "Already, we were realizing that even with a state-of-the-art algorithm, there were gaps in performance."

The scholars wondered: Could they use such seemingly simple animal behaviors as inspiration to improve AI systems? That question led Kauvar, Doyle, graduate student Linqi Zhou, and Haber to design a new training method called curious replay, which prompts AI agents to self-reflect on the most novel and interesting things they recently encountered. Adding curious replay was all it took for the AI agent to approach and engage with the red ball much faster. It also dramatically improved performance on Crafter, a benchmark game based on Minecraft. The results of this project, currently published on the preprint service arXiv, will be presented at the International Conference on Machine Learning on July 25.
Learning Through Curiosity

It may seem like curiosity offers only intellectual benefits, but it's crucial to survival, both for avoiding dangerous situations and for finding necessities like food and shelter. The red ball in the experiment could be leaking a deadly poison or covering a nourishing meal, and it would be difficult to find out which if we ignored it. That's why labs like Haber's have recently been adding a curiosity signal to drive the behavior of AI agents, in particular model-based deep reinforcement learning agents. This signal tells the agent to select the action that will lead to a more interesting outcome, like opening a door rather than disregarding it.

Read the full study: Curious Replay for Model-based Adaptation

But this time, the team used curiosity for AI in a new way: to help the agent learn about its world, not just make a decision. "Instead of choosing what to do, we want to choose what to think about, more or less — what experiences from our past do we want to learn from," said Kauvar. In other words, they wanted to encourage the AI agent to self-reflect, in a sense, on its most interesting or peculiar (and thus curiosity-related) experiences. That way, the agent may be prompted to interact with the object in different ways to learn more, which would deepen its understanding of the environment and perhaps encourage curiosity toward additional items, too.

To accomplish this kind of self-reflection, the researchers amended a common method used to train AI agents, called experience replay. Here, an agent stores memories of all its interactions and then replays some of them at random to learn from them again. The method was inspired by research on sleep: neuroscientists have found that a brain region called the hippocampus will "replay" events of the day (by reactivating certain neurons) to strengthen memories.
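Standard experience replay, as described above, can be sketched in a few lines. This is an illustrative toy, not the researchers' code; the class and method names are our own.

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores past transitions and replays them uniformly at random."""

    def __init__(self, capacity=10_000):
        # Oldest memories are discarded once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        # transition: (observation, action, reward, next_observation)
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling: every stored experience is equally likely
        # to be replayed, no matter how interesting or mundane it was.
        k = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), k)
```

The key property for this story is the `sample` method: because it draws uniformly, a rare, surprising event like a new red ball is replayed no more often than thousands of near-identical empty-room frames.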
In AI agents, experience replay has led to high performance in scenarios where the environment rarely changes and clear rewards are given for the right behaviors. But to succeed in a changing environment, the researchers reasoned, it would make more sense for AI agents to preferentially replay their most interesting experiences, like the appearance of a new red ball, rather than replay the empty virtual room over and over. They named their new method curious replay and found that it worked immediately. "Now, all of a sudden, the agent interacts with the ball much more quickly."