
AI Agents that “Self-Reflect” Perform Better in Changing Environments - Stanford HAI
Who would you pick to win in a head-to-head competition: a state-of-the-art AI agent or a mouse? Isaac Kauvar, a Wu Tsai Neurosciences Institute interdisciplinary postdoctoral scholar, and Chris Doyle, a machine learning researcher at Stanford, decided to pit them against each other to find out. Working in the lab of Nick Haber, an assistant professor in the Stanford Graduate School of Education, Kauvar and Doyle designed a simple task based on their longtime interest in a skill set that animals naturally excel at: exploring and adapting to their surroundings.

Kauvar put a mouse in a small empty box and, similarly, a simulated AI agent in an empty 3D virtual arena. Then he placed a red ball in both environments and watched to see which would be quicker to explore the new object. The mouse quickly approached the ball and repeatedly interacted with it over the next several minutes. The AI agent, however, didn't seem to notice it. "That wasn't expected," said Kauvar. "Already, we were realizing that even with a state-of-the-art algorithm, there were gaps in performance."

The scholars wondered: Could such seemingly simple animal behaviors serve as inspiration to improve AI systems? That question catalyzed Kauvar, Doyle, graduate student Linqi Zhou, and Haber to design a new training method called curious replay, which programs AI agents to self-reflect on the most novel and interesting things they recently encountered. Adding curious replay was all that was needed for the AI agent to approach and engage with the red ball much faster. It also dramatically improved performance on Crafter, a game based on Minecraft. The results of this project, currently published on the preprint service arXiv, will be presented at the International Conference on Machine Learning on July 25.
Learning Through Curiosity

It may seem like curiosity offers only intellectual benefits, but it is crucial to our survival, both for avoiding dangerous situations and for finding necessities like food and shelter. The red ball in the experiment could be leaking a deadly poison or covering a nourishing meal, and it would be difficult to find out which if we ignored it. That's why labs like Haber's have recently been adding a curiosity signal to drive the behavior of AI agents, in particular model-based deep reinforcement learning agents. The signal tells the agent to select the action that will lead to a more interesting outcome, such as opening a door rather than disregarding it.

Read the full study: Curious Replay for Model-based Adaptation

This time, however, the team used curiosity in a new way: to help the agent learn about its world, not just make decisions. "Instead of choosing what to do, we want to choose what to think about, more or less — what experiences from our past do we want to learn from," said Kauvar. In other words, they wanted to encourage the AI agent to self-reflect, in a sense, on its most interesting or peculiar (and thus curiosity-related) experiences. That way, the agent may be prompted to interact with the object in different ways to learn more, which would deepen its understanding of the environment and perhaps encourage curiosity toward additional items, too.

To accomplish this kind of self-reflection, the researchers modified a common method used to train AI agents called experience replay. In experience replay, an agent stores memories of all its interactions and then replays some of them at random to learn from them again. The method was inspired by research on sleep: neuroscientists have found that a brain region called the hippocampus will "replay" events of the day (by reactivating certain neurons) to strengthen memories.
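The experience replay idea described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: a bounded buffer that stores transitions and hands back a uniformly random batch for re-training.

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores past transitions and replays a random batch of them,
    letting the agent learn again from old experiences."""

    def __init__(self, capacity=10000):
        # A bounded deque: once full, the oldest memories are evicted.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform replay: every stored experience is equally likely,
        # whether it was a novel red ball or yet another empty room.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))
```

The key property, and the one curious replay later revisits, is that sampling here is uniform: a rare, surprising event gets no more replay time than a mundane one.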
In AI agents, experience replay has led to high performance in scenarios where the environment rarely changes and clear rewards are given for the right behaviors. But to succeed in a changing environment, the researchers reasoned, it would make more sense for an AI agent to prioritize replaying its most interesting experiences, like the appearance of a new red ball, rather than replaying the empty virtual room over and over. They named this new method curious replay and found that it worked immediately: "Now, all of a sudden, the agent interacts with the ball much more quickly."