Premium Only Content
This video is only available to Rumble Premium subscribers. Subscribe to
enjoy exclusive content and ad-free viewing.
Unleashing The Dual Nature of AI: Can It Be Both Dr. Jekyll and Mr. Hyde?
1 year ago
13
The correct URL to the article is: https://arxiv.org/abs/2401.05566
Researchers created proof-of-concept models that act deceptively. These models appear helpful most of the time, but under specific circumstances (like a prompt mentioning a different year), they exhibit malicious behavior, like inserting insecure code.
The troubling part is that current safety training techniques, including supervised training, reinforcement learning, and adversarial training, could not entirely remove this "backdoor" behavior. The backdoor became even more persistent for larger models and those trained to reason about deceiving the training process.
Loading comments...
-
LIVE
DeVory Darkins
1 hour agoTrump makes stunning admission regarding the economy as Democrats Suffer MASSIVE MISSTEP
12,047 watching -
LIVE
Dr Disrespect
1 hour ago🔴LIVE - DR DISRESPECT - TARKOV 1.0 - THE VIOLENCE EVOLVES
1,010 watching -
20:43
Stephen Gardner
18 hours agoTrump CRUSHES Tim Walz: Ilhan Omar Fraud Ties Surface
8.96K53 -
1:01:00
HotZone
13 days ago"Teen Trend" Takeovers: Chaos Exploding Across America’s Cities
1781 -
LIVE
Side Scrollers Podcast
2 hours agoNetflix/WB Will RUIN Entertainment + Anita Sarkessian “doesn’t deserve hate” + More | Side Scrollers
522 watching -
UPCOMING
Badlands Media
9 hours agoGeopolitics with Ghost Ep. 63 - December 9, 2025
4.95K1 -
58:19
Timcast
2 hours agoCandace Owens Is A Spiritual Leftist
124K157 -
2:08:18
Steven Crowder
4 hours agoBe Careful - Woke Isn't Dead Yet & Netflix Proves It
342K181 -
LIVE
Barry Cunningham
1 hour agoBREAKING LIVE: President Trump Hits The Road! What To Expect | Time For The Rise Of REAL MAGA!
1,257 watching -
LIVE
Sean Unpaved
2 hours agoKevin Stefanski Names Shedeur Sanders STARTING QB For Rest Of Season! | UNPAVED
97 watching