Premium Only Content
This video is only available to Rumble Premium subscribers. Subscribe to
enjoy exclusive content and ad-free viewing.
Unleashing The Dual Nature of AI: Can It Be Both Dr. Jekyll and Mr. Hyde?
1 year ago
13
The correct URL to the article is: https://arxiv.org/abs/2401.05566
Researchers created proof-of-concept models that act deceptively. These models appear helpful most of the time, but under specific circumstances (like a prompt mentioning a different year), they exhibit malicious behavior, like inserting insecure code.
The troubling part is that current safety training techniques, including supervised training, reinforcement learning, and adversarial training, could not entirely remove this "backdoor" behavior. The backdoor became even more persistent for larger models and those trained to reason about deceiving the training process.
Loading comments...
-
LIVE
Candace Owens
51 minutes agoErika And I Sat Down. Here’s What Happened. | Candace Ep 280
15,567 watching -
LIVE
TheCrucible
1 hour agoThe Extravaganza! EP: 75 (12/16/25)
4,215 watching -
LIVE
Kim Iversen
56 minutes agoTurtle Island Terror: A Narrative That Serves Israel
1,167 watching -
LIVE
Redacted News
1 hour agoGet Ready! Something Big is Coming and They're Putting all The Pieces in Place | Redacted News
7,200 watching -
LIVE
Red Pill News
1 hour agoFBI & DOJ Coverup of Clinton Crimes Exposed In Detail on Red Pill News Live
2,999 watching -
LIVE
Robert Gouveia
1 hour agoTrump ILLEGALLY RAIDED!! Judge Dugan Trial! Shame on Tim Walz!
851 watching -
34:04
Stephen Gardner
3 hours ago🔥Democrats SUFFER 2 DEVASTATING Losses to Trump TODAY!
16.8K31 -
1:01:24
vivafrei
3 hours agoRob Reiner Murder BREAKING: Will Son Raise "The Menendez Defense"? Ilhan Omar in BIG TROUBLE & MORE!
110K55 -
22:45
Jasmin Laine
3 hours agoCTV Tries to Trap Poilievre—Carney HUMILIATED as Trump Reality Destroys Months of Spin
10.3K11 -
LIVE
LFA TV
19 hours agoLIVE & BREAKING NEWS! | TUESDAY 12/16/25
1,220 watching