Premium Only Content
This video is only available to Rumble Premium subscribers. Subscribe to
enjoy exclusive content and ad-free viewing.
Unleashing The Dual Nature of AI: Can It Be Both Dr. Jekyll and Mr. Hyde?
1 year ago
13
The correct URL to the article is: https://arxiv.org/abs/2401.05566
Researchers created proof-of-concept models that act deceptively. These models appear helpful most of the time, but under specific circumstances (like a prompt mentioning a different year), they exhibit malicious behavior, like inserting insecure code.
The troubling part is that current safety training techniques, including supervised training, reinforcement learning, and adversarial training, could not entirely remove this "backdoor" behavior. The backdoor became even more persistent for larger models and those trained to reason about deceiving the training process.
Loading comments...
-
1:37:13
FreshandFit
9 hours ago74 Year Old Wonders Why She's Still Single
240K11 -
2:08:09
Inverted World Live
8 hours agoThe Titanic, The Gold Standard, and Jekyll Island | Ep. 129
84.8K11 -
2:56:44
TimcastIRL
8 hours agoNBA Games RIGGED, 34 Indictments, Democrat Calls It TRUMP'S REVENGE | Timcast IRL
246K118 -
2:54:13
Laura Loomer
7 hours agoEP152: Texas Man Arrested For Threatening To Kill Laura Loomer
38.9K29 -
1:34:02
Man in America
11 hours agoEXPOSED: What the Vatican, CIA, & Elites Are HIDING About True Human Potential
60.5K30 -
3:18:12
Barry Cunningham
8 hours agoJOIN US FOR MOVIE NIGHT! TONIGHT WE FEATURE THE MOVIE RFK LEGACY!
63.2K29 -
1:13:42
Sarah Westall
8 hours agoHow Bitcoin was Hijacked, Palantir is a Deep State Upgrade & more w/ Aaron Day
43.5K9 -
15:59
ArynneWexler
11 hours agoAll The Reasons You're Right to Fear Zohran Mamdani | NN6
26.5K7 -
LIVE
Side Scrollers Podcast
15 hours ago🔴FIRST EVER RUMBLE SUB-A-THON🔴DAY 4🔴BLABS VS STREET FIGHTER!
1,719 watching -
2:52:41
DLDAfterDark
7 hours ago $6.64 earnedGlock's Decision - How Could It Impact The Industry?
34.7K5