Name: I Was Not Expecting This! 120 BILLION Params, 120 Tokens PER SECOND (feat llama.cpp)
Uploaded: 2025-08-08T05:52:52+00:00
Duration: 6 min 39 s


Technology gpt-oss gpt-oss:120b gpt-oss:20b llama.cpp llama-server 5090 nvidia cuda AI neovim

These speeds alone open the door to so many cool things! Not to mention, gpt-oss:20b is a great model, but why not run its bigger brother at comparable speeds, at least some of the time?

Dual 5090s:
gpt-oss:120b @ 120 tokens/second
gpt-oss:20b @ 270 tokens/second
