Tree of Thoughts: Deliberate Problem Solving with Large Language Models (Full Paper Review)
#gpt4 #ai #prompt
Tree-of-Thought improves prompting of large language models (LLMs) by generalizing Chain-of-Thought prompting and introducing a tree search across language model thoughts, including state evaluation and backtracking. Experiments on toy tasks show large improvements over both classic and Chain-of-Thought prompting.
OUTLINE:
0:00 - Introduction
1:20 - From Chain-of-Thought to Tree-of-Thought
11:10 - Formalizing the algorithm
16:00 - Game of 24 & Creative writing
18:30 - Crosswords
23:30 - Is this a general problem solver?
26:50 - Ablation studies
28:55 - Conclusion
Paper: https://arxiv.org/abs/2305.10601
Abstract:
Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving. ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. Our experiments show that ToT significantly enhances language models' problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords. For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only solved 4% of tasks, our method achieved a success rate of 74%. Code repo with all prompts: this https URL.
Authors: Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan
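The search loop described in the abstract can be sketched in a few lines for Game of 24. This is a minimal illustration, not the authors' code: `propose` and `evaluate` are hypothetical plain-Python stand-ins for the language model's thought generation and self-evaluation prompts, and the breadth-first search with a `beam_width` cutoff is just one simple way to explore and prune the tree.

```python
from fractions import Fraction
from itertools import combinations

TARGET = 24


def propose(numbers):
    """Thought generator (hypothetical stand-in for an LLM "propose" prompt).

    Yields (step text, remaining numbers) for every way of combining
    two of the remaining numbers with one arithmetic operation.
    """
    for i, j in combinations(range(len(numbers)), 2):
        a, b = numbers[i], numbers[j]
        rest = [n for k, n in enumerate(numbers) if k not in (i, j)]
        candidates = [
            (f"{a} + {b}", a + b),
            (f"{a} * {b}", a * b),
            (f"{a} - {b}", a - b),
            (f"{b} - {a}", b - a),
        ]
        if b != 0:
            candidates.append((f"{a} / {b}", a / b))
        if a != 0:
            candidates.append((f"{b} / {a}", b / a))
        for text, value in candidates:
            yield f"{text} = {value}", tuple(rest + [value])


def evaluate(numbers):
    """State evaluator (hypothetical stand-in for LLM self-evaluation).

    Final states score 1.0 only if they hit the target; intermediate
    states score higher when some number is already close to it.
    """
    if len(numbers) == 1:
        return 1.0 if numbers[0] == TARGET else 0.0
    return 1.0 / (1.0 + float(min(abs(n - TARGET) for n in numbers)))


def tree_of_thoughts_bfs(start, beam_width=5):
    """Breadth-first search over thoughts, keeping the best states per level."""
    frontier = [(tuple(Fraction(n) for n in start), [])]
    for _ in range(len(start) - 1):
        # Expand every kept state into all of its candidate next thoughts.
        children = [
            (nums, steps + [text])
            for state, steps in frontier
            for text, nums in propose(state)
        ]
        # Score the candidates and prune to the most promising ones.
        children.sort(key=lambda child: evaluate(child[0]), reverse=True)
        frontier = children[:beam_width]
    for nums, steps in frontier:
        if nums == (Fraction(TARGET),):
            return steps
    return None


if __name__ == "__main__":
    # A very wide beam makes the search exhaustive for four numbers.
    print(tree_of_thoughts_bfs((4, 9, 10, 13), beam_width=10_000))
```

With a narrow beam the toy heuristic may prune the branch that leads to a solution, which is exactly where the quality of the state evaluator matters.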
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/in/ykilcher
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n