ChatGPT O1 Explained

I reverse-engineered OpenAI's o1-preview model using o1-preview! I asked it to generate a full research paper with code, and I gave it dozens of related research papers from the past few years as context. It recreated a working version of the o1 model to the best of its ability. In this video, we'll go over all the details of the model, the code, and the research techniques that make the o1 model series state of the art across so many benchmarks. LET'S SPREAD THIS AI POWER. I can't wait to see what you think, enjoy!

Code & paper for this video:

Deploy your own AI trading bot (no code):

Want more AI/ML education? Connect with me here:
X:
Instagram:
Linkedin:
TikTok:
Facebook:

⏱️ **Chapters:**
0:00 – Introduction: Reproducing OpenAI's o1 Model Series
1:30 – Generating a Research Paper Using o1 Preview
2:30 – Overview of 'o1-nano': An Open Source, Explainable Model
3:30 – Understanding Chain-of-Thought Reasoning in o1 Models
4:30 – How Reinforcement Learning is Used in Training and Inference
5:30 – Exploring Reasoning Paths and Subtasks During Inference
6:30 – Unpacking OpenAI's Reasoning Tokens
7:30 – Overview of the Model Architecture
8:30 – Core Components: Transformer, Chain-of-Thought Module, Reasoning Token Generator
9:30 – Training the Model to Reason Better Using Reinforcement Learning
13:30 – Historical Papers Leading to o1: Chain-of-Thought and 'Let's Verify Step by Step'
15:30 – The New Scaling Law: Inference Time Scaling
16:30 – The Usage of Reinforcement Learning
17:30 – Demo of the Code: Running the Test
18:30 – Conclusion: Open Source Code and Research Paper as a Starting Point
19:00 – Closing Remarks and Encouragement to Explore the GitHub Repository

Don't forget to like, share, and subscribe for more deep dives into AI advancements!

I Built a Sports Betting Bot with ChatGPT:

I Built a Trading Bot with ChatGPT:

Watch ChatGPT Build an AI Startup:

Watch ChatGPT Build a Finance Startup:

Watch Me Build a Startup Playlist:

🔔 Subscribe and hit the notification bell to join the AI revolution!

Joe Lilli
 

  • @rahulvmp2050 says:

    Cool

  • @PharoahJardin says:

    Nice !

  • @mercymay42 says:

    Love your videos! 🥰

  • @alexiades says:

    Awesome man, great to see you back into AI.

  • @zacharybamberger6965 says:

    I’m sorry to be “that negative guy” in the comments, but some of your claims here are overstretched, and the concepts you’re throwing around are at a surface level. You made little reference to the importance of reward models in PPO and did not distinguish between per-step and global evaluation (a critical aspect of creating the tree structure you made reference to). There’s also no evidence that reasoning models require special tokens. Finally, the applicability of your method here is super constrained, whereas other MCTS-based methods with language models manage to generalize to non-math based tasks.

    You’ve produced excellent videos in the past, but this one unfortunately falls short

  • @sapandeepsandhu4410 says:

    back to track great

  • @competidor64 says:

    Thanks Siraj

  • @robbybobby6464 says:

    Havent seen or kept up with your channel in a long while but glad to see you’re still creating awesome and well-explained content!

  • @mootytootyfrooty says:

    Ha this is great, I was thinking a few days ago about what would happen if we used o1 to document itself and the paper chain, and you went and did it!

  • @swapnil6996 says:

    Cheater.

  • @arthur...barros says:

    mind blowing

  • @Stan_144 says:

    Siraj rocks ..
