
o1 – What is Going On? Why o1 is a 3rd Paradigm of Model + 10 Things You Might Not Know

o1 is different, and even sceptics are calling it a 'large reasoning model'. But why is it so different, and what does that say about the future? What happens when models are rewarded for the correctness of their answers, not just for harmlessness or for predicting the next word? And does even o1 lack spatial reasoning? How did the White House react yesterday? And did Ilya Sutskever warn of o1 getting … 'creative'?

AI Insiders – Now $9:

Chapters:
00:00 – Intro
01:04 – How o1 Works (The 3rd Paradigm)
03:10 – We Don’t Need Human Examples (OpenAI)
03:54 – How o1 Works (Temp 1 Graded)
06:28 – Is This Reasoning?
08:48 – Personal Announcement
11:27 – Hidden, Serial Thoughts?
13:11 – Memorized Reasoning?
15:40 – 10 Facts

o1, Learning to Reason –

Species Tweet:

Noam Brown Video:

2021 Paper on Verifiers:

Let’s Verify Step By Step:

DeepMind Not Far Behind:

Chain of Thought for Serial Problems:

Q* Clues (yes, I am proud of that one):

Let’s Think Pixel by Pixel:

Or Dot by Dot (the general power of CoT):

RL by Karpathy:

Not Prompt Engineering:

ARC-AGI Analysis:

Reality Foundation models:

Memorising Reasoning vs CoT Q-Table:

When You Know the Right CoT:

Original Information Report:

Stockfish (chess):

Will AGI fall like chess:

Fei-Fei Li Start-up $1B:

White House Report:

Simple-Bench:

o1 Fails:

My New Coursera Course! The 8 Most Controversial Terms in AI:

Non-hype Newsletter:

GenAI Hourly Consulting:

I use Descript to edit my videos:

Many people expense AI Insiders for work. Feel free to use this template:

AI Insiders – Now $9:


  • @Asmodeus.q says:

    Here for the singularity

  • @bolermanii says:

    babe wake up, the best AI channel posted again

  • @Citrusautomaton says:

    Some people are honestly just impatient. They think that because we haven’t got AGI now, it’s all over and we aren’t ever gonna get it. Have patience, folks, because I get the feeling that the next year will really wow you all.

  • @ackmonra says:

    The concept of ‘dreaming up’ some really interesting and creative solution (high temp) followed by harshly cutting those down to what is reasonable and realistic and plausible (low temp) feels very left brain / right brain in humans. Interesting.
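
The “dream up at high temperature, prune harshly” picture in the comment above matches the generate-and-grade loop described in the video (the “Temp 1 Graded” chapter). A minimal best-of-n sketch in Python; the generate and grade callables are hypothetical stand-ins for a model and a verifier, not anything from o1’s actual pipeline:

```python
import math
import random

def sample_with_temperature(logits, temperature):
    """Softmax sampling. Higher temperature flattens the distribution
    (more 'dreaming up'); lower temperature sharpens it (harsher pruning)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return random.choices(range(len(logits)), weights=weights)[0]

def best_of_n(generate, grade, n=16):
    """Dream up n candidates at temperature 1, then let a strict
    grader keep only the best one."""
    candidates = [generate(temperature=1.0) for _ in range(n)]
    return max(candidates, key=grade)

# Toy demo: the 'model' proposes noisy guesses at a target number,
# and the 'grader' scores closeness; best-of-n picks the winner.
target = 42
pick = best_of_n(lambda temperature: random.gauss(target, 10 * temperature),
                 grade=lambda x: -abs(x - target))
print(pick)
```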

  • @prole1917 says:

    The key to getting AI to go beyond being a very good librarian pulling up existing information, to actually coming up with novel ideas and innovating, is going to be enabling agentic behavior and autonomous experimentation.

  • @Neomadra says:

    FYI, Claude gets the block stacking problem right, too.

    Let’s think through this step-by-step:

    1. Initial state:
    C
    A

    B (on the table)

    2. We need to get to this state:
    A
    B
    C

    3. The key constraint is that we cannot move block C.

    4. Given these conditions, it’s actually impossible to achieve the desired stack without moving block C. Here’s why:

    – Block C must remain at the bottom of the stack since we can’t move it.
    – Block B needs to be placed on top of C.
    – Block A needs to end up on top of B.

    5. However, block A is currently under C, and we’re not allowed to move C to free A.

    Therefore, it’s not possible to arrange the blocks in the requested order (A on B on C) without moving block C. The constraint of not moving C prevents any solution to this problem.

    • @mathufnn says:

      Man, Claude seems to have really good natural reasoning. Makes me curious about how it will fare with chain of thought learning

    • @panzerofthelake4460 says:

      @@mathufnn Claude has a stronger “intuition”, so to speak.

    • @VeganCheeseburger says:

      Yeah, I just got the right answer too with 3.5 Sonnet, but it took a couple of tries of telling it it was wrong.

    • @Neomadra says:

      @@VeganCheeseburger For me it got it right first time without any help.

    • @ain92ru says:

      I’m pretty sure Subbarao Kambhampati, Noam Brown, Philip and all the LLMs are wrong. There is indeed a way to rearrange the blocks if you don’t just follow memorized reasoning and instead use your human superpower of lateral thinking: fix block C in place with an external tool attached to the floor/ceiling/wall/whatever, shorten all the legs of the table by the height of block B, and fit the latter back into the stack. Easy-peasy!
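
The impossibility argument in @Neomadra’s comment above can be checked mechanically. A minimal sketch, assuming standard blocks-world rules (move one block at a time, only from the top of a stack): a breadth-first search over every state reachable without ever moving C shows the goal stack is unreachable.

```python
from collections import deque

START = (("A", "C"), ("B",))  # bottom-first: C sits on A; B is alone on the table
GOAL = (("C", "B", "A"),)     # A on B on C

def normalize(stacks):
    """Canonical form so equivalent states compare equal."""
    return tuple(sorted(tuple(s) for s in stacks))

def successors(state):
    """Every state reachable in one legal move: lift the top block of a
    stack (never C) and place it on the table or on another stack."""
    for i, src in enumerate(state):
        block = src[-1]
        if block == "C":                       # the puzzle's constraint
            continue
        rest = [list(s) for j, s in enumerate(state) if j != i]
        if len(src) > 1:
            rest.append(list(src[:-1]))        # what remains of the source stack
            yield normalize(rest + [[block]])  # move the block to the table
        for k in range(len(rest)):             # or onto another stack
            new = [list(s) for s in rest]
            new[k].append(block)
            yield normalize(new)

def reachable(start, goal):
    start, goal = normalize(start), normalize(goal)
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            return True
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

print(reachable(START, GOAL))  # False: no move sequence works without moving C
```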

  • @joshuaspeckman7075 says:

    Note that the figure at 6:10 has a log scale on the x-axis. This is logarithmic growth in pass@1 accuracy, not linear growth.

    • @sickandtired6156 says:

      Exactly was about to comment

    • @panzerofthelake4460 says:

      So it’s diminishing returns, huh?

    • @user93237 says:

      This is to be expected, because growth in accuracy becomes exponentially harder at higher accuracy: for each item there is some chance the model doesn’t know the correct answer, so getting them all right is a conjunction of many events each with probability < 1, whose product tends toward zero. Accuracy, hence, does not measure capabilities very well in the lower and upper ranges. I think you would actually have to transform the y-axis by an inverse sigmoid to get something more intuitive. However, accuracy can of course still be a misleading measure, due to things like data leakage, overfitting, ceiling effects, and lacking generalization to other datasets.

    • @mambaASI says:

      @@panzerofthelake4460 not necessarily. The spacing of the points along the log-scale compute axis isn’t consistent as compute increases, so the next plot point could end up significantly higher than the previous one, given more time for compute. We won’t know until it’s executed.

    • @TobiasWeg says:

      I removed my comment making the same point; it’s a good point and you are right. OpenAI’s sentence below the graph is correct, but misleading.
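
A toy illustration of @user93237’s point, with made-up numbers: overall success on a multi-step task is a conjunction of per-step successes, so raw accuracy decays multiplicatively and compresses near 0 and 1, while the inverse-sigmoid (logit) transform spreads it back onto an unbounded scale.

```python
import math

def logit(p):
    """Inverse sigmoid: maps accuracy in (0, 1) onto an unbounded scale
    where equal increments are comparably 'hard'."""
    return math.log(p / (1 - p))

per_step = 0.95  # hypothetical chance of getting any single step right
for steps in (1, 5, 10, 20, 50):
    acc = per_step ** steps  # conjunction: all steps must succeed
    print(f"{steps:>2} steps: accuracy = {acc:.3f}, logit = {logit(acc):+.2f}")
```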

  • @Pizzarrow says:

    Thank you for reducing the price! I’m one of those people who has been holding out for this to happen. I even browsed through your Patreon a couple of days ago and wished I could sign up.

  • @Gen-XJohnny says:

    o1 helped me code something in my Unreal Engine project that was previously impossible. Now I can create anything and run it in my game—I’m blown away.

  • @TheElkadeoWay says:

    Imagine putting on a blindfold and listening to someone describe their kitchen to you in great detail. With enough detail you could probably figure out how to navigate their kitchen with your blindfold still on … but you’d probably make some weird mistakes as well — this is the state of AI.
    “Hallucinating” in AI is like having been told during training that “kids’ drawings are often posted to the fridge”, so you reach up to look for a drawing there but don’t find one. Your training told you to expect it, but you experience a mismatch between training and reality.

  • @cdmichaelb says:

    In o1-preview I asked it to decode gibberish… it spent 127 seconds trying. In its output it didn’t say “I don’t know” flat out, but it didn’t give a wrong answer either. Instead, it explained the information it had used.

    • @CellarDoorCS says:

      I used Yann LeCun’s current LLM bench question – walking along the sphere. It failed.

    • @r-saint says:

      @@CellarDoorCS Why are people so desperate to beat all the benchmarks? While some remain unbeaten, we’re still not in the Terminator timeline 😀 You should be happy.

    • @ThreeChe says:

      @@r-saint No Terminators until we get integrated spatial reasoning. Although $1 billion raised in 4 months for that startup gives me a feeling it’ll come sooner rather than later.

    • @StylishHobo says:

      No, it explained the information it likely could have used. It has no idea what it actually did. It’s all just mimicry.

    • @alansmithee419 says:

      @@CellarDoorCS
      I have so far not been happy with the fact that pretty much everyone using that test seems to be completely unaware of how straight lines work on spheres. That said, I also haven’t found an AI model that gives a good answer. But the problem isn’t limited to AI models: most humans can’t answer it either, or more often answer it confidently but incorrectly. So I wouldn’t use a model’s failure to answer it in an attempt to demonstrate that it’s any lesser than humans.

      To make this as quick as I can (which, it’s me, so that’s probably not gonna be very quick):

      A straight line on a sphere is not a line of latitude. People often read the question as meaning you walk some way south around the earth’s curvature, and then, because you turn to face east (for example) and start walking in a straight line, they assume you will continue to travel exactly east along a line of latitude, a constant distance from the pole. That is not the case. To see why, imagine walking only one meter away from the north pole in a physical space (literally imagine yourself standing at the north pole and walking 1 meter away). Then turn 90 degrees left and walk in a straight line. The assumption made in most people’s attempts implies that you will now walk in a 1 meter radius circle around the north pole (a line of latitude at very nearly 90 degrees north). That is clearly not a straight line as the question requires, since you would be actively turning the entire time. Unless you walk all the way to the equator, you cannot think of the problem like this.

      Straight lines on spheres are called great circles. On earth, they loop all the way around the planet in a circle with the core of the earth at their centre. All of them have an approximate length of 40,000 km and a radius roughly equal to the standard radius of the earth (varying slightly depending on your starting distance from the equator and direction).
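
A quick sanity check of the ~40,000 km figure, assuming a perfectly spherical Earth at the conventional mean radius (on a sphere, every great circle has the same circumference; the slight variation mentioned above comes from the real Earth’s oblateness):

```python
import math

EARTH_MEAN_RADIUS_KM = 6371.0  # idealized sphere; the real Earth is slightly oblate

# On a perfect sphere every great circle has the same circumference.
circumference = 2 * math.pi * EARTH_MEAN_RADIUS_KM
print(f"Great-circle circumference: {circumference:,.0f} km")  # ≈ 40,030 km
```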

  • @Walczyk says:

    I used o1-preview on graduate physics and math problems and it was stunning. So stunning. I have chills. It spent 143 seconds.

  • @fletchermclaughlin8971 says:

    You are the best, most informative AI channel by far. I remember your Q* video that predicted many of the developments that took place with o1, before anyone else! You don’t merely consolidate AI news like so many other channels; you thoroughly investigate the latest research, all so the general public can understand, in simple terms, what is going on at the frontier of AI.

    The price you are charging is more than fair, and we appreciate your content very much. Thank you.

  • @Loris-- says:

    When a channel is so much in a league of its own that the only guest who can match the quality is your past self.

  • @timwang4659 says:

    Gary Marcus thinks this is just clever brute force, “pushing the limits of a dead-end approach”, but does it matter when this approach gets better over time and hasn’t shown signs of stopping?

    • @ThreeChe says:

      Like Altman said, if you strap a rocket to a stochastic parrot you can still reach the moon.

    • @timwang4659 says:

      @@ThreeChe Who knows, maybe this dumpster rocket will be creative enough to generate a more efficient approach to AGI for us? Totally possible 🤷🏻

    • @Umarudon says:

      @@timwang4659 Exactly! lmao

    • @mambaASI says:

      @@timwang4659 I can see it being used to do nonstop AI research autonomously; it’s definitely possible it comes up with novel algos, architectures, etc. that are more efficient. But I also don’t see why this current approach can’t produce an AGI model (likely a mixture of experts). Performance improvements with compute scaling are not slowing down. The money is there to continue scaling. Maybe energy constraints will get in the way before it happens, maybe not.

    • @HAL-zl1lg says:

      I mean, human intelligence isn’t merely brute force but I think it’s clear there is an aspect of brute force to it. The advancement of many fields seems to be contingent on trialing different things until something sticks.

  • @VeganCheeseburger says:

    AI grifters and clickbaiters: watch this video, study it, and improve your game. This is what AI content on YouTube should look like.

  • @burninator9000 says:

    If Ilya ever gets tired of making AI, he could get a role in movies giving super subtle but incredibly ominous and convincing predictions about various future scenarios. “Unexpected creativity that makes the antics of Sydney look very modest.” 😮

  • @K.F-R says:

    Saying “I don’t know” is a challenge for most humans, too.

  • @NikiDrozdowski says:

    Now I have the very strong feeling that “RL is creative” combined with “effective CoT is not English anymore” is actually what Ilya saw.

  • @mariokotlar303 says:

    As one of the people who complained about the price for AI Insiders previously being too high, I now stand by my word and have subbed for a year!
