
o1 – What is Going On? Why o1 is a 3rd Paradigm of Model + 10 Things You Might Not Know

o1 is different, and even sceptics are calling it a 'large reasoning model'. But why is it so different, and what does that say about the future? What happens when models are rewarded for the correctness of their answers, not just for harmlessness or for predicting the next word? And does even o1 lack spatial reasoning? How did the White House react yesterday? And did Ilya Sutskever warn of o1 getting … 'creative'?

AI Insiders – Now $9:

Chapters:
00:00 – Intro
01:04 – How o1 Works (The 3rd Paradigm)
03:10 – We Don’t Need Human Examples (OpenAI)
03:54 – How o1 Works (Temp 1 Graded)
06:28 – Is This Reasoning?
08:48 – Personal Announcement
11:27 – Hidden, Serial Thoughts?
13:11 – Memorized Reasoning?
15:40 – 10 Facts

o1, Learning to Reason –

Species Tweet:

Noam Brown Video:

2021 Paper on Verifiers:

Let’s Verify Step By Step:

DeepMind Not Far Behind:

Chain of Thought for Serial Problems:

Q* Clues (yes, I am proud of that one):

Let’s Think Pixel by Pixel:

Or Dot by Dot (the general power of CoT):

RL by Karpathy:

Not Prompt Engineering:

ARC-AGI Analysis:

Reality Foundation models:

Memorising Reasoning vs CoT Q-Table:

When You Know the Right CoT:

Original Information Report:

Stockfish (chess):

Will AGI fall like chess:

Fei-Fei Li Start-up $1B:

White House Report:

Simple-Bench:

o1 Fails:

My New Coursera Course! The 8 Most Controversial Terms in AI:

Non-hype Newsletter:

GenAI Hourly Consulting:

I use Descript to edit my videos:

Many people expense AI Insiders for work. Feel free to use this template:

AI Insiders – Now $9:


  • @Asmodeus.q says:

    Here for the singularity

  • @bolermanii says:

    babe wake up, the best AI channel posted again

  • @Citrusautomaton says:

    Some people are honestly just impatient. They think that because we haven’t got AGI now, it’s all over and we aren’t ever gonna get it. Have patience, folks, because I get the feeling that the next year will really wow you all.

  • @ackmonra says:

    The concept of ‘dreaming up’ some really interesting and creative solution (high temp) followed by harshly cutting those down to what is reasonable and realistic and plausible (low temp) feels very left brain / right brain in humans. Interesting.
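
The “dream up at high temperature, prune harshly” picture in the comment above matches the generate-and-grade loop described in the video (the “Temp 1 Graded” chapter). A minimal best-of-n sketch in Python; the generate and grade callables are hypothetical stand-ins for a model and a verifier, not anything from o1’s actual pipeline:

```python
import math
import random

def sample_with_temperature(logits, temperature):
    """Softmax sampling. Higher temperature flattens the distribution
    (more 'dreaming up'); lower temperature sharpens it (harsher pruning)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return random.choices(range(len(logits)), weights=weights)[0]

def best_of_n(generate, grade, n=16):
    """Dream up n candidates at temperature 1, then let a strict
    grader keep only the best one."""
    candidates = [generate(temperature=1.0) for _ in range(n)]
    return max(candidates, key=grade)

# Toy demo: the 'model' proposes noisy guesses at a target number,
# and the 'grader' scores closeness; best-of-n picks the winner.
target = 42
pick = best_of_n(lambda temperature: random.gauss(target, 10 * temperature),
                 grade=lambda x: -abs(x - target))
print(pick)
```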

  • @prole1917 says:

    The key to getting AI to go beyond being a very good librarian pulling up existing information, to actually coming up with novel ideas and innovating, is going to be enabling agentic behavior and autonomous experimentation.

  • @Neomadra says:

    FYI, Claude gets the block stacking problem right, too.

    Let’s think through this step-by-step:

    1. Initial state:
    C
    A

    B (on the table)

    2. We need to get to this state:
    A
    B
    C

    3. The key constraint is that we cannot move block C.

    4. Given these conditions, it’s actually impossible to achieve the desired stack without moving block C. Here’s why:

    – Block C must remain at the bottom of the stack since we can’t move it.
    – Block B needs to be placed on top of C.
    – Block A needs to end up on top of B.

    5. However, block A is currently under C, and we’re not allowed to move C to free A.

    Therefore, it’s not possible to arrange the blocks in the requested order (A on B on C) without moving block C. The constraint of not moving C prevents any solution to this problem.

    • @mathufnn says:

      Man, Claude seems to have really good natural reasoning. Makes me curious about how it will fare with chain of thought learning

    • @panzerofthelake4460 says:

      @@mathufnn Claude has a stronger “intuition”, so to speak.

    • @VeganCheeseburger says:

      Yeah, I just got the right answer too with 3.5 Sonnet, but it took a couple of tries of telling it it was wrong.

    • @Neomadra says:

      @@VeganCheeseburger For me it got it right first time without any help.

    • @ain92ru says:

      I’m pretty sure Subbarao Kambhampati, Noam Brown, Philip and all the LLMs are wrong. There is indeed a way to rearrange the blocks if you don’t just follow memorized reasoning and instead use your human superpower of lateral thinking: fix block C in place with an external tool attached to the floor/ceiling/wall/whatever, shorten all the legs of the table by the height of block B, and fit the latter back into the stack. Easy-peasy!
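
The impossibility argument in @Neomadra’s comment above can be checked mechanically. A minimal sketch, assuming standard blocks-world rules (move one block at a time, only from the top of a stack): a breadth-first search over every state reachable without ever moving C shows the goal stack is unreachable.

```python
from collections import deque

START = (("A", "C"), ("B",))  # bottom-first: C sits on A; B is alone on the table
GOAL = (("C", "B", "A"),)     # A on B on C

def normalize(stacks):
    """Canonical form so equivalent states compare equal."""
    return tuple(sorted(tuple(s) for s in stacks))

def successors(state):
    """Every state reachable in one legal move: lift the top block of a
    stack (never C) and place it on the table or on another stack."""
    for i, src in enumerate(state):
        block = src[-1]
        if block == "C":                       # the puzzle's constraint
            continue
        rest = [list(s) for j, s in enumerate(state) if j != i]
        if len(src) > 1:
            rest.append(list(src[:-1]))        # what remains of the source stack
            yield normalize(rest + [[block]])  # move the block to the table
        for k in range(len(rest)):             # or onto another stack
            new = [list(s) for s in rest]
            new[k].append(block)
            yield normalize(new)

def reachable(start, goal):
    start, goal = normalize(start), normalize(goal)
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            return True
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

print(reachable(START, GOAL))  # False: no move sequence works without moving C
```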

  • @joshuaspeckman7075 says:

    Note that the figure at 6:10 has a log scale on the x-axis. This is logarithmic growth in pass@1 accuracy, not linear growth.

    • @sickandtired6156 says:

      Exactly was about to comment

    • @panzerofthelake4460 says:

      So it’s diminishing returns, huh?

    • @user93237 says:

      This is to be expected, because growth in accuracy becomes exponentially harder at higher accuracy: for each item there is some chance the model doesn’t know the correct answer, so getting them all right is a conjunction of many events each with probability < 1, whose product tends toward zero. Accuracy, hence, does not measure capabilities very well in the lower and upper ranges. I think you would actually have to transform the y-axis by an inverse sigmoid to get something more intuitive. However, accuracy can of course still be a misleading measure, due to things like data leakage, overfitting, ceiling effects, and lacking generalization to other datasets.

    • @mambaASI says:

      @@panzerofthelake4460 not necessarily. The spacing of the points along the log-scale compute axis isn’t consistent as compute increases, so the next plot point could end up significantly higher than the previous one, given more time for compute. We won’t know until it’s executed.

    • @TobiasWeg says:

      I removed my comment making the same point; it’s a good point and you are right. OpenAI’s sentence below the graph is correct, but misleading.
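
A toy illustration of @user93237’s point, with made-up numbers: overall success on a multi-step task is a conjunction of per-step successes, so raw accuracy decays multiplicatively and compresses near 0 and 1, while the inverse-sigmoid (logit) transform spreads it back onto an unbounded scale.

```python
import math

def logit(p):
    """Inverse sigmoid: maps accuracy in (0, 1) onto an unbounded scale
    where equal increments are comparably 'hard'."""
    return math.log(p / (1 - p))

per_step = 0.95  # hypothetical chance of getting any single step right
for steps in (1, 5, 10, 20, 50):
    acc = per_step ** steps  # conjunction: all steps must succeed
    print(f"{steps:>2} steps: accuracy = {acc:.3f}, logit = {logit(acc):+.2f}")
```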

  • @Pizzarrow says:

    Thank you for reducing the price! I’m one of those people who has been holding out for this to happen. I even browsed through your Patreon a couple of days ago and wished I could sign up.

  • @Gen-XJohnny says:

    o1 helped me code something in my Unreal Engine project that was previously impossible. Now I can create anything and run it in my game—I’m blown away.

  • @TheElkadeoWay says:

    Imagine putting on a blindfold and listening to someone describe their kitchen to you in great detail. With enough detail you could probably figure out how to navigate their kitchen with your blindfold still on … but you’d probably make some weird mistakes as well — this is the state of AI.
    “Hallucinating” in AI is like having been told during training that “kids’ drawings are often posted to the fridge”, so you reach up to look for a drawing there but don’t find one. Your training told you to expect it, but you experience a mismatch between training and reality.

  • @cdmichaelb says:

    In o1-preview I asked it to decode gibberish… it spent 127 seconds trying. In its output it didn’t say “I don’t know” flat out, but it didn’t give a wrong answer either. Instead, it explained the information it had used.

    • @CellarDoorCS says:

      I used Yann LeCun’s current LLM bench question – walking along the sphere. It failed.

    • @r-saint says:

      @@CellarDoorCS Why are people so desperate to beat all the benchmarks? While some remain unbeaten, we’re still not in the Terminator timeline 😀 You should be happy.

    • @ThreeChe says:

      @@r-saint No Terminators until we get integrated spatial reasoning. Although $1 billion raised in 4 months for that startup gives me a feeling it’ll come sooner rather than later.

    • @StylishHobo says:

      No, it explained the information it likely could have used. It has no idea what it actually did. It’s all just mimicry.

    • @alansmithee419 says:

      @@CellarDoorCS
      I have so far not been happy with the fact that pretty much everyone using that test seems to be completely unaware of how straight lines work on spheres. That said, I also haven’t found an AI model that gives a good answer. But the problem isn’t limited to AI models: most humans can’t answer it either, or more often answer it confidently but incorrectly. So I wouldn’t use a model’s failure to answer it in an attempt to demonstrate that it’s any lesser than humans.

      To make this as quick as I can (which, it’s me, so that’s probably not gonna be very quick):

      A straight line on a sphere is not a line of latitude. People often read the question as meaning you walk some way south around the earth’s curvature, and then, because you turn to face east (for example) and start walking in a straight line, they assume you will continue to travel exactly east along a line of latitude, a constant distance from the pole. That is not the case. To see why, imagine walking only one meter away from the north pole in a physical space (literally imagine yourself standing at the north pole and walking 1 meter away). Then turn 90 degrees left and walk in a straight line. The assumption made in most people’s attempts implies that you will now walk in a 1 meter radius circle around the north pole (a line of latitude at very nearly 90 degrees north). That is clearly not a straight line as the question requires, since you would be actively turning the entire time. Unless you walk all the way to the equator, you cannot think of the problem like this.

      Straight lines on spheres are called great circles. On earth, they loop all the way around the planet in a circle with the core of the earth at their centre. All of them have an approximate length of 40,000 km and a radius roughly equal to the standard radius of the earth (varying slightly depending on your starting distance from the equator and direction).
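
A quick sanity check of the ~40,000 km figure, assuming a perfectly spherical Earth at the conventional mean radius (on a sphere, every great circle has the same circumference; the slight variation mentioned above comes from the real Earth’s oblateness):

```python
import math

EARTH_MEAN_RADIUS_KM = 6371.0  # idealized sphere; the real Earth is slightly oblate

# On a perfect sphere every great circle has the same circumference.
circumference = 2 * math.pi * EARTH_MEAN_RADIUS_KM
print(f"Great-circle circumference: {circumference:,.0f} km")  # ≈ 40,030 km
```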

  • @Walczyk says:

    I used o1-preview on graduate physics and math problems and it was stunning. So stunning. I have chills. It spent 143 seconds.

  • @fletchermclaughlin8971 says:

    You are the best, most informative AI channel by far. I remember your Q* video that predicted many of the developments that took place with o1, before anyone else! You don’t merely consolidate AI news like so many other channels; you thoroughly investigate the latest research, all so the general public can understand, in simple terms, what is going on at the frontier of AI.

    The price you are charging is more than fair, and we appreciate your content very much. Thank you.

  • @Loris-- says:

    When a channel is so much in a league of its own that the only guest who can match the quality is your past self.

  • @timwang4659 says:

    Gary Marcus thinks this is just clever brute force, “pushing the limits of a dead-end approach”, but does it matter when this approach gets better over time and hasn’t shown signs of stopping?

    • @ThreeChe says:

      Like Altman said, if you strap a rocket to a stochastic parrot you can still reach the moon.

    • @timwang4659 says:

      @@ThreeChe Who knows, maybe this dumpster rocket will be creative enough to generate a more efficient approach to AGI for us? Totally possible 🤷🏻

    • @Umarudon says:

      @@timwang4659 Exactly! lmao

    • @mambaASI says:

      @@timwang4659 I can see it being used to do nonstop AI research autonomously; it’s definitely possible it comes up with novel algos, architectures, etc. that are more efficient. But I also don’t see why this current approach can’t produce an AGI model (likely a mixture of experts). Performance improvements with compute scaling are not slowing down. The money is there to continue scaling. Maybe energy constraints will get in the way before it happens, maybe not.

    • @HAL-zl1lg says:

      I mean, human intelligence isn’t merely brute force but I think it’s clear there is an aspect of brute force to it. The advancement of many fields seems to be contingent on trialing different things until something sticks.

  • @VeganCheeseburger says:

    AI grifters and clickbaiters: watch this video, study it, and improve your game. This is what AI content on YouTube should look like.

  • @burninator9000 says:

    If Ilya ever gets tired of making AI, he could get a role in movies giving super subtle but incredibly ominous and convincing predictions about various future scenarios. “Unexpected creativity that makes the antics of Sydney look very modest.” 😮

  • @K.F-R says:

    Saying “I don’t know” is a challenge for most humans, too.

  • @NikiDrozdowski says:

    Now I have the very strong feeling that “RL is creative” combined with “effective CoT is not English anymore” is actually what Ilya saw.

  • @mariokotlar303 says:

    As one of the people who complained about the price for AI Insiders previously being too high, I now stand by my word and have subbed for a year!
