GPT 4.5 – not so much wow

GPT 4.5 is here. Remember when AI lab CEOs like Sam Altman and Dario Amodei were betting everything on scaling up base models like this one? Well, let’s find out what would have happened if the future of AI rested on models like GPT 4.5. You’ll see all the benchmarks, highlights of the paper, emotional intelligence and humor tests, Simple Bench results (Reddit was an unreliable source), and why it’s not all bad news for OpenAI.

AI Insiders (now $9!):

Chapters
00:00 – Introduction
01:04 – Details and Benchmarks
03:04 – Emotional intelligence?
08:37 – Creative writing?
11:40 – Visual reasoning and Pricing
12:41 – Simple Performance
16:01 – End of Pretraining Scaling?
17:03 – CEO Hype
18:11 – System Card Highlights
23:32 – Karpathy Reaction

GPT 4.5 System card:
Release Notes:
Altman Hype:
Details:
End of an Era:
Anthropic Original Claim:
Smell:
Bob McGrew:
Deep Research System Card:
Reddit:
API Pricing:
LiveStream:

Karpathy Comparison:

Non-hype Newsletter:

Podcast:

@farhadkarimi says:

February 28, 2025 at 4:22 pm

Been waiting to hear your thoughts on this…

@jit-r5b says:

February 28, 2025 at 4:23 pm

It’s insane how many times I refresh youtube checking for this video… guys, we’re putting too much pressure on this gentleman! 😩

@aiexplained-official says:

February 28, 2025 at 4:33 pm

Haha no worries

@nashh600 says:

February 28, 2025 at 5:00 pm

@aiexplained-official 11:40 What exactly was it supposed to say?

@londonl.5892 says:

February 28, 2025 at 4:27 pm

I love the direct comparison of EQ between ChatGPT 4.5 and Claude. Great video, as per always!

@mantas9827 says:

February 28, 2025 at 5:30 pm

right? such great comparisons. I wanted to try 4.5 via the API but won’t bother now. Thanks

@Mazzwar says:

February 28, 2025 at 4:29 pm

“Is your new AI smart?” “Uh, no, um, but he’s got a great personality?”

@zelaird8526 says:

February 28, 2025 at 4:40 pm

“Look, the smart AI goes to a different school. That’s why he’s not here right now.”

@rmiddlehouse says:

February 28, 2025 at 5:49 pm

Well you have to keep in mind that this isn’t one of those “thinking” AIs

@asi_karel says:

February 28, 2025 at 4:30 pm

AGI 20 years away again

@iraklimgeladze5223 says:

February 28, 2025 at 4:50 pm

We could lose jobs before AGI comes

@wrathofgrothendieck says:

February 28, 2025 at 4:56 pm

In a millennium, when they finally reach AGI…

@ticketforlife2103 says:

February 28, 2025 at 5:09 pm

AGI will not be LLMs. LLMs will be part of it.

@apricotmadness4850 says:

February 28, 2025 at 5:39 pm

@ticketforlife2103 Definitely, or something similar to them.

@CasualTortoise says:

February 28, 2025 at 5:45 pm

It’s the new fusion.

@hunkaar31 says:

February 28, 2025 at 4:32 pm

Wow… I feel so disappointed. Claude is in a completely different dimension in terms of EQ.

@maniksahdev4292 says:

February 28, 2025 at 5:45 pm

Yeah, I’ve been a heavy Claude and Grok user, and I couldn’t understand the hype around GPT 4.5

It just seems mid and not even sure why this product exists? Maybe Sam Altman is not releasing product through his lens of inferiority, same thing he complained that Elon was doing.

@jackfarris3670 says:

February 28, 2025 at 4:34 pm

The fact that Claude can detect you’re testing it is insane to me

@timseguine2 says:

February 28, 2025 at 4:47 pm

FYI: this is something AI safety advocates predicted would happen eventually.

@RomeTWguy says:

February 28, 2025 at 5:01 pm

Pretty sure its latest training dataset has a good amount of these types of test prompts, so the answer it generates is not very surprising

@shApYT says:

February 28, 2025 at 5:08 pm

It’s just a reflection of its good training dataset

@jackfarris3670 says:

February 28, 2025 at 5:13 pm

@RomeTWguy “Oh, this teleporter is designed to teleport people, so it’s not that surprising.”
The surprising part isn’t that it’s “out of the blue”; the surprising part is that it can be done at all.

@JMORG-q9y says:

February 28, 2025 at 5:14 pm

I wonder how AI would think and construct its world if it was only trained on pre-internet data.

@javiercmh says:

February 28, 2025 at 4:37 pm

Listened to the official video and closed it shortly after. I prefer your videos 100 times 😅

@dextrodus says:

February 28, 2025 at 4:39 pm

This feels like an explanation of why it’s been so long since GPT-4 – without the o series, OpenAI would not be taken very seriously, with Claude 3.5 and now even 3.7 being so much better at so many things

@joeansell7106 says:

February 28, 2025 at 5:27 pm

Noam Brown specifically may have saved OpenAI’s fate

@Jonnyw23 says:

February 28, 2025 at 4:46 pm

They said it wouldn’t crush benchmarks, but the whole “GPT-4.5 gave me a ‘Feel The AGI’ moment” from Altman, in my opinion, is just ridiculous lol

@zickiwow says:

February 28, 2025 at 4:49 pm

“uses scissors to draw blood from my toes.” LMAO😅😂😭

@xviii5780 says:

February 28, 2025 at 5:36 pm

Maybe she’s European

@tobiasjennerjahn8659 says:

February 28, 2025 at 5:00 pm

Claiming that a model with strong sycophantic tendencies is actually demonstrating amazing EQ is the most tech-bro opinion I’ve seen in a while, lol.

@BlakeEM says:

February 28, 2025 at 5:01 pm

Anthropic puts a lot of work into the safety of their models and it shows. I think their system prompt may also be helping with these tests, because they say it has emotions so it gives more human responses. OpenAI only released GPT 4.5 to try to drown out news of Claude 3.7. They do this every time.

@s-hedlund says:

February 28, 2025 at 5:04 pm

This is embarrassing for OpenAI

@gamblerofrats says:

February 28, 2025 at 5:05 pm

The release of 4.5 is basically free promo for Anthropic

@sam6000 says:

February 28, 2025 at 5:07 pm

This is an absolutely amazing video! Almost every other video about GPT-4.5 has been railing on the coding and math performance, even though they said it wasn’t going to be anything crazy, and just ignoring the EQ and long-term play. I also really wanted to see this comparison, because Claude has been really great at writing, EQ and roleplay. Testing the things that OpenAI claimed about the model is definitely the way to go.

@sagetmaster4 says:

Claude 3.7 is making me feel the AGI. Friendship ended with Sam Altman. Dario is my new best friend

@TranquilMarmot says:

February 28, 2025 at 5:10 pm

CEOs lying to hype up the product that they’re selling?!?! Say it ain’t so.

@MFsyrup says:

February 28, 2025 at 5:24 pm

You should interview Pliny the Liberator. Best jail breaker of all time

@dylancook3282 says:

February 28, 2025 at 5:33 pm

the best part about GPT4.5 is that it’s a great reason to switch to claude.


Joe Lilli