GPT 4.5 – not so much wow

GPT 4.5 is here. Do you remember when AI lab CEOs like Sam Altman and Dario Amodei were betting everything on scaling up base models like this one? Well, let’s find out what would have happened if the future of AI rested on models like GPT 4.5. You’ll see all the benchmarks, highlights of the paper, emotional intelligence and humor tests, Simple Bench results (reddit was an unreliable source), and why it’s not all bad news for OpenAI.

AI Insiders (now $9!):

Chapters
00:00 – Introduction
01:04 – Details and Benchmarks
03:04 – Emotional intelligence?
08:37 – Creative writing?
11:40 – Visual reasoning and Pricing
12:41 – Simple Performance
16:01 – End of Pretraining Scaling?
17:03 – CEO Hype
18:11 – System Card Highlights
23:32 – Karpathy Reaction

GPT 4.5 System card:
Release Notes:
Altman Hype:
Details:
End of an Era:
Anthropic Original Claim:
Smell:
Bob McGrew:
Deep Research System Card:
Reddit:
API Pricing:
LiveStream:

Karpathy Comparison:

Non-hype Newsletter:

Podcast:

Joe Lilli
 

  • @farhadkarimi says:

    Been waiting to hear your thoughts on this…

  • @jit-r5b says:

    It’s insane how many times I refresh youtube checking for this video… guys, we’re putting too much pressure on this gentleman! 😩

  • @londonl.5892 says:

    I love the direct comparison of EQ between ChatGPT 4.5 and Claude. Great video, as per always!

  • @Mazzwar says:

    “Is your new AI smart?” “Uh, no, um, but he’s got a great personality?”

  • @asi_karel says:

    AGI 20 years away again

  • @hunkaar31 says:

    Wow… I feel so disappointed. Claude is in a completely different dimension in terms of EQ.

    • @maniksahdev4292 says:

      Yea I’ve been heavy Claude and Grok user, I couldn’t understand the hype around gpt 4.5

      It just seems mid and not even sure why this product exists? Maybe Sam Altman is not releasing product through his lens of inferiority, same thing he complained that Elon was doing.

  • @jackfarris3670 says:

    The fact that Claude can detect you’re testing it is insane to me

    • @timseguine2 says:

      FYI: this is something AI safety advocates predicted would happen eventually.

    • @RomeTWguy says:

      Pretty sure its latest training dataset has a good amount of these types of test prompts, so the answer it generates is not very surprising

    • @shApYT says:

      It’s just a reflection of its good training dataset

    • @jackfarris3670 says:

      @RomeTWguy “oh this teleporter is designed to teleport people, so it’s not that surprising”
      The surprising part isn’t that it’s “out of the blue” the surprising part is that it’s something able to be done.

    • @JMORG-q9y says:

      I wonder how ai would think and world construct if it was only trained on preinternet data.

  • @javiercmh says:

    Listened to the official video and closed it shortly after. I prefer your videos 100 times 😅

  • @dextrodus says:

    This feels like an explanation of why it’s been so long since gpt 4 – without the o series, openai would not be taken very seriously with claude 3.5 and now even 3.7 being so much better at so many things

  • @Jonnyw23 says:

    They said it wouldn’t crush benchmarks, but the whole “GPT-4.5 gave me a ‘Feel The AGI’ moment” from Altman, in my opinion, is just ridiculous lol

  • @zickiwow says:

    “uses scissors to draw blood from my toes.” LMAO😅😂😭

  • @tobiasjennerjahn8659 says:

    Claiming that a model with strong sycophantic tendencies is actually demonstrating amazing EQ is the most tech-bro opinion I’ve seen in a while, lol.

  • @BlakeEM says:

    Anthropic puts a lot of work into the safety of their models and it shows. I think their system prompt may also be helping with these tests, because they say it has emotions so it gives more human responses. OpenAI only released GPT 4.5 to try to drown out news of Claude 3.7. They do this every time.

  • @s-hedlund says:

    This is embarrassing for OpenAI

  • @gamblerofrats says:

    The release of 4.5 is basically free promo for Anthropic

  • @sam6000 says:

    This is an absolutely amazing video! Almost every other video about GPT-4.5 has been railing on the coding and math performance, even though they said it wasn’t going to be anything crazy, and just ignore the EQ and long term play. I also really wanted to see this comparison, because Claude has been really great at writing, EQ and roleplay. Testing the things that OpenAI claimed about the model is definitely the way to go.

  • @sagetmaster4 says:

    Claude 3.7 is making me feel the AGI. Friendship ended with Sam Altman. Dario is my new best friend

  • @TranquilMarmot says:

    CEOs lying to hype up the product that they’re selling?!?! Say it ain’t so.

  • @MFsyrup says:

    You should interview Pliny the Liberator. Best jail breaker of all time

  • @dylancook3282 says:

    the best part about GPT4.5 is that it’s a great reason to switch to claude.
