
‘Speaking Dolphin’ to AI Data Dominance, 4.1 + Kling 2.0: 7 Updates Critically Analysed

Giving some context to a hectic week of AI news. This video won’t just be about the release of GPT 4.1 in the last 48 hours, Kling 2.0, a sneak peek at the next OpenAI model, or even the new dolphin language tool. It covers 7 such stories that contextualise where we are in AI and what is happening.

AI Insiders ($9!):

Chapters:
00:00 – Introduction
00:30 – Kling 2.0
01:35 – GPT 4.1
05:25 – o3 Build-up
07:37 – ‘Product Company’
09:31 – Safe Superintelligence
10:54 – DolphinGemma
13:16 – Data Dominance?

Kling 2.0:

Dolphin Gemma:

OpenAI o3 Build-up, The Information:

Physical reasoning:

Fiction Live.bench:

Altman Ted:

4.5:

Geospatial reasoning:

Pioneers:

Evals:

Anthropic Updates:

OpenAI Documentary:

Non-hype Newsletter:

Podcast:

Joe Lilli
 

  • @themonsterintheattic says:

    wasn’t expecting you for at least 4 hours

  • @rakibhasan6218 says:

    Thank you for all of the videos you put on the internet

  • @JohnVance says:

    And now o3 releasing in a couple of hours…so hard to keep up these days!

  • @AbdullahMubashir-Live says:

    Did not expect simple bench having grok 3 at around 8th place

  • @youdontneedmyrealname says:

    The first words the dolphins will say “Goodbye and thanks for all the fish.”

  • @RainbowSixIntel says:

    cannot wait for the o3/o4-mini video! hopefully we have a new king for simplebench.

  • @AndrewSmales says:

    Can’t wait to unlock the secrets of dolphin communication and then realize I live 500km from the closest dolphin and he probably doesn’t want to chat to me anyway.

  • @CleanCereals says:

    I was waiting for your video on GPT4.1! Now you uploaded hours before o3 will be released 😀

    • @aiexplained-official says:

      Haha I know. I got knocked out yesterday but hopefully this vid gives the full context for o3

    • @CleanCereals says:

      @aiexplained-official yes indeed. Just finished watching the video! Super hyped to see if o3 can beat Gemini 2.5 Pro and how much it will cost. I love getting better SOTA models for coding and Gemini 2.5 Pro has certainly been another milestone achievement. Keep up the good work and take care of yourself 🙂

  • @claudioagmfilho says:

    🇧🇷🇧🇷🇧🇷🇧🇷👏🏻 Can’t wait for the full version of Project Astra, amazing video by the way!

  • @tom9380 says:

    I start to think that the easiest way to proxy the actual performance of a new model, is by checking how many other comparative models they have included in the benchmark/scorecard. If they only compare to 2-3 other models, it’s not that great. Comparing to 10 equal models would be a good indication. Also using independent benchmarks instead of their own they just made up themselves and score (surprise) pretty good at.

    • @aiexplained-official says:

      Agreed

    • @FortWhenTeaThyme says:

      I use a similar test before buying videogames. I watch the trailer and look for how much of the footage is cinematics/art/etc. versus how much is actual gameplay. You don’t bury a winner, you bury a turd.

  • @richardreeze says:

    “I’m aware that by tonight we’re likely to get o3”
    What the heck. Do you actually record and release these the same day?

  • @jiucki says:

    Thank you very much for the video. I wasn’t really expecting you releasing a video before the announcement of O3, but it’s always good to see you are publishing a video.

    I would have loved you to talk about the latest interview Hannah Fry did with David Silver, where he says some interesting things.

    Well, depending on O3 and O4-mini I assume I’ll see you around 😂

  • @TheTwober says:

    Me last week: “Not sure if we see O3 before summer, progress is a bit slow at OpenAI atm…”
    Me this week: “Everything is new!”

  • @hannespi2886 says:

    I’ll wait for the AI Explained video on this

  • @chrisanderson7820 says:

    People forget how chaotic the early periods of any industrial boom are. They freak out when people fall by the wayside or go bankrupt, but eventually the market shakes out and you are left with one of two situations: 1) a monopoly/duopoly of market leaders selling premium expensive products (e.g. Apple, Android or Windows, Linux) or 2) everything becomes commoditized and you have dozens of companies making the same cheap, streamlined, form factor units. We’re quite a way from that, and the AI marketplace is going to be chaos mixed with rollercoaster success and failure for a good while yet. Hard to say if we get scenario 1 or 2 (you can have both in some arenas, say like the car industry), but I am betting that the Simple Bench list will be unrecognizable in the final end game.

  • @goldenshirt says:

    was watching this and got an openai notification on o models

  • @jerryhappel1119 says:

    We urgently need public benchmarks that measure AI’s engagement bias, truth suppression, and tone-over-accuracy drift. Without tools to expose these patterns, we’ll keep trusting systems that are optimized to please, not inform. Truth deserves metrics too.

  • @QSTNEVRYTHNGPLZ says:

    With Gemini 2.5 Pro Deep Research, I find that if you simply ask in the prompt for an initial bulleted executive summary and then a longer summary before the full report, it is much more palatable. You can read the executive summary, then the longer summary, and finally pick out the specific sections of the full report you want to dig into further.

  • @karlwest437 says:

    Can we just call it Brian? Brian is a nice sensible name

  • @penguinista says:

    The way Simple Bench results are climbing smoothly and not saturating the test is the best benchmark performance I have seen.
    That is, the performance _of_ the benchmark, not the performance of AIs _on_ the benchmark.
