• Home
  • AI

Leak: ‘GPT-5 exhibits diminishing returns’, Sam Altman: ‘lol’

The last few days have seen two narratives emerge. The first, derived from yesterday’s OpenAI leak in The Information, is that GPT-5/Orion is a disappointment, and less of a leap than GPT-3 to GPT-4 was. The second comes from a series of four clips (shown in this video) of Sam Altman discussing the ‘clear path’ to AGI. Let’s go beyond the headlines (and through papers like FrontierMath) to get closer to the ground truth…

Plus Universal-2, Sora comments, Claude 3.5 Haiku SimpleBench update, and a great new AI video.

Assembly AI Speech to Text:

AI Insiders:

Chapters:
00:00 – Introduction
00:39 – Bear Case, The Information Leak
04:01 – Bull Case, Sam Altman
06:20 – FrontierMath
11:29 – o1 Paradigm
13:11 – Text to Video Greatness and Universal-2

The Information Leak:
Noam Brown Replies:
Sam Altman Y-Combinator Interview:
Altman Reply:

FrontierMath Paper:
FrontierMath Blog Post:
Tao:
MMLU Are We Done (cites me!):
Universal-2:
Noam Brown ‘We don’t know’:
Anthropic Founder Response:
Sora (Runway Comment):
Sora New Vid:
Darri3D Video:

The 8 Most Controversial Terms in AI:

Non-hype Newsletter:

Podcast:

I use Descript to edit my videos:

Many people expense AI Insiders for work. Feel free to use the Template in the 'About Section' of my Patreon.

  • @arrogantprickly says:

    Orion’s performance will be a good test to see if we can trust future Altman hype at all.

    • @calliped-co5mj says:

      I think it will be underwhelming like o1; sure, it may be PhD-level or even superhuman on benchmarks,

      but that doesn’t mean much if they don’t solve hallucinations and long-term coherence. Agents need reliability.

    • @arrogantprickly says:

      @@calliped-co5mj I think o1-preview/mini is pretty great, personally, but it depends on the use case (instruction following and debugging for me). I’m just skeptical of how much smarter a model can be without CoT. I think a Perplexity-like model that references an external source of truth (but with tools like calculators and code execution) could solve or greatly improve hallucinations, but I don’t think that gets solved with LLMs alone.

    • @timwang4659 says:

      @@calliped-co5mj These models hallucinate by design, which is why they can come up with creative ideas. There needs to be some new “grounded” architecture that can verify the outputs of the generative model.

    • @Caldaron says:

      @@calliped-co5mj 110%

    • @arrogantprickly says:

      @@timwang4659 Right. Something like perplexity works very well (external source of truth). Also, the ability to use tools (calculators, code execution, etc.) is essential.

  • @evodevo420 says:

    It’s a blessing to be able to see through all the smoke and constant headlines

    • @maciejbala477 says:

      that’s definitely my favorite aspect of following this channel. It feels like a dude who is very interested and well-versed in the field trying to give a measured overview of the latest advances in it. None of the fake sensationalism. By far the greatest strength that separates it from the lot

    • @lionelmessisburner7393 says:

      @@maciejbala477 But I still always leave optimistic. I guess reality is often NOT disappointing.

  • @AGIzero00 says:

    Not only is it over, we’re also so back

  • @012vinc says:

    To be honest, I find the AI hype talk from Sam Altman quite irritating, and it makes him seem untrustworthy in my opinion. It appears he takes on more of a marketing role than that of a technical guy.

    • @minimal3734 says:

      He isn’t a technical guy, he’s the CEO.

    • @VictorKing144 says:

      He’s not a technical guy; he’s a hype man and a grifter who’s consistently been lying. If we were to believe his words, 2024 should have had way more impressive models than we’ve seen this year. 2024 was supposed to be the year AI took the world by storm. It is now November and I have not seen a storm yet, only a bubble.

    • @BryanBortz says:

      He should go back to wearing two collared shirts at the same time.

    • @teaveins1466 says:

      That’ll be because he does

  • @ryzikx says:

    We’re not gonna get superhuman performance (excluding speed) by training on human data. We need to solve self-play on language; only then will we surpass our limits.

    • @alexandermoody1946 says:

      The library of Babel is by definition incomprehensible.

    • @ossian882 says:

      One more exception: knowledge breadth (already superhuman).

    • @richardfredlund8846 says:

      The models actually need to ‘know’ things to improve. We operate from a level of functional certainty about certain things in the world, which we can then use to reason.

    • @Skoomar-sg7ej says:

      It will be superhuman, since obviously no human has mastered all the skills present in human data. But you are right that AI will be limited in considering possibilities that humans have never considered.

    • @alexandermoody1946 says:

      @@ossian882 The key definitions and value definitions are critical components; without them, the speed at which everything becomes a generative mess is immense.

      Even if, all things considered, humans are subpar at estimation, there are a lot of us, and we have the knowledge and experience to share meaningful insights for which language alone provides no nuanced explanation.

      We have to work together to resolve this and provide the key definitions and meanings for this to work out in the interests of all.

  • @DentoxRaindrops says:

    Thanks for your constant high quality updates, Philip! Always makes my day 🔥

  • @p-j-y-d says:

    4:34 “[In order to build AGI] I think we basically know what to do: just mining all minerals in the solar system and building a Dyson Sphere. It’ll take a while, it’ll be hard, but that’s tremendously exciting!”

    • @monad_tcp says:

      There’s a better way: just use humans and plug their brains directly into the internet, like the Borg do.
      That way you save billions of years of training at planetary scale done by evolution.
      We already have AGI; it’s called humans.
      The problem is that they have to be paid, isn’t it?

    • @leonardosoto4603 says:

      AGI is not that far off; actually, by some definitions o1 is already AGI.

    • @lucasbrant9856 says:

      @@leonardosoto4603 By those definitions, AGI is pretty underwhelming.

      o1 is cool, but it’s not as revolutionary as AGI should be.

    • @Imperial_Squid says:

      Right? It’s like NASA saying “well, we know how to get to the moon” in the 1950s/60s: technically true, but probably covering for an unimaginable amount of hard work that needs to happen to get there.

    • @leonardosoto4603 says:

      @@lucasbrant9856 How do you define AGI?

  • @kailohre9336 says:

    “… O1 – that might come in the next two weeks…” That was said with a mischievous smile 🤣

  • @AllisterVinris says:

    See? That’s why this is the only channel to watch on the topic of AI. Good research, an informed and nuanced stance, clearly conveyed information, and impressive reactivity.
    Welcome, newcomers, to AI Explained; take a seat, we’ll probably be here a while.

  • @Slayer666th says:

    All this news makes it obvious to me that LLMs will probably never become AGI, but LLMs might be part of the complex that allows AGI.
    I still firmly believe that the only way to achieve AGI is a continuous input-output process that interacts with the physics of the real world.
    If that system utilizes an LLM to get all the world’s knowledge, while gaining logic capabilities that result from interacting with the real world, we will get AGI.

    • @danielchoritz1903 says:

      LLMs need a personal mini-model of the world, like every human has, known as beliefs. There is a reason why LLMs now act like genius kids.

    • @trevordohm6762 says:

      No matter what we do to these models, they will not become AGI. We need look no further than the underlying architecture. I could give them all the information in the universe and they would still be performing statistical inference.

    • @squamish4244 says:

      @@trevordohm6762 I don’t know why this is so shocking to some people. The least hyped, and least in need of hype, public AI expert who is also a developer, Demis Hassabis, has always said that LLMs will not lead to AGI, since long before he was widely known. DeepMind has never bet the farm on LLMs.

    • @Xjaychax9 says:

      @trevordohm6762 You are performing statistical inference.

    • @furtherback6131 says:

      THIS MAN SOLVED IT

  • @awsmith1007 says:

    Llama 4 will probably tell us a lot about the future of vanilla LLMs.

  • @youriwatson says:

    You again proved to be the best AI channel. Really well done

  • @rubberducky5990 says:

    Altman is a sales guy who gives the impression of having a PhD without actually having one.

    • @AdmiralValdemar says:

      He needs to pump this stock to get another sports car or house out of it, before the bubble bursts on this dumb tech fad.

    • @EveDe-ug3zv says:

      While that is fair, Noam Brown is not a sales guy, and he seconded Altman’s statements on X yesterday.

    • @unityman3133 says:

      @@AdmiralValdemar Not a tech fad. Transformers have already been implemented into existing products one way or another.

    • @Caldaron says:

      Reminds me of who Elon Musk was 10 years ago…

    • @rubberducky5990 says:

      @@EveDe-ug3zv All these maths evaluations are BS. It is like memorising past interview questions through unethical means and then claiming job proficiency through a fair process; it is not correlated with actual performance. Altman can’t say LLMs have no value, as no business is implementing them outside the toy use cases of summarisation and fragile RAG. So he says we are reaching limits. But to pump the stock, he has to say AGI is near in the next sentence. I would rather wait to see a concrete use case where it really works, instead of gaming the system.

  • @zeol6766 says:

    Interesting that you didn’t bring up the part of that Y Combinator interview where Sam said that AGI was coming next year:
    – Interviewer: “What are you excited about in 2025? What’s to come?”
    – Sam: “AGI. Excited for that.”
    Needless to say, claims like these should be taken with a grain of salt.

    • @aiexplained-official says:

      Think that was misconstrued

    • @zeol6766 says:

      @@aiexplained-official Thanks for responding. I also saw some people make that claim, and since English is my second language, maybe I did misconstrue what he was trying to convey. But isn’t the answer “AGI” a direct response to the question “What’s to come?”, meaning “AGI is what’s to come”? And from the context of the previous line, “… in 2025?”, we assume he’s referring to next year. But maybe he just meant that next year he’s excited for AGI. I guess time will tell; anyway, have a wonderful day.

  • @GeneralKenobi69420 says:

    When Terence Tao says even he can’t solve it, you know it’s the real deal.

  • @GrindThisGame says:

    I think AGI / ASI is going to happen but this also feels like the tail end of an LLM bubble which will crash. There will be new breakthroughs though.

  • @toadlguy says:

    13:29 If Sora doesn’t have enough real-world knowledge to know that flamingo legs cannot pass through one another, it will remain a novelty item, whether released in two weeks or not. All the videos OpenAI have released of Sora are just creepy, and no one but avant-garde artists would consider them actually useful.

    • @midoavdagic9069 says:

      “Everyone but avant-garde artists”

      Great quote.

    • @maciejbala477 says:

      It’s the same issue as always: it can do amazing things… on the surface, and not reliably. That is basically the case for every AI to date. I’d truly take notice if it reliably performed its tasks and could be left to its own devices without needing constant supervision.

  • @stefano94103 says:

    That’s why research papers are so important. Most scientists use data and evaluation, with less hype or fear.

  • @peersvensson9253 says:

    As a physicist, I have to interject and say that the idea of solving physics through brute intelligence is rooted in a misunderstanding of how progress is made in the natural sciences. Physics uses math, but just as math requires axioms, physics needs facts about the real world to constrain the set of all possible theories to a theory of the world we actually inhabit. The problems facing physics today (specifically the subset of physics you read about in popular science) have more to do with a lack of experimental and observational data than with the limitations of our feeble intellects. It is also worth pointing out that the math used in physics tends to lack the kind of rigour seen in mathematics, and is often motivated by intuition or slightly loose arguments. I don’t know whether that will help or hurt the utility of LLMs.

    • @dejanp8558 says:

      That’s why it seems more and more realistic to me that progress from AI in science will probably come (as Dario Amodei said in his blog post about the field of biology) mostly from accelerating discoveries related to measurement tools or techniques.

    • @bahroum69 says:

      Hence the need for world models that can run millions of simulations to experiment orders of magnitude faster.

    • @technologist6102 says:

      I think somewhat like you do. I believe that the human brain, or teams of well-trained human brains, are capable of discovering all of science and understanding it deeply, through the invention of ever more advanced technologies. I don’t believe a cognitive capacity greater than that of Homo sapiens is needed to understand what we still don’t know about the universe, and it’s only a matter of time (150–200 years???) before the physics of the world is fully understood. In my view, saying that an artificial superintelligence is needed for this is superfluous, because our species does not lack the cognitive abilities necessary to solve the open problems of physics or other disciplines. Those abilities are certainly lacking in gorillas, chimpanzees, or ancestors of ours like the Neanderthals, but not in Sapiens. That is my view. We hear that machines may surpass us in capability, but in reality we don’t know: our cerebral cortex is the most advanced on planet Earth and is probably closely tied to the cognitive abilities needed to solve the problems listed above; remember that a small increase in the number of neurons in the cortex allowed Sapiens to eliminate the Neanderthals and take over planet Earth; that small increment made all the difference.
      That said, it is theoretically possible to build AlphaGo-style software for mathematics and physics. Just as it is theoretically possible, once the human brain is very well understood, to build software neural networks that reproduce in detail the biological neural networks of the cortex and the other parts of the brain. Finally, one could go further still and build artificial brains: recreating in hardware, with materials different from biological ones, neurons, synapses and the rest, identically and in the same numbers as in the human brain. Who knows what would come out of the last two experiments I have listed here!!

  • @lucaveneri313 says:

    Funny that researchers still think that training data is “all we need”, when a standard university education was enough to bring out all the maths/engineering/physics geniuses in history. The basic knowledge bricks are already there in LLMs; it is the path to reasoning that is lacking…

    • @drakey6617 says:

      People don’t understand this. Humans have so, so much less knowledge, yet are better at research. Imagine a human with the knowledge of ChatGPT.

      That is why I also dislike that basketball comment about benchmarks. What is the point of current benchmarks if the models know the solutions to basically all the problems humans have ever solved, without needing to think about them?

      These tests only make sense for humans, because we assume the students have not seen the answers before.

    • @bahroum69 says:

      Thank you for explaining this so clearly. It has been my opinion since 2022.
