Leak: ‘GPT-5 exhibits diminishing returns’, Sam Altman: ‘lol’
The last few days have seen two narratives emerge. One, derived from yesterday’s OpenAI leak in The Information, is that GPT-5/Orion is a disappointment, and less of a leap than GPT-3 to GPT-4 was. The second comes from a series of 4 clips (shown in this video) from Sam Altman regarding the ‘clear path’ to AGI. Let’s go beyond the headlines (and through papers like FrontierMath) to get closer to the ground truth…
Plus Universal-2, Sora comments, Claude 3.5 Haiku SimpleBench update, and a great new AI video.
Assembly AI Speech to Text:
AI Insiders:
Chapters:
00:00 – Introduction
00:39 – Bear Case, TheInformation Leak
04:01 – Bull Case, Sam Altman
06:20 – FrontierMath
11:29 – o1 Paradigm
13:11 – Text to Video Greatness and Universal-2
The Information Leak:
Noam Brown Replies:
Sam Altman Y-Combinator Interview:
Altman Reply:
FrontierMath Paper:
FrontierMath Blog Post:
Tao:
MMLU Are We Done (cites me!):
Universal-2:
Noam Brown ‘We don’t know’:
Anthropic Founder Response:
Sora (Runway Comment):
Sora New Vid:
Darri3D Video:
The 8 Most Controversial Terms in AI:
Non-hype Newsletter:
Podcast:
I use Descript to edit my videos:
Many people expense AI Insiders for work. Feel free to use the Template in the 'About Section' of my Patreon.
Orion’s performance will be a good test to see if we can trust future Altman hype at all.
I think it will be underwhelming like o1. Sure, it may be PhD-level or even superhuman on benchmarks,
but that doesn’t mean much if they don’t solve hallucinations and long-term coherence. Agents need reliability.
@@calliped-co5mj I think o1 preview/mini is pretty great, personally, but it depends on the use case (instruction following and debugging, for me). I’m just skeptical of how much smarter a model can be without CoT. I think a Perplexity-like model that references an external source of truth (with tools like calculators and code execution) could solve or greatly improve hallucinations, but I don’t think that gets solved with LLMs alone.
@@calliped-co5mj These models hallucinate by design, which is why they can come up with creative ideas. There needs to be some new “grounded” architecture that can verify the outputs of the generative model.
@@calliped-co5mj 110%
@@timwang4659 Right. Something like Perplexity works very well (external source of truth). Also, the ability to use tools (calculators, code execution, etc.) is essential.
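To make the grounding idea in this thread concrete, here is a minimal sketch: route questions to deterministic tools (a calculator, a lookup over a trusted source) and only fall back to the raw model otherwise. Everything here (FACTS, call_llm) is a hypothetical stand-in, not a real API.

```python
# A minimal sketch of the grounding idea in this thread: route questions to
# deterministic tools and only fall back to the raw model otherwise.
# FACTS and call_llm are hypothetical stand-ins, not a real API.

import re

FACTS = {"speed of light": "299,792,458 m/s"}  # stand-in for an external source of truth

def calculator(expr: str) -> str:
    # Exact arithmetic instead of a guessed token sequence.
    # Toy only: the regex gate in answer() restricts input to arithmetic characters.
    return str(eval(expr, {"__builtins__": {}}, {}))

def retrieve(query: str):
    # Stand-in for retrieval over a trusted corpus (the "Perplexity-like" step).
    return FACTS.get(query.lower())

def call_llm(question: str) -> str:
    # Hypothetical placeholder for an actual model call.
    return f"[unverified model answer to: {question}]"

def answer(question: str) -> str:
    if re.fullmatch(r"[\d\s+\-*/().]+", question):
        return calculator(question)  # tool call: exact by construction
    fact = retrieve(question)
    if fact is not None:
        return f"{fact} (grounded in external source)"
    return call_llm(question)  # ungrounded fallback; verify before trusting

print(answer("12 * (3 + 4)"))    # -> 84
print(answer("speed of light"))  # -> 299,792,458 m/s (grounded in external source)
```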
It’s a blessing to be able to see through all the smoke and constant headlines
That’s definitely my favorite aspect of following this channel. It feels like a dude who is very interested and well-versed in the field trying to give a measured overview of the latest advances. None of the fake sensationalism. By far the greatest strength that sets it apart from the rest.
@@maciejbala477 But I still always leave optimistic. I guess reality is often NOT disappointing.
Not only is it over, we’re also so back
Pris knows what’s up
Not only is it back, we are so over.
lol
Schrödinger’s comeback.
Not only is it over, we’re also so back. The plot thickens like you wouldn’t believe. Brace for impact the game’s changed.
Not only have I said nothing, I have given you brainrot.
To be honest, I find the AI hype talk from Sam Altman quite irritating, and it makes him untrustworthy in my opinion. It seems he takes on more of a marketing role than that of a technical guy.
He isn’t a technical guy, he’s the CEO.
He’s not a technical guy, he’s a hype man and a grifter who’s consistently been lying. If we were to believe his words, 2024 should’ve had way more impressive models than we’ve seen this year. 2024 was supposed to be the year AI took the world by storm. It is now November and I have not seen a storm yet, only a bubble.
He should go back to wearing two collared shirts at the same time.
That’ll be because he does
We’re not gonna get superhuman performance (excluding speed) by training on human data. We need to solve self-play on language, and only then will we surpass our limits.
The library of Babel is by definition incomprehensible.
One more exception: knowledge breadth (already superhuman).
The models actually need to ‘know’ things to improve. We operate from a level of functional certainty about certain things in the world, which we can then use to reason.
It will be superhuman, since obviously no human has mastered all the skills represented in human data. But you are right that AI will be limited in considering possibilities that humans have never considered.
@@ossian882 The key definitions and value definitions are critical components; without them, the speed at which everything becomes a generative mess is immense.
Even if, all things considered, humans are subpar at estimation, there are a lot of us, and we have the knowledge and experience to share meaningful insights for which language alone provides no nuanced explanation.
We have to work together to resolve this and provide the key definitions and meanings for this to work out in everyone’s interests.
Thanks for your constant high quality updates, Philip! Always makes my day 🔥
Thanks Dentox
4:34 “[In order to build AGI] I think we basically know what to do: just mine all the minerals in the solar system and build a Dyson Sphere. It’ll take a while, it’ll be hard, but that’s tremendously exciting!”
There’s a better way: just use humans and plug their brains directly into the internet, like the Borg do.
This way you save the billions of years of training at planetary scale done by evolution.
We already have AGI, it’s called humans.
The problem is that they have to be paid, isn’t it?
AGI is not that far off; actually, by some definitions o1 is already AGI.
@@leonardosoto4603 By those definitions, AGI is pretty underwhelming.
o1 is cool, but it’s not as revolutionary as AGI should be.
Right? It’s like NASA saying “well, we know how to get to the moon” in the 1950s/60s: technically true, but probably covering for an unimaginable amount of hard work that needs to happen to get there.
@@lucasbrant9856 How do you define AGI?
“… O1 – that might come in the next two weeks…” That was said with a mischievous smile 🤣
See? That’s why this is the only channel to watch on the topic of AI. Good research, an informed and nuanced stance, clearly conveyed information, and impressive reactivity.
Welcome newcomers to AI Explained, take a seat, we’ll probably be here a while.
no
Totally. This is the second most sensible video I have seen since the AI madness began 3 years ago, the first being Linus’s video on AI.
All this news makes it obvious to me that LLMs will probably never become AGI on their own, but they might be part of the complex that allows AGI.
I still firmly believe that the only way to achieve AGI is a continuous input-output process that interacts with the physics of the real world.
If that system utilizes an LLM to get all the world’s knowledge, while gaining logic capabilities that result from interacting with the real world, we will get AGI.
LLMs need a personal mini-model of the world, like every human has, known as belief. There is a reason why LLMs currently act like genius kids.
No matter what we do to these models, they will not become AGI. We need look no further than the underlying architecture: I could give it all the information in the universe and it would still be performing statistical inference.
@@trevordohm6762 I don’t know why this is so shocking to some people. The least-hyped and least need-to-be-hyped public AI expert who is also a developer, Demis Hassabis, has always said that LLMs will not lead to AGI, since long before he was widely known. DeepMind has never bet the farm on LLMs.
@trevordohm6762 you are performing statistical inference.
THIS MAN SOLVED IT
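For concreteness, the “statistical inference” both sides of this exchange are pointing at is next-token sampling from a softmax over the vocabulary. A toy sketch, with made-up logits:

```python
# Toy illustration of next-token "statistical inference": each generated
# token is a draw from a softmax distribution over the vocabulary.
# The vocabulary and logits below are made up for illustration.

import math
import random

vocab = ["Paris", "London", "banana"]
logits = [4.0, 2.5, 0.1]  # hypothetical scores for "The capital of France is ..."

# Softmax: convert raw scores into a probability distribution.
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]

# The model holds no "belief"; it samples in proportion to probability.
token = random.choices(vocab, weights=probs, k=1)[0]
print({v: round(p, 3) for v, p in zip(vocab, probs)}, "->", token)
```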
Llama 4 will probably tell us a lot about the future of vanilla LLMs.
Why
@@Xjaychax9 Because it’s open source?
@Tahazif_TheCool22 An LLM is not open source. Nobody truly understands how an LLM works: we can design one, but its internal thinking/processing is little understood. So no, on this basis we won’t know the future of LLMs based on Llama 4.
You’ve again proved to be the best AI channel. Really well done.
Altman is a sales guy who gives the impression of having a PhD without actually having one.
He needs to pump this stock to get another sports car or house out of it, before the bubble bursts on this dumb tech fad.
While that is fair, Noam Brown is not a sales guy, and he seconded Altman’s statements on X yesterday.
@@AdmiralValdemar It’s not a tech fad. Transformers have already been implemented, one way or another, into existing products.
Reminds me of who Elon Musk was 10 years ago…
@@EveDe-ug3zv All these maths evaluations are BS. It’s like memorising past interview questions through unethical means and then claiming job proficiency through a fair process; it’s not correlated with actual performance. Altman can’t say LLMs have no value, as no business is implementing them outside the toy use cases of summarisation and fragile RAG, so he says we are reaching limits. But to pump the stock, he has to say AGI is near in the next sentence. I would rather wait to see a concrete use case where it really works, instead of gaming the system.
Interesting that you didn’t bring up the part in that Y Combinator interview where Sam said that AGI was coming next year:
– Interviewer: “What are you excited about in 2025? What’s to come?”
– Sam: “AGI. Excited for that.”
Needless to say, claims like these should be taken with a grain of salt.
Think that was misconstrued
@@aiexplained-official Thanks for responding. I also saw some people make that claim, and since English is my second language, maybe I did misconstrue what he was trying to convey. But isn’t the answer “AGI” a direct response to the question “what’s to come?”, meaning “AGI is what’s to come”? And from the context of the previous line (“… in 2025?”) we assume he’s referring to next year. But maybe he just meant that next year he’s excited for AGI. I guess time will tell. Anyway, have a wonderful day.
When Terence Tao says even he can’t solve it, you know it’s the real deal.
He can’t solve what, exactly?
@@taopaille-paille4992 The test problems.
@@taopaille-paille4992 Complex AI math problems. Let’s just leave it at that. Detailed descriptions of this stuff would sound like Alien speech to us anyway lmao
@@kaystephan2610 Well, speak for yourself. I have 3 master’s degrees in maths from a top university and worked as a quant in finance.
@@kaystephan2610 math goes from numbers, to letters, to greek, to gibberish real fast :>
I think AGI / ASI is going to happen but this also feels like the tail end of an LLM bubble which will crash. There will be new breakthroughs though.
Morse code to telephone to cellphone.
13:29 If Sora doesn’t have enough real-world knowledge to know that flamingo legs cannot pass through one another, it will remain a novelty item, whether released in 2 weeks or not. All the videos OpenAI has released of Sora are just creepy, and no one but avant-garde artists would consider them actually useful.
*Everyone but avant-garde artists
Great quote
It’s the same issue as always: it can do amazing things… on the surface, and not reliably. That is basically the case for every AI to date. I’d truly take notice if it reliably performed its tasks and could be left to its own devices without constant supervision.
That’s why research papers are so important. Most scientists use data and evaluation, not hype or fear.
As a physicist, I have to interject and say that the idea of solving physics through brute intelligence is rooted in a misunderstanding of how progress is made in the natural sciences. Physics uses math, but just as math requires axioms, physics needs facts about the real world to constrain the set of all possible theories to a theory of the world we actually inhabit. The problems facing physics today (specifically the subset of physics you read about in popular science) have more to do with a lack of experimental and observational data, than with the limitations of our feeble intellects. It is also worth pointing out that the math used in physics tends to lack the kind of rigour seen in mathematics, and is often motivated by intuition or slightly loose arguments. I don’t know if that will help or hurt the utility of LLMs.
That’s why it seems more and more realistic to me that progress from AI in science will mostly come (as Dario Amodei said in his blog post about the field of biology) from accelerating discoveries related to measurement tools and techniques.
Hence the need for world models that can run millions of simulations to experiment orders of magnitude faster.
I think somewhat like you, too. I believe the human brain, and teams of well-trained human brains, are capable of discovering all of science, through the invention of ever more advanced technologies, and of understanding it deeply. I don’t believe a cognitive capacity beyond that of Homo sapiens is needed to understand what we still don’t know about the universe; it’s only a matter of time (150–200 years?) before we come to fully understand the physics of the world. In my view, saying that an artificial superintelligence is needed for this is superfluous, because our species does not lack the cognitive capacities necessary to solve the open problems of physics or other disciplines. Those capacities are certainly lacking in gorillas, chimpanzees, or ancestors of ours like the Neanderthals, but not in sapiens. That is my view. People say that machines may surpass us in capability, but in reality we don’t know: our cerebral cortex is the most advanced on planet Earth and is probably closely tied to the cognitive capacities needed to solve the problems listed above. Remember that a small increase in the number of cortical neurons allowed sapiens to eliminate the Neanderthals and take over planet Earth; that small increment made all the difference.
That said, it is theoretically possible to build AlphaGo-style software for mathematics and physics. Just as it is theoretically possible, once the human brain is very well understood, to build software neural networks that reproduce in detail the biological neural networks of the cortex and the other parts of the brain. Finally, it is also possible to go further and build artificial brains: recreating in hardware, with materials different from biological ones, neurons, synapses and the rest, identically and in the same numbers as in the human brain. Who knows what would come out of the last two experiments I listed here!
Funny that researchers are still thinking training data is “all we need”, when a standard university education was enough for all the math/engineering/physics geniuses in history to emerge. The basic knowledge bricks are already there in LLMs; it is the path to reasoning that is lacking…
People don’t understand this. Humans have so much less knowledge than LLMs, yet are better at research. Imagine a human with the knowledge of ChatGPT.
That is why I also dislike that basketball comment about benchmarks. What is the point of current benchmarks if the models know the solutions to basically all the problems humans have ever solved, without needing to think about them?
These tests only make sense for humans, since we assume the students have not seen the answers before.
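A minimal sketch of that contamination worry: if a benchmark item appears near-verbatim in the training corpus, a high score measures recall rather than reasoning. The corpus and item below are made up for illustration.

```python
# Toy n-gram overlap check for the contamination worry: if a benchmark item
# appears near-verbatim in the training corpus, a high score measures recall,
# not reasoning. Corpus and item below are made up for illustration.

def ngrams(text: str, n: int = 8) -> set:
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

training_corpus = "the answer to this classic putnam problem is 42 because ..."
benchmark_item = "The answer to this classic Putnam problem is 42"

overlap = ngrams(training_corpus) & ngrams(benchmark_item)
print("possibly contaminated:", bool(overlap))  # True -> may be memorised
```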
Thank you for explaining this so clearly. It has been my opinion since 2022.