OpenAI Backtracks, Gunning for Superintelligence: Altman Brings His AGI Timeline Closer – ’25 to ’29
Sam Altman unexpectedly brings his AGI timelines forward, while OpenAI backtracks on superintelligence. None of these changes were heralded, but they are significant. Plus, the new year brings new assessments of models’ true capability to automate ‘large swathes of the economy’. I’ll give my prediction on that front for 2025, announce a new SimpleBench competition, showcase Kling 1.6 vs Veo 2 vs Sora, and much more.
wandb.me/simple-bench
(Colab):
AI Insiders:
TheAgentCompany Paper:
Sam Altman Major Interview:
OpenAI Agent Coming Jan 2025:
Altman Singularity:
Altman Original Timeline:
OpenAI Original Emails:
DeepMind Sky News 2014 Article:
Altman Blog Reflections:
OpenAI Changes Who Gets AGI:
OpenAI 5 Levels:
Altman 2015:
OpenAI React to Anthropic:
Microsoft $100B Definition:
Epoch Scramble for Task Benchmark:
GPQA Progress:
Task Length Crucial for ARC-AGI:
RL Environment Tweet:
Jason Wei Talk:
Miles Brundage Tweet:
Jan Leike Tweet:
O1 Pro Losing Money:
Kling 1.6:
Chapters:
00:00 – Introduction
01:03 – Altman Timeline Moves Forward
04:33 – Superintelligence?
06:55 – AGI was not the only pitch
09:26 – AgentCompany and OpenAI New Agent
17:24 – SimpleBench Competition
23:03 – Kling 1.6 vs Veo 2 vs Sora
Non-hype Newsletter:
Podcast:
I can’t track Sam’s definitions on AGI. Is it matching Microsoft where it’s effectively impossible? Sigh. These goalposts are constantly in motion.
I don’t know man this video was posted 2 minutes ago, let everyone catch up
Goalposts don’t matter. Actual results matter
@JeffMcJunkin watching on 10x playback speed
Remember when passing the Turing test was all the rage for decades, and now no one cares what it even means?
This confuses me. The definition of AGI that was put forward in this video has been around for many years.
This is the best channel for AI news, by far.
Deep and comprehensive research on the latest developments in the technology itself, not just the shallow coverage of yesterday’s drama that so many other channels offer.
Keep up the fantastic work!
Wow thanks fletcher
Yeah, I second this. And so far, as far as I can tell, the only one really homing in on which significant aspects of “general intelligence” might still be absent.
(I wrote a couple of comments here, and many elsewhere on YouTube, on current AIs’ seeming inability to doubt or really question themselves mid-reasoning. This may tie very neatly into the lack of common sense you discuss.)
Exactly. Nobody SHOCKED or STUNNED, just hype-free, balanced reporting.
@@WillyJunior ikr, it’s so refreshing 😂
This channel and byCloud are the best AI channels out there.
I appreciate you catching Sam out on his contradictions 😂
Let’s be honest though, it’s not hard.
As I have said in the comments here (ever since his testimony before Congress) and was alleged by his own Board: “Sam Altman is a Liar”. He says whatever will advance his current agenda.
Indeed. Sam looks like a sheep but he is a wolf.
Sam Altman Not Being Super Shady Challenge: IMPOSIBLE
There will come a day when you can spell 8 words in a row. I believe in you.
@@michaelwoodby5261 imposhiple
Dont lisen to him. If u can save precious energy by droping useles duplicated leters, do it!
@@michaelwoodby5261You failed. It was written by a literary genius with an IQ of 230. *Obviously* that last word is to be pronounced with a Spanish accent…
“very skilled humans” = specialized AI that is better than humans, which in certain domains we already have. He realized they can’t get generalized intelligence and changed the definition to something a lot more realistic.
Yeah, like, stockfish but way worse and for vaguely more stuff.
Or it’s because the clearest use of AGI is replacing jobs
To really be AGI, a system should be able to establish a Type III Kardashev scale civilization.
What?
“A Type III civilization is able to capture all the energy emitted by its galaxy, and every object within it, such as every star, black hole, etc.”
I think you mean Type I–II, and even then you’d need ASI to achieve that level
No, he means Type III
@@pandosann it’s exaggerated on purpose to refer to the moving goalposts, and the 100 billion thing from Microsoft. So, no, he does mean III, because that’s clearly way above what an AGI should be able to achieve
Pfft, if you can’t capture all the energy from an ensemble of parallel universes then GTFO.
AT LEAST
15:05 Thank you for the shoutout! Glad you liked the article
It was amazingly well done, congrats
If AGI truly is achieved this year, 2025, then that would change everything drastically and would make this year the most important in human history. We’ll see.
eh, sure, but also, not really, because it’s not a hard limit, it’s a spectrum. “AGI” is just an arbitrary designation which lots of people understand differently, as demonstrated by the people claiming we already have AGI now. Unless some kind of new model drops which overshadows everything else (not impossible), then it’s rather going to be a more gradual shift over several years
It’s telling that Sam suddenly changed his tune about objectives and timelines just after the US election. Suddenly, timelines came right in again. And the definition of AGI keeps shifting to a Microsoft tune.
Can’t wait to see how o3 scores on SimpleBench
Isn’t the question of “how” they reach their score on these benches more important than the score itself?
Is it actually reasoning? Or brute-forcing its way through without any actual understanding?
@ Chollet says o3 is a breakthrough in reasoning so idk. We don’t really know how humans reason either.
@@theWACKIIRAQI The answer to your first question is a flat “No.” Is the important thing about the tiger that it has DNA, or that it has teeth and claws?
With rare exception, we can’t look inside these systems and identify how they’re making decisions at all. We don’t know how they are doing what they are doing, but we can tell that they are extremely capable. If it is possible to solve novel PhD-level math problems, explain novel jokes, and beat most humans in the world at coding challenges, _all without reasoning,_ then we should be in awe of that ability, and treat it with the respect and caution that it deserves.
(Reasoning isn’t just one thing, of course, either in humans or in AI. It’s a bundle of useful algorithms. But that’s a story for another time.)
o3 won’t be able to be tested “fully”, because the compute costs are way too high.
@@peter-rhodes o3 mini is as cheap as o1 mini
Must admit, I don’t expect 2025 to be quiet, the opposite if anything – I suspect this will be a pivotal year because LLMs will either look a lot more hypey, _maybe_ even to the extent of a partial bursting of the bubble, OR they’ll be doing something actually useful by the end of this year (possibly due to _some_ sort of handle on hallucinations, even if only “known percentage reliability” or similar).
It’s already been “actually useful” and accurate for me. Not sure if that’s due to our having different tasks or different expectations.
It’ll be both. We are entering the trough of disillusionment.
@@netscrooge it just indicates whatever you are doing is commonly found in the training data. I have very mixed results myself, because a lot of my work is creating novel code.
AI is doing something actually useful for the people who will replace doubters in their jobs.
@@skierpage tell me you are really into crypto without telling me
“Cookie banners are the major obstacles between us and AGI” I find that quite illuminating somehow! 😁
just an example of silly everyday “problems” that are trivial to a human, but which AI, for all its good sides, struggles with somehow. It happens a whole lot
Crazy to think that in 2000 Ray Kurzweil predicted 2029 for AGI. If he really nails it, that will be astonishing, and it will make everyone wait for the singularity on his terms in 2045.
There are a lot of papers/questionnaires done with “AI experts”; the whole spectrum has been forecasted (pretty much every year, I’m guessing since the 70s).
That guy is a nutcase
If he’s right, we’ll live long enough to witness the singularity ourselves. How fkin tantalizing
@@BinaryDood an entertaining one, though
How?..@@BinaryDood
Gemini 2.0 Flash Thinking Exp 17:42
“The key information here is that the letter transforms randomly into another letter of the alphabet each second. This means that at any given moment, there are 26 possible letters. Since the transformation is random, and Husain is far away and can’t see clearly enough to discern any patterns or slight visual differences, his best strategy is essentially a random guess.
Therefore, the probability of him correctly guessing the current letter is simply 1 out of the 26 possible letters.
The correct answer is D) 1/26.
The information about the font size and distance is a distraction, setting up a scenario where Husain has to guess randomly. His past trouble with ‘w’ and ‘m’ is also a slight red herring, as he’s so far away he can’t even see the letter clearly enough to apply that previous difficulty.”
It says it will transform into *another* letter, implying it will be a different letter. I think the answer should technically be 1/25.
@CodyRay295 So in your mind a man “slowly inches away” several football field lengths in less than a second?
This tracks with my experience of the Flash 2.0 “Thinking” variant.
@@CodyRay295 Different from what? If you don’t know what the letter was before, the new letter could be any of them.
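(For anyone following this sub-thread: here is a minimal simulation sketch of the 1/26 vs 1/25 dispute. It assumes the letter starts uniformly at random and jumps to a uniformly random *different* letter each second — my assumptions, since the full benchmark question isn’t quoted here.)
```python
# Minimal sketch of the 1/26 vs 1/25 debate. Assumption (mine, not the
# question's text): the letter starts uniformly at random, and each second
# becomes a uniformly random *different* letter ("another" letter).
import random

LETTERS = "abcdefghijklmnopqrstuvwxyz"
TRIALS = 1_000_000

hits = 0
for _ in range(TRIALS):
    prev = random.choice(LETTERS)                               # unseen previous letter
    current = random.choice([c for c in LETTERS if c != prev])  # "another" letter
    guess = random.choice(LETTERS)                              # blind guess from afar
    hits += guess == current

print(hits / TRIALS)  # ≈ 0.0385, i.e. 1/26
```
Marginalised over the unknown previous letter, the current letter is uniform over all 26, so 1/25 would only hold if the guesser somehow knew the previous letter — which matches the replies above and Gemini’s answer of D) 1/26.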
I’ve been using Gemini recently at work, specifically Experimental 1206 as I prefer larger models, and it seems to really understand nuance and novel tasks! Generally I’ll attach a bunch of documents for context (usually exported Notion pages) and a large multi step prompt like Task 1, task 2, task 3, etc. instructing to stop after each task for approval on the output. I’ve had this fail completely with gpt-4o, and Claude is usually at capacity when I try to use it but I’ve had good experience with it in the past. Fyi my boss fully supports these activities.
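(As a side note, the “stop after each task for approval” pattern this comment describes can be sketched with the google-generativeai Python SDK. Everything below — the model id, file name, and task text — is an illustrative assumption, not the commenter’s actual setup.)
```python
# A minimal sketch of a multi-step, approval-gated workflow with Gemini.
# The model id, file name, and tasks are placeholders for illustration.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")              # placeholder key
model = genai.GenerativeModel("gemini-exp-1206")     # large experimental model
chat = model.start_chat()

context = open("notion_export.md").read()            # hypothetical exported Notion page
tasks = [
    "Task 1: summarise the attached document.",
    "Task 2: list the open questions it raises.",
    "Task 3: draft an action plan from those questions.",
]

# Send the context once, then one task at a time, pausing for human approval
# between steps instead of packing everything into a single prompt.
chat.send_message(f"Context documents:\n{context}")
for task in tasks:
    reply = chat.send_message(task)
    print(reply.text)
    if input("Approve and continue? [y/n] ").lower() != "y":
        break
```
Splitting the tasks across chat turns, rather than one giant prompt, is one way to get the “stop after each task” behaviour the comment describes without relying on the model to pause itself.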
Brilliantly done Phillip! I am always looking SO forward to any and all of your videos. Post more often pls.
Oh, there you are!! I have been waiting for this!!
This is a superb video with tons of extremely good references. What beautiful work you are doing curating all this for us. Good luck with SimpleBench!
Thanks! Excellent content, as always. 🙏🏼
*A CEO saying anything should NOT be the key reference*... His job is to present the company’s progress and potential in the most attractive and positive way possible, to keep investors interested and the media hype primed.
13:48 This is the reason why I watch the best AI YouTube channel – AI Explained
Are you talking about Sam’s sister’s sexual abuse allegation…?