Altman Expects a ‘Fast Take-off’, ‘Super-Agent’ Debuting Soon and DeepSeek R1 Out

OpenAI looks set to debut its Operator system, and some leaks are out. At the same time, DeepSeek R1 releases some numbers, and Sam Altman says he might have been wrong before and now anticipates a 'fast take-off'. Plus two papers to give you an idea of what a super-agent might be decent at doing, some more exclusive article analysis, and much more. Who said anything else is happening today…

80,000 Hours Channel:
Spotify:

AI Insiders ($9!):

Chapters:
00:00 – Introduction
01:13 – Pro Cost and OpenAI Operator
04:00 – Agent Benchmarks Being Targeted
07:48 – Fast Take-off, Altman
08:48 – Altman flip-flops
10:02 – DeepSeek R1 First Reaction

Altman ‘100x expectations out of control’:
OpenAI Operator Table:
WebVoyager:
OSWorld:
Axios Exclusive 1 (Super-Agent):
Axios Exclusive 2:
DeepSeek R1 Numbers:
Does 1.5B outperform 3.5 Sonnet on Math?:
DeepSeek R1 (deepseek-reasoner) Pricing:
Altman Fast Takeoff:
OpenAI Economic Blueprint:
Target is Long-horizon Tasks:
Support Regulations:

Donation:
Amodei on Regulations by 2025:
‘Feel the AGI’:
GPT-5 and o-series merger:
o1 Thinks in Chinese:

Non-hype Newsletter:

Podcast:

Joe Lilli
 

  • @cariyaputta says:

    R1 is 100% free and unlimited on their chat platform, and the API is dirt cheap too. Insane. They can even correctly answer this prompt, which o1 can’t:
    “Write a haiku where the second letter of each word when put together spells ‘SIMPLE'”

    Regarding coding, Aider with the pair deepseek-reasoner and deepseek-chat in Architect mode will be insane.

  • @elibullockpapa9012 says:

    Good timing. Distraction from inauguration

  • @kylemorris5338 says:

    Excited to hear about the coding project you mentioned! Interesting that, for the moment, Sonnet still beats o1 in professional use cases.

  • @B_MoreJ says:

    As a Help Desk Technician, there goes my job. I’ve been out of work at this level of the industry since November 2024.

    • @DivergentIntegral says:

      I did a similar job for a few months back in 2019. And even then, way before ChatGPT became a thing, I already had a feeling that much of the work I did could eventually be automated away.

  • @dakara4877 says:

    Fast Take-off and we have virtually stopped talking about alignment.

    • @jyjjy7 says:

      Thank God. An AI “aligned” with a species that is constantly killing each other at scale and rapidly, knowingly destroying its own environment is terrifying.

    • @dakara4877 says:

      @jyjjy7 Indeed. All alignment “options” for powerful AI are only bad outcomes.

    • @shahadeva says:

      I’m glad about that. I hope OpenAI finally stops wasting money on alignment research and focuses on increasing capabilities.

  • @aalluubbaa says:

    I really don’t trust benchmarks at all now. I’m an AI coder, which basically means that I knew nothing when I started to code. I found Claude 3.5 Sonnet to be the best tool by far in my experience.

    We really need a diverse range of benchmarks for models as we accelerate because everyone has different needs. A coder who knows some really basic stuff may find other models more helpful.

    It’s really a tricky question/benchmark. Would you hire a mathematician to teach your 3-year-old basic math, or someone who’s more experienced with teaching young kids?

    • @toocrazy4030 says:

      I’m curious how far you get without any coding knowledge. I started coding about 6-7 years ago and I rarely find it helpful to use AI to code. You have probably already learned a bit about coding by using the AI, but what problems do you face when using code assistants?

  • @jonp3674 says:

    Paper has been out for 90 minutes and you haven’t read the whole thing??

    I was shocked, your relentless excellence has trained me to expect you to have always read everything moments after release haha.

  • @DanscoDude says:

    My benchmark for ASI is when all the career pages for AI companies are blank

  • @DavidsKanal says:

    The quality of the new Sonnet 3.5 still blows my mind, how it’s able to compete with OpenAI’s reasoning models. That’s why I’m also more hyped for Anthropic’s next move than OpenAI’s.

  • @DanBarbatti says:

    I tend to find DeepSeek concerning. Either the Chinese have been able to match or nearly match the progress in the West without the latest chips, meaning they have better algorithms, or they have secretly gotten a large number of the latest chips illegally. I guess time will tell.

  • @shApYT says:

    Can’t wait for this to turn into one of the biggest RCE vulnerabilities in the world.

  • @BigSources says:

    “buy me the best esports gaming mouse on the market”
    Operator: *buys “ultimate esports super mouse” for $10 on temu*

  • @wfhw57 says:

    Hell of a way to start a Monday. Hosting a workshop on LLMs for several friends who are academics, and it feels like the ground is shifting under our feet continuously. This channel is a great resource as always

  • @millenialmusings8451 says:

    What is the end-game here? The economy will collapse if AI takes over even 20-30% of white-collar jobs in the next 5 years. The last time we had such high unemployment was during the Great Depression. UBI is not going to solve this problem! AI might lead to a violent revolution against oligarchy. What say?

    • @nuigulumarZ says:

      UBI could solve the problem, if there were the will and a massive cultural shift. I'm not optimistic that will happen quickly or easily, but we’ve seen how bad human societies are at managing long-horizon problems like climate change, so if it is a problem that’s heading our way, maybe it’s better for it to come as a system shock than a slow collapse.

    • @ciekawki6574 says:

      Universal basic income. But don’t even think you’re going to afford many goods and services.

  • @BananaBreakdown says:

    2:53 small flex having Noam Brown following you haha

  • @IntellectCorner says:

    I use Descript as well. Same pinch 😊 5:10

  • @micbab-vg2mu says:

    My two main models at the moment are o1 Pro and Gemini 2.0 Flash. I’m currently waiting for o3 and other advancements. Thanks for your great videos! 🙂

  • @wobber17 says:

    I’ll believe it when I see it.

  • @nuigulumarZ says:

    Not sure why they’re asking agents to click a mouse to navigate a website – HTML is structured and rich with markup and metadata, it would make a lot more sense for an agent to interact at the DOM level than with pixels!

  • @nashh600 says:

    When do you think Anthropic’s next release is? Do you think they are working on a CoT model too?
