Deep Research by OpenAI – The Ups and Downs vs DeepSeek R1 Search + Gemini Deep Research

Deep Research was unveiled 12 hours ago, and I’ve tested it thoroughly, including against DeepSeek R1 with search, Gemini Deep Research and even R1 in Perplexity. It’s a notable step forward, with one big caveat. I’ll go through all the benchmark figures, my initial impressions of the o3 model within, and much more.

AI Insiders ($9!) – my Patreon with exclusive vids:

Deep Research:

GAIA Bench:

CodeELO:
CamelCamel:
Deepseek R1 with search:

HaluBench:

Chapters:
00:00 – Introduction
01:06 – Powered by o3, Humanity’s Last Exam, GAIA
03:55 – Simple Tests
06:00 – Good News vs DeepSeek R1 and Gemini Deep Research
09:32 – Bad News on Hallucinations
14:14 – What Can’t it Browse?
14:42 – For Shopping?
16:40 – Final thoughts

Non-hype Newsletter:

Podcast:

Joe Lilli
 

  • @DaveShap says:

    When Philip updates videos this quickly you know it’s serious.

  • @BenoHourglass says:

    @4:35 “Super annoying or a sign of AGI.”
    Technically not mutually exclusive.

  • @ShanKeongDing says:

    With the frequency of this channel’s videos increasing, you know singularity is approaching. 🙂

  • @chess9167 says:

    Thanks for always creating these high-quality videos

  • @jazzhpatel says:

    10:22 “it can find you a needle in a haystack, if you’re able to tell needles apart from screws” i lost it haahha

  • @H1kari_1 says:

    Is it just me or does AI research currently really feel exponential? I mean just 2025 was already very wild, AND WE ARE IN FEBRUARY!

    • @SnapDragon128 says:

      No kidding. A mere _6 months ago_, o1 had not even been announced, and the “state of the art” was still to ask Claude to think step by step. What the heck is the field going to look like in another 6 months?

  • @vivekkaushik9508 says:

    8:28 I think asking questions is a good thing. It helps the agent narrow down its scope of research and produces better results.

    • @apache937 says:

      It’s a good default. BUT IF I SAY TO NOT DO IT THEN DON’T F***** DO IT ANYMORE. OK, sorry, the new GPT-4o is so annoying in this regard.

  • @davecroes3086 says:

    The best thing coming from the DeepSeek competition to the market is the increased AI Explained vids

  • @buriburi_kun4020 says:

    Got me feelin’ a tiny bit emotional there by the end mate!

  • Anonymous says:

    Okay that “o3-pro-large-mini” joke is just savage, haha.

  • @CosmicCells says:

    Philip Wang is the best AI explainer channel on YouTube for me, much better than the other Philips.

  • @AdvantestInc says:

    The rapid improvement from 15% to 72% in benchmarks is a testament to how quickly AI models are evolving. What seemed impossible just months ago is now just another stepping stone.

  • @PravinDahal says:

    I have been looking for a model which asks clarifying questions rather than trying to “know” everything.

  • @carlkim2577 says:

    Excellent video. Let’s step back and recall the times when we were using GPT-3.5. It was a cute tool and I had fun with it. Now, things are progressing so fast! I see hallucination errors dropping below the human error rate very quickly.

  • @funmeister says:

    Thank you, Philip, for another awesome video and skyrocketing subscribers well deserved. Mother Wang may be surprised but should be so proud of your accomplishment.

  • @MegaShrooom says:

    I’m doing my dissertation this year. What a godsend.

    • @OperationDarkside says:

      If the dissertation is not about living a fulfilling life after being made redundant, you might want to start over.

    • @zoeherriot says:

      @OperationDarkside I don’t know if you’ve ever been made redundant… but fulfilling isn’t the word I’d use.

    • @RM-xr8lq says:

      @OperationDarkside you fell for misinformation that made you think “being made redundant” is a problem, perhaps because you were groomed to be a wage slave… American or European perhaps?

  • @N8O12 says:

    2 videos in less than 3 days what a time to be alive!

  • @simoneromeo5998 says:

    16:55 “these kind of small hallucinations are the last thin line of defence for so much of white collar work.” Poetry

  • @reza2kn says:

    You genuinely gave me something to be happy / excited about when I was starting the week like 💩. Thanks 🤗

  • @CasualTortoise says:

    Thank you for doing this! I don’t have access to Deep research, but I was very curious to see how it performed. Nice to see someone level-headed go through both good and bad examples. Everything AI is still: “imagine what this will be like in a few years”. When we stop using that phrase, that is when the world has truly changed.
