Home →
AI →
Deep Research by OpenAI – The Ups and Downs vs DeepSeek R1 Search + Gemini Deep Research

Deep Research by OpenAI – The Ups and Downs vs DeepSeek R1 Search + Gemini Deep Research

12 hours ago Deep Research was unveiled, and I’ve tested it thoroughly, including vs Deepseek R1 with search, Gemini Deep Research and even R1 in Perplexity. It’s a notable step forward, with one big caveat. I’ll go through all the benchmark figures, my initial impression of the o3 model within, and much more.

AI Insiders ($9!) – my Patreon with exclusive vids:

Deep Research:

GAIA Bench:

CodeELO:
CamelCamel:
Deepseek R1 with search:

HaluBench:

Chapters:
00:00 – Introduction
01:06 – Powered by o3, Humanity’s Last Exam, GAIA
03:55 – Simple Tests
06:00 – Good News vs Deepseek R1 and Gemini Deep Research
09:32 – Bad News on Hallucinations
14:14 – What Can’t it Browse?
14:42 – For Shopping?
16:40 – Final thoughts

Non-hype Newsletter:

Podcast:

You Need to Try This New AI Agent (Genspark Super Agent)

o3 and o4-mini – they’re great, but easy to over-hype

OpenAI’s GPT 4.1 – Absolutely Amazing!

‘Speaking Dolphin’ to AI Data Dominance, 4.1 + Kling 2.0: 7 Updates Critically Analysed

When You Combine 3 AIs, You Get THIS

Runway Finally Released Gen-4 Video AI 🤯

Massive Breakthrough in Understanding AI 🤯

ChatGPT Now Remembers EVERYTHING About You & More AI Use Cases

Joe Lilli

@DaveShap says:

February 3, 2025 at 4:10 pm

When Philip updates videos this quickly you know it’s serious.

@lmnatrix6226 says:

February 3, 2025 at 4:37 pm

Well he is a Serius man 😉.

Reply
@dertythegrower says:

February 3, 2025 at 4:44 pm

Always trying to ride other coat tails Dave 🤦‍♂️

Reply
@dertythegrower says:

February 3, 2025 at 4:46 pm

This guy knows he can only get subscribers by commenting with his name verified boost to get attention on himself.. also your previous reply proves my point so dont try to say otherwise dude.

Reply
@eti-iniER says:

February 3, 2025 at 4:48 pm

Gotta save as much dough as he can before he’s replaced by AI, I guess 😅

Reply
@DomainAspect says:

February 3, 2025 at 5:37 pm

@@dertythegrower, so what? You’re leeching of attention from him by being a Reddit tier contraction 🤓😂

Reply

@BenoHourglass says:

February 3, 2025 at 4:14 pm

@4:35 “Super annoying or a sign of AGI.”
Technically not mutually exclusive.

@ShanKeongDing says:

February 3, 2025 at 4:16 pm

With the frequency of this channel’s videos increasing, you know singularity is approaching. 🙂

@chess9167 says:

February 3, 2025 at 4:18 pm

Thanks for always creating these high-quality videos

@jazzhpatel says:

February 3, 2025 at 4:18 pm

10:22 “it can find you a needle in a haystack, if you’re able to tell needles apart from screws” i lost it haahha

@H1kari_1 says:

February 3, 2025 at 4:27 pm

Is it just me or does AI research currently really feel exponential? I mean just 2025 was already very wild, AND WE ARE IN FEBRUARY!

@SnapDragon128 says:

February 3, 2025 at 5:32 pm

No kidding. A mere _6 months ago_, o1 had not even been announced, and the “state of the art” was still to ask Claude to think step by step. What the heck is the field going to look like in another 6 months?

Reply

@vivekkaushik9508 says:

February 3, 2025 at 4:28 pm

8:28 I think asking questions is a good thing. It helps agent to narrow down it’s scope of research and produces better result.

@apache937 says:

February 3, 2025 at 5:43 pm

its a good default. BUT IF I SAY TO NOT DO IT THEN DONT F***** DO IT ANYMORE. ok sorry, the new gpt 4o is so annoying in this regard

Reply

@davecroes3086 says:

February 3, 2025 at 4:30 pm

The best thing coming from the DeepSeek competition to the market is the increased AI Explained vids

@imperfectmammal2566 says:

February 3, 2025 at 5:01 pm

It destroyed the tech bro hype showing AI doesn’t deserve the billions

Reply

@buriburi_kun4020 says:

February 3, 2025 at 4:34 pm

Got me feelin’ a tiny bit emotional there by the end mate!

@aiexplained-official says:

February 3, 2025 at 4:36 pm

Me too man

Reply

Anonymous says:

February 3, 2025 at 4:37 pm

Okay that “o3-pro-large-mini” joke is just savage, haha.

@dltn42 says:

February 3, 2025 at 4:49 pm

Large mimi 😂

Reply
@steve9233 says:

February 3, 2025 at 4:52 pm

not really it’s mini because it’s watered down

Reply

@CosmicCells says:

February 3, 2025 at 4:38 pm

Philip Wang is the best Ai explainer channel on youtube for me, much better than the other Philips.

@netscrooge says:

February 3, 2025 at 5:10 pm

Very funny.

Reply
@surfingdiamond says:

February 3, 2025 at 5:43 pm

Dear algorithm, please note that Philip Wang is the name of this YouTuber. The author behind AI Explained is Philip Wang. This is a true fact.

Reply
@newplayer1313 says:

February 3, 2025 at 5:54 pm

The comment above is a lie, Philip Wang is Donald Trump, the current United States of America president. This statement is true

Reply

@AdvantestInc says:

February 3, 2025 at 4:46 pm

The rapid improvement from 15% to 72% in benchmarks is a testament to how quickly AI models are evolving. What seemed impossible just months ago is now just another stepping stone.

@nikozg2091 says:

February 3, 2025 at 5:56 pm

It’s because the AIs are specifically being engineered to do better on these benchmarks

Reply

@PravinDahal says:

February 3, 2025 at 4:47 pm

I have been looking for a model which asks clarifying questions rather than trying to “know” everything.

@carlkim2577 says:

February 3, 2025 at 4:53 pm

Excellent video. Let’s step back and recall the times when we were using GPT 3.5. It was a cute tool and I had fun with it. Now, things are progressing so fast! I see the hallucinations errors dropping to below human error rate very quickly.

@DynamicLights says:

February 3, 2025 at 5:06 pm

Man same.

Reply

@funmeister says:

February 3, 2025 at 4:54 pm

Thank you, Philip, for another awesome video and skyrocketing subscribers well deserved. Mother Wang may be surprised but should be so proud of your accomplishment.

@MegaShrooom says:

February 3, 2025 at 4:57 pm

I’m doing my dissertation this year. What a God send.

@OperationDarkside says:

February 3, 2025 at 5:12 pm

If the dissertation is not about living a fullfilling life after being made redudant, you might want to start over.

Reply
@zoeherriot says:

February 3, 2025 at 5:16 pm

@@OperationDarkside I don’t know if you’ve ever been made redundant… but fulfilling isn’t the word I’d use.

Reply
@RM-xr8lq says:

February 3, 2025 at 5:57 pm

⁠@@OperationDarkside you fell for misinformation that made you think “being made redundant” is a problem, perhaps because you were groomed to be a wage slave… american or european perhaps?

Reply

@N8O12 says:

February 3, 2025 at 5:02 pm

2 videos in less than 3 days what a time to be alive!

@simoneromeo5998 says:

February 3, 2025 at 5:12 pm

16:55 “these kind of small hallucinations are the last thin line of defence for so much of white collar work.” Poetry

@reza2kn says:

February 3, 2025 at 5:16 pm

You geniunely gave me something to be happy / excited about when i was starting the week like 💩. thanks 🤗

@CasualTortoise says:

February 3, 2025 at 5:34 pm

Thank you for doing this! I don’t have access to Deep research, but I was very curious to see how it performed. Nice to see someone level-headed go through both good and bad examples. Everything AI is still: “imagine what this will be like in a few years”. When we stop using that phrase, that is when the world has truly changed.

Deep Research by OpenAI – The Ups and Downs vs DeepSeek R1 Search + Gemini Deep Research

Related Posts

Joe Lilli