o1 Pro Mode – ChatGPT Pro Full Analysis (plus o1 paper highlights)
Oh boy. o1 pro mode out on the same night as o1 full. I read the 49 page paper, ran my own tests, spent my fuel allowance on Pro Mode and will give you all the highlights. Suffice to say the story is not as simple as it first appears.
Weights and Biases’ Weave: wandb.me/ai_explained
Plus, GPT-4.5? MLE Bench, Simple Update, Image Analysis and much more
AI Insiders:
o1 System Card:
Apollo Research:
Altman Tweet:
ChatGPT Pro:
Tibor Blaho:
Simple-bench.com
Chapters:
00:00 – Introduction
00:27 – ChatGPT Pro is $200
01:25 – OpenAI Benchmarks
03:20 – o1 System Card, o1 and o1 Pro Mode vs o1-preview
06:18 – Simple Bench surprising results on sample
08:31 – Weights & Biases
09:05 – Image Analysis Compared
12:51 – More Benchmarks and Safety
The 8 Most Controversial Terms in AI:
Non-hype Newsletter:
Podcast:
I use Descript to edit my videos:
Many people expense AI Insiders for work. Feel free to use the Template in the 'About Section' of my Patreon.
Congrats on 300k!! Well deserved
Thanks Roy!
oh boy, can’t wait for Pro Mode+ for only $1999.99 per month!
Which will be the most minuscule improvement.
I think it comes with the robot
Yup, only crappy models are cheap. When Ai can actually change the world it will be behind a paywall so steep only very rich can afford it. The masses are subsidizing the cost of training their replacements.
literally saving up at this moment. i WILL be ready to drop 2k and make my money back. real hustlers know
You’re joking, but I actually can’t wait for that. If they eventually have models smart enough to justify spending $2k/mo I will be their first customer. People underestimate the value of intelligence to an amazing degree
I was naively under the impression that being an OpenAI subscriber would give me access to the newest and coolest models. lol.
Well, if o1 pro mode is just majority voting utilising o1, then you do technically have the newest model o1 still (full o1 has rolled out to all plus/team/pro users now) just not in the exact same setup lol.
If I’m understanding how pro works, you should be able to iterate o1 results on themselves with reflective prompting and be able to get an answer approaching the pro result.
you still get access to o1
@@logan27000 but usually you only get a certain number of calls to the more advanced models. So a DIY pro model might be anything from impractical to impossible.
I mean look, I’m not mad or anything. It’s the symbolism of it all that I think is more of a problem for OpenAI than it is a problem for me to not get the super wasteful mode of their best model.
@@logan27000 I actually don’t think that’s how o1-pro works, but none of us have any real evidence yet so I can’t say for sure.
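For readers unfamiliar with the "majority voting" idea raised in this thread, here is a minimal Python sketch of the general technique: ask the same question several times and keep the most common answer. Nothing here reflects how o1 pro mode actually works, which, as noted above, nobody in the thread has evidence for. The `sample` callable and the `fake_sampler` stand-in are hypothetical placeholders for a real model call.

```python
from collections import Counter
from itertools import cycle
from typing import Callable


def majority_vote(sample: Callable[[str], str], prompt: str, n: int = 5) -> str:
    """Call `sample` n times on the same prompt and return the most frequent answer.

    `sample` is a hypothetical stand-in for one model call; it is an
    assumption for illustration, not a real API.
    """
    answers = [sample(prompt) for _ in range(n)]
    # most_common(1) returns a list holding the single (answer, count) pair
    # with the highest count.
    return Counter(answers).most_common(1)[0][0]


# Hypothetical sampler that replays canned answers instead of calling a model.
_canned = cycle(["A", "B", "A", "A", "C"])


def fake_sampler(prompt: str) -> str:
    return next(_canned)


if __name__ == "__main__":
    print(majority_vote(fake_sampler, "Which option is correct?", n=5))
```

Each vote is a full model call, which is why the point above about limited calls on the advanced models matters for any do-it-yourself version.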
Thanks for always bringing the edge news in GenAI! Even as an NLP researcher, I’m finding it difficult to keep myself updated on all the news around LLMs, so these are a huge help!!
your suggested tic tac toe move is just a symmetry of the ai suggestion…
I figured there was someone coming in the comments to confirm what I saw there too
^ I would rather have speedy info with a few errors than having to wait though!
agreed, any side can achieve a draw, any corner loses.
Yea hahah that was funny
He almost had me, I was like yea duh… wait a minute….
10:50 im pretty sure theres no difference in picking bottom left or top right they are both losing lol
pretty sure?
I thought it was a joke…. Has to be a joke! Right?
Nothing will drive open source innovation more than a $200/month competitor with no moat.
“open source competitor” that has a 500 Million server farm? how? 2x RTX 5000 to run good models locally… ? lol
@@lowruna It’s only efficiencies in the way until it runs locally. The first computers were huge like this.
@@lowruna none of us know the compute demands of o1. It might still be doable on lower grade hardware, it’s just a matter of cracking the architecture. At least in theory. OpenAI might be pricing up 2000% for the novelty and exclusivity. We don’t know
It’s crazy how they’ll continue being successful and make millions of dollars despite your extremely witty sarcastic criticism.. It’s amazing how people always think they know best. I work at a Fortune 500 company and we held a meeting a few hours ago already discussing how to quickly enroll my entire department of 700+ people into ChatGPT Pro. Just because you don’t personally find it worthwhile to pay for doesn’t mean no one else does.
@@thr0w407 and it took 50 years till now
Not only does it cost as much as heating your house in winter, I’m pretty sure if those servers were situated in your living room, they _would_ heat your house in winter!
with how much compute it takes, probably more like set the house on fire
@@GeekProdigyGuy with how much money it costs, probably more like set fire in your bank account
How much “playing around” with technology like AI is legitimate, considering the sheer amount of energy consumption needed for computing? Would be a nice feature of OpenAI as well as all the other vendors to show the consumption of a single prompt as well as the overall energy consumption.
@@ML_Machine_Learning Can extrapolate this to the entire web to be fair. So many sites are so much more energetically heavy than they need to be. The amount of terrible javascript out there is absolutely mind boggling. Websites as well should have a CO2 / Energy use stat on them.
People are actually starting to combine these functions. Using data-centres to heat buildings and swimming pools. Or compute nodes to heat homes. It makes a lot of sense.
10:30 The tic-tac-toe is symmetrical, so I don’t see the difference between the thing that you’d pick (bottom left corner) and the AI answer you state (top right corner).
In fact, circle needs to take a side position (to force X to block), not the corner you indicate with your mouse. Then, it naturally ends in a draw.
picture Kslnayb.png on imgur for optimally played game by both sides, though there are 4 different optimal games I think
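The symmetry claim in these comments can be checked with a small minimax search. The sketch below is in Python and assumes the position from the video was X in the top-left and bottom-right corners, O in the centre, O to move. That board is inferred from the comments (the two free corners being bottom left and top right) and is not confirmed by the source. The names `value`, `rate_moves` and `ASSUMED_POSITION` are illustrative.

```python
from functools import lru_cache

# Cells are indexed 0..8, row by row; " " marks an empty cell.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board: str):
    """Return "X" or "O" if that side has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board: str, player: str) -> int:
    """Result under perfect play with `player` to move: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if " " not in board:
        return 0
    nxt = "O" if player == "X" else "X"
    results = [value(board[:i] + player + board[i + 1:], nxt)
               for i, cell in enumerate(board) if cell == " "]
    # X tries to maximise the result, O tries to minimise it.
    return max(results) if player == "X" else min(results)


def rate_moves(board: str, player: str) -> dict:
    """Map each empty cell to the perfect-play result after `player` moves there."""
    nxt = "O" if player == "X" else "X"
    return {i: value(board[:i] + player + board[i + 1:], nxt)
            for i, cell in enumerate(board) if cell == " "}


# ASSUMPTION: X at top-left (0) and bottom-right (8), O at centre (4), O to move.
ASSUMED_POSITION = "X   O   X"

if __name__ == "__main__":
    for cell, result in rate_moves(ASSUMED_POSITION, "O").items():
        print(cell, result)
```

Under that assumed position, both free corners (cells 2 and 6) get the same rating, a forced win for X, while all four side cells lead to a draw. This matches the comments: the two corner moves are mirror images of each other, and O has to take a side.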
That’s exactly what I was thinking.
I’m glad that i didnt have to write that comment haha
@@Shunarjuna yeah same. Kind of hilariously harsh to insult the AI not doing well if you can’t even notice such simple things..
+1
This is the review I was waiting for! Thanks for the hard work!
ChatGPT 4 Pro Max Ultra (Titanium)
they moved away from 4 branding.
It’s like street fighter two. Every update is just considered a different product.
Haha
RTX ON
Haha. Good. Hermes.
$200? No thanks, I’m happy with Claude. Thanks for the evaluation.
It’s $20/ month. Only Pro mode is $200/ month.
To get equivalent “unlimited” use with Claude you need to pay for a $150/month Team plan.
@@MattGreenfield Do you pay for both? I do. Claude is more limited compared to the GPT Plus subscription. I rarely run out of GPT-4o usage.
It’s alright, you were never the target audience in the first place.
@@hongdouliu4381 I pay $150/month for Claude Team plan and $20/month for ChatGPT Plus.
I use Claude for work all day (so probably 4-6 hours sustained use each day), and use ChatGPT Search for smart web search and GPT-4o Advanced Voice Mode for getting lectures on interesting topics when I go for walks or runs.
So basically: Claude for when I need the smartest brain, and ChatGPT for its clever features that Claude doesn’t have yet. So far o1-preview hasn’t proven itself to be the smartest brain overall – Claude still wins. Will be interesting to see how proper o1 does, but I’m expecting Claude to still come out on top.
Thanks for bringing everyone down to earth with this realistic assessment.
4:25 “until you realize that this is reddit” lmao
Everyone caught a stray.
@@MrTintedshades caught astray*
yup
@anonymous-de3mn incorrect. A stray is in reference to a stray bullet from a shot taken. Since many of us use Reddit, but are not of the targeted subreddit, we all caught a stray shot.
Most people are misunderstanding the new price. Yes, the model is NOT 10x better, but you have unlimited access. They simply cannot give everyone unlimited access with their current infrastructure, so the solution they found is to gate keep it behind a price that is inaccessible for the majority of people.
Did you just ask people to have critical thinking?
The people you are appealing for understanding lack critical thinking, therefore you just wasted your time.
why not make a middle tier then? along with pro plan. 100 messages per week at 5 – 10$ more. would make it not too restrictive.
Similarly, to get unlimited use of Claude Sonnet 3.5 I’ve had to upgrade to a Claude “Team” plan, at $150/month.
To get that unlimited use I have to switch between the [required minimum] five “team member” accounts in the Team plan every couple of hours throughout the day, as each one gets rate limited, logging out of one then in to another.
So Anthropic are kind of already offering a similar “unlimited” plan at a similar price, albeit with awkward login switching required.
Is it really unlimited?
@@rccsab what kind of logic is that? “Oh sorry, we would like to give bread to everyone but since our $10,000 a month bread subscription includes unlimited bread, we obviously have no choice but to limit access to bread to a very small number of rich individuals. Really, we hate to see people starve but as I’ve just clearly laid out, our hands really are tied here. Thank you for your understanding!”
In revolutionary France, people have lost their heads for this kind of mental acrobatics.
“That’s pretty good, right? Until you notice this is Reddit” Thanks for making my day, Philip
12 days of AI Explained Postmas lets gooooooooooooooooo
You do such a great job of covering the caveats and subtleties of broader arguments/concepts
Imagine paying $200 a month, only to have o1 tell you “Sorry I can’t do that because of my guidelines”.
$2400 a year! and you dont even own it!
Can’t wait for some TTC or better yet even Test Time Training paradigms from the open source community
AI Explained was hallucinating with that tic-tac-toe