o1 Pro Mode – ChatGPT Pro Full Analysis (plus o1 paper highlights)

Oh boy. o1 pro mode is out on the same night as full o1. I read the 49-page paper, ran my own tests, spent my fuel allowance on Pro Mode and will give you all the highlights. Suffice it to say the story is not as simple as it first appears.

Weights and Biases’ Weave: wandb.me/ai_explained

Plus, GPT-4.5? MLE Bench, Simple Update, Image Analysis and much more

AI Insiders:

o1 System Card:
Apollo Research:
Altman Tweet:
ChatGPT Pro:
Tibor Blaho:
Simple-bench.com

Chapters:
00:00 – Introduction
00:27 – ChatGPT Pro is $200
01:25 – OpenAI Benchmarks
03:20 – o1 System Card, o1 and o1 Pro Mode vs o1-preview
06:18 – Simple Bench surprising results on sample
08:31 – Weights & Biases
09:05 – Image Analysis Compared
12:51 – More Benchmarks and Safety

The 8 Most Controversial Terms in AI:

Non-hype Newsletter:

Podcast:

I use Descript to edit my videos:

Many people expense AI Insiders for work. Feel free to use the Template in the 'About Section' of my Patreon.

Joe Lilli
 

  • @roykent2316 says:

    Congrats on 300k!! 🎉 Well deserved ❤

  • @Maximillian_Space says:

    oh boy, can’t wait for Pro Mode+ for only $1999.99 per month!

    • @somethingclever4297 says:

      Which will be the most minuscule improvement.

    • @obetishere9215 says:

      I think it comes with the robot

    • @TheGreatestJuJu says:

      Yup, only crappy models are cheap. When AI can actually change the world it will be behind a paywall so steep only the very rich can afford it. The masses are subsidizing the cost of training their replacements.

    • @jairolivares3015 says:

      literally saving up at this moment. i WILL be ready to drop 2k and make my money back. real hustlers know

    • @therainman7777 says:

      You’re joking, but I actually can’t wait for that. If they eventually have models smart enough to justify spending $2k/mo I will be their first customer. People underestimate the value of intelligence to an amazing degree.

  • @unvergebeneid says:

    I was naively under the impression that being an OpenAI subscriber would give me access to the newest and coolest models. lol.

    • @DanielSeacrest says:

      Well, if o1 pro mode is just majority voting utilising o1, then you do technically have the newest model o1 still (full o1 has rolled out to all plus/team/pro users now) just not in the exact same setup lol.

    • @logan27000 says:

      If I’m understanding how pro works, you should be able to iterate o1 results on themselves with reflective prompting and be able to get an answer approaching the pro result.

    • @thedrivetosuccess7389 says:

      you still get access to o1

    • @unvergebeneid says:

      @@logan27000 but usually you only get a certain number of calls to the more advanced models. So a DIY pro model might be anything from impractical to impossible.
      I mean look, I’m not mad or anything. It’s the symbolism of it all that I think is more of a problem for OpenAI than it is a problem for me to not get the super wasteful mode of their best model.

    • @therainman7777 says:

      @@logan27000 I actually don’t think that’s how o1-pro works, but none of us have any real evidence yet so I can’t say for sure.
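
For readers unfamiliar with the "majority voting" idea raised in this thread, here is a minimal Python sketch. It is purely illustrative: nothing in the source confirms that o1 pro mode works this way, and `sample`, `majority_vote`, and `ask_with_voting` are hypothetical names, with `sample` standing in for any function that returns one model answer per call.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_vote(answers: Sequence[str]) -> str:
    """Return the most common answer; ties go to the answer seen first."""
    if not answers:
        raise ValueError("need at least one answer")
    # Light normalization so "A" and "a " count as the same vote.
    normalized = [a.strip().lower() for a in answers]
    return Counter(normalized).most_common(1)[0][0]


def ask_with_voting(sample: Callable[[str], str], prompt: str, n: int = 5) -> str:
    """Call the (hypothetical) sampler n times and return the majority answer."""
    return majority_vote([sample(prompt) for _ in range(n)])
```

Each vote costs one model call, so a rate-limited plan would cap how large `n` can be in practice.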

  • @TheLordMarty says:

    Thanks for always bringing the edge news in GenAI! Even as an NLP researcher, I’m finding it difficult to keep myself updated on all the news around LLMs, so these are a huge help!!

  • @rechington says:

    your suggested tic tac toe move is just a symmetry of the ai suggestion…

  • @aperson-ep8rl says:

    10:50 im pretty sure theres no difference in picking bottom left or top right they are both losing lol

  • @K.F-R says:

    Nothing will drive open source innovation more than a $200/month competitor with no moat.

    • @lowruna says:

      “open source competitor” that has a 500 Million server farm? how? 2x RTX 5000 to run good models locally… ? lol

    • @thr0w407 says:

      @@lowruna It’s only efficiencies in the way until it runs locally. The first computers were huge like this.

    • @Charles-Darwin says:

      @@lowruna none of us know the compute demands of o1. It might still be doable on lower-grade hardware; it’s just a matter of cracking the architecture. At least in theory. OpenAI might be marking the price up 2000% for the novelty and exclusivity. We don’t know.

    • @therainman7777 says:

      It’s crazy how they’ll continue being successful and make millions of dollars despite your extremely witty sarcastic criticism. It’s amazing how people always think they know best. I work at a Fortune 500 company and we held a meeting a few hours ago already discussing how to quickly enroll my entire department of 700+ people into ChatGPT Pro. Just because you don’t personally find it worthwhile to pay for doesn’t mean no one else does.

    • @lowruna says:

      @@thr0w407 and it took 50 years till now

  • @unvergebeneid says:

    Not only does it cost as much as heating your house in winter, I’m pretty sure if those servers were situated in your living room, they _would_ heat your house in winter!

    • @GeekProdigyGuy says:

      with how much compute it takes, probably more like set the house on fire

    • @bossgd100 says:

      @@GeekProdigyGuy with how much money it costs, probably more like set fire to your bank account

    • @ML_Machine_Learning says:

      How much “playing around” with technology like AI is legitimate, considering the sheer amount of energy needed for the computing? It would be a nice feature if OpenAI and all the other vendors showed the consumption of a single prompt as well as the overall energy consumption ⚡♻.

    • @cholst1 says:

      @@ML_Machine_Learning You can extrapolate this to the entire web, to be fair. So many sites are so much more energetically heavy than they need to be. The amount of terrible JavaScript out there is absolutely mind-boggling. Websites as well should have a CO2 / energy use stat on them.

    • @andybrice2711 says:

      People are actually starting to combine these functions. Using data-centres to heat buildings and swimming pools. Or compute nodes to heat homes. It makes a lot of sense.

  • @Fs3i says:

    10:30 The tic-tac-toe is symmetrical, so I don’t see the difference between the thing that you’d pick (bottom left corner) and the AI answer you state (top right corner).

    In fact, circle needs to take a side position (to force X to block), not the corner you indicate with your mouse. Then, it naturally ends in a draw.

  • @Artorias920 says:

    This is the review I was waiting for! Thanks for the hard work!

  • @HenrikoMagnifico says:

    ChatGPT 4 Pro Max Ultra (Titanium)

  • @DrBulbulia says:

    $200? No thanks, I’m happy with Claude. Thanks for the evaluation.

    • @vectoralphaSec says:

      It’s $20/ month. Only Pro mode is $200/ month.

    • @MattGreenfield says:

      To get equivalent “unlimited” use with Claude you need to pay for a $150/month Team plan.

    • @hongdouliu4381 says:

      @@MattGreenfield Do you pay for both? I do. Claude is more limited compared to the GPT Plus subscription. I rarely run out of GPT-4o usage.

    • @I_Blue_Rose says:

      It’s alright, you were never the target audience in the first place.

    • @MattGreenfield says:

      @@hongdouliu4381 I pay $150/month for Claude Team plan and $20/month for ChatGPT Plus. 

      I use Claude for work all day (so probably 4-6 hours sustained use each day), and use ChatGPT Search for smart web search and GPT-4o Advanced Voice Mode for getting lectures on interesting topics when I go for walks or runs.

      So basically: Claude for when I need the smartest brain, and ChatGPT for its clever features that Claude doesn’t have yet. So far o1-preview hasn’t proven itself to be the smartest brain overall – Claude still wins. Will be interesting to see how proper o1 does, but I’m expecting Claude to still come out on top.

  • @TheBuzzati says:

    Thanks for bringing everyone down to earth with this realistic assessment.

  • @b130610 says:

    4:25 “until you realize that this is reddit” lmao

  • @rccsab says:

    Most people are misunderstanding the new price. Yes, the model is NOT 10x better, but you have unlimited access. They simply cannot give everyone unlimited access with their current infrastructure, so the solution they found is to gatekeep it behind a price that is inaccessible for the majority of people.

    • @verigumetin4291 says:

      Did you just ask people to have critical thinking?
      The people whose understanding you are appealing to lack critical thinking, so you just wasted your time.

    • @danialbka7790 says:

      why not make a middle tier then, alongside the pro plan? 100 messages per week at $5–10 more. would make it not too restrictive.

    • @MattGreenfield says:

      Similarly, to get unlimited use of Claude Sonnet 3.5 I’ve had to upgrade to a Claude “Team” plan, at $150/month.

      To get that unlimited use I have to switch between the [required minimum] five “team member” accounts in the Team plan every couple of hours throughout the day, as each one gets rate limited, logging out of one then in to another.

      So Anthropic are kind of already offering a similar “unlimited” plan at a similar price, albeit with awkward login switching required.

    • @ConnoisseurOfExistence says:

      Is it really unlimited?

    • @unvergebeneid says:

      @@rccsab what kind of logic is that? “Oh sorry, we would like to give bread to everyone but since our $10,000 a month bread subscription includes unlimited bread, we obviously have no choice but to limit access to bread to a very small number of rich individuals. Really, we hate to see people starve but as I’ve just clearly laid out, our hands really are tied here. Thank you for your understanding!”
      In revolutionary France, people lost their heads for this kind of mental acrobatics.

  • @Drengodr says:

    “That’s pretty good, right? Until you notice this is Reddit” Thanks for making my day, Philip

  • @pathaleyguitar9763 says:

    12 days of AI Explained Postmas lets gooooooooooooooooo

  • @booshong says:

    You do such a great job of covering the caveats and subtleties of broader arguments/concepts

  • @TechnoMinarchist says:

    Imagine paying $200 a month, only to have o1 tell you “Sorry I can’t do that because of my guidelines”.

  • @boas_ says:

    AI Explained was hallucinating with that tic-tac-toe
