GPT-4o Mini Arrives In Global IT Outage, But How ‘Mini’ Is Its Intelligence?
GPT-4o Mini from OpenAI is here, and Windows melts down. Coincidence? Definitely. But 100 million free users might soon be using it, so I’ve been scrutinising the model relentlessly since last night. I’ll explain why OpenAI might need to be a bit more honest about the tradeoffs involved, and where they might head next, with reasoning, physical intelligence and more. Plus Fei-Fei Li, USMLE, and Roon.
Assembly AI Sign-up:
AI Insiders:
GPT-4o Mini:
Altman Tweet:
Roon:
Comparison:
DeepMind Physical Intelligence:
Paper:
Fei Fei Li, Spatial Intelligence:
Strawberry OpenAI:
Visual Intelligence Paper:
AGI Scale:
USMLE Video:
Question Source:
Non-hype Newsletter:
GenAI Hourly Consulting:
Need a GenAI app built for your business (any scale), in 4-8 weeks? My SF-based colleague Michael Lin, ex-Netflix + Amazon Senior Software Engineer, is now available for a free 30 min consultation: hello@allinengineeringconsulting.com
Nice
🤔
Completed your course on Coursera. Had a blast and learned a lot, thank you!
So kind Iakobus! Means a lot, thank you, was over a hundred hours of research and editing. Link for anyone interested: imp.i384100.net/m57g3M
Number 11 Viewer!!!
Welcome to Agent era 🎉
What defines that?
Fast, cheap LLMs
GPT-4o mini:
Input: $0.15 per million tokens
Output: $0.60 per million tokens
GPT-4o:
$15 per million tokens
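To make the gap concrete, here is a minimal Python sketch of the arithmetic. It takes the figures quoted in this comment at face value and assumes the single $15 figure for GPT-4o refers to output tokens (the comment does not say). The dictionary keys and function names are hypothetical and are not part of any OpenAI API.

```python
# Prices in US cents per million tokens, copied from the comment above.
# Assumption: the $15 GPT-4o figure is an output-token price.
PRICE_CENTS_PER_MILLION = {
    "gpt-4o-mini": {"input": 15, "output": 60},
    "gpt-4o": {"output": 1500},
}


def cost_usd(model: str, direction: str, tokens: int) -> float:
    """Return the cost in dollars for `tokens` tokens at the quoted rate."""
    cents = PRICE_CENTS_PER_MILLION[model][direction] * tokens / 1_000_000
    return cents / 100


def output_price_ratio() -> float:
    """How many times more the quoted GPT-4o output price is than the mini one."""
    big = PRICE_CENTS_PER_MILLION["gpt-4o"]["output"]
    small = PRICE_CENTS_PER_MILLION["gpt-4o-mini"]["output"]
    return big / small


if __name__ == "__main__":
    tokens = 2_000_000  # hypothetical workload: two million output tokens
    print("mini:", cost_usd("gpt-4o-mini", "output", tokens), "USD")
    print("4o:  ", cost_usd("gpt-4o", "output", tokens), "USD")
    print("ratio:", output_price_ratio())
```

Under those assumptions, the same number of output tokens costs 25 times more on GPT-4o than on GPT-4o mini.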
Been a while, glad to see you back <3 I think everyone is just waiting for their AI WAIFU, aka GPT-4o advanced voice mode 🙂
haha
More content faster please.
(Your content on AI is the best out there).
Took a little break, as did the AI news it seems! But it came back with my return.
I think the interesting part about the “chicken nugget” example is that these types of questions even stump humans. This might be why LLMs struggle with them.
So far everyone I asked got it, but no model (benchmark has a slightly modified version).
Like you said, people lie and share their mistakes in written text, but learning based on the real world doesn’t lie. It would seem as long as we build these models to work like “humans”, we will find they come with the same deficits and blind spots that we have.
Phillip wants his chicken nuggets even in coma
@aiexplained-official That question isn’t hard, but there are many “riddle” questions that commonly stump humans. And these models are closer to young children in their “intelligence”. I think it’s showing blind spots in human intelligence, which are being transferred to the models.
This is a good one that I think of; it does sometimes stump people into saying 50: https://youtube.com/shorts/yRLjpFv5MQ0?si=L9avirBXt_4990oj
🙋
All AI language models are broken now.
Define broken?
It doesn’t matter if the model is only available via SaaS (Software as a Service).
One theory suggests there is a unitary system for RELATIONAL REASONING. This suggests that over time, from infancy to adulthood, this system develops. The essence of the adult system, known as structure mapping, is innate and present from the outset of development.
An opposing theory argues that we have multiple systems. Early Systems are tied to cognitive domains such as mental attribution. These systems don’t support high-level reasoning, but can produce behaviours that mimic it, to an extent. From around age 3 to adolescence, a Late System develops separately, which is domain-general. In adulthood, the Early and Late systems coexist: Early System outputs can be used by the Late System in abstract forms.
Thanks for the video! I wonder what profit per token is for businesses. I suspect it is mostly negative for now but improving rapidly.
I’m scared of the centralization of AI power and the companies trying to regulatory capture the market.
I find human greed and concentration of power to be the absolute biggest “danger of AI.”
What are your thoughts on software jobs this year? I think AI progress is saturated now. What’s your take on the capabilities of GPT-5?
Thanks for new video!
I’ve tried it in my RAG experiment today, with 10 pages of complex PDF text in the context. Compared to 4o, it produces unimpressive results. Also, it’s weak on precision work, like classifying sentences, or single-word analysis like splitting compound words into components. It seems to be trained on easy chatting. It beats 3.5 hands down, however.
I was just starting to wonder when the next ai explained video would drop, and here it is! 😃
“Where have I been for the last 39 versions!?” – gold.
Finding that Anthropic’s new models feel best in class for math currently. Am paying for both.
Good to see you post again, I was worried you had got bored 😉 as so little is going on!!! Love your content