Meta’s LLAMA 4 AI In 4 Minutes!
❤️ Check out Lambda here and sign up for their GPU Cloud:
Guide for using DeepSeek on Lambda:
Or just run it with Ollama when the model appears:
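If you prefer scripting it, here is a minimal sketch using the Ollama Python client. The model tag is a placeholder assumption until the model actually shows up in the Ollama library:

# Minimal sketch: querying a local Llama 4 model via the Ollama
# Python client. "llama4" is a hypothetical tag; check `ollama list`
# for the real one once the model is published.
import ollama

response = ollama.chat(
    model="llama4",  # placeholder tag, not confirmed
    messages=[{"role": "user", "content": "Summarize Llama 4 in one paragraph."}],
)
print(response["message"]["content"])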
📝 LLAMA 4:
📝 My paper on simulations that look almost like reality is available for free here:
Or this is the original Nature Physics link with clickable citations:
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Tedder, Owen Skarpness, Richard Sundvall, Steef, Taras Bobrovytsky, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi
If you wish to appear here or pick up other perks, click here:
My research:
X/Twitter:
Thumbnail design: Felícia Zsolnai-Fehér –
Is that not a 4 minute paper then?
2 minutes per paper, 2 models -> 2*2 = 4 minutes
It's also not a paper at all!
What are you talking about? He surpassed the 7-minute benchmark by almost half.
1 minute per llama
oo new video 4 minutes ago 😀
4 minutes old!
The 10M token context length is an incredible leap, but it also raises new questions about how we structure prompts, process responses, and prioritize relevance over redundancy. Scaling comprehension isn’t just a hardware issue, it’s a design challenge.
@@AdvantestInc It is huge, but the trouble is it doesn't seem very good at using it. In third-party testing that goes beyond explicit, very obvious needle-in-a-haystack tasks, it performs quite poorly compared to something like Google's 2 million token window (their 10 million not really being accessible). So I do think that's an important caveat.
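For anyone wondering, "needle in a haystack" testing buries one distinctive sentence in a long filler context and asks the model to retrieve it. A toy sketch, where query_model is a hypothetical stand-in for whichever chat API you are probing:

def make_haystack(needle, filler, total_chars, depth):
    # Repeat filler text to the target length, then splice the needle
    # in at the requested relative depth (0.0 = start, 1.0 = end).
    haystack = (filler * (total_chars // len(filler) + 1))[:total_chars]
    pos = int(total_chars * depth)
    return haystack[:pos] + " " + needle + " " + haystack[pos:]

def run_probe(query_model):
    needle = "The secret passphrase is 'blue-giraffe-42'."
    filler = "The quick brown fox jumps over the lazy dog. "
    for depth in (0.1, 0.5, 0.9):  # needle near the start, middle, end
        prompt = make_haystack(needle, filler, 200_000, depth)
        prompt += "\n\nWhat is the secret passphrase?"
        answer = query_model(prompt)
        print(depth, "found" if "blue-giraffe-42" in answer else "missed")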
llama 4 is disappointing tho
Llama 4 sucks at coding
@@andrejg3086 It's almost like watching the video before commenting would have been worth doing
What I always wanna know before investing the time, thanks!
I love this channel for showing me that Llama 4 does have innovation even though it falls short in other areas. I didn't even bother looking at Llama 4 because of what I heard about it.
You are too kind, thank you so much! 🙌📜
I feel like they should have waited and released the 288B-parameter model first. The 10M context window is impressive on paper, but useless in practice if the model itself isn't very competent.
Now people are just going to have a bad association with Llama 4, even if the later 288B-parameter model is much better.
Wondrous times indeed
Just want to acknowledge the great work of the whole AI community for releasing so many resources in an accessible format. I am old enough to remember the first NVIDIA Titan cards and the powerful demos of facial animation, but also the pain of not having accessible tools to utilise that power. Now the level of technical skill required for artists to use GPU power is much lower, thanks to the community's hard work on tools with easy user interfaces. So very much appreciated. Thanks everyone. Love your work. 👍🙏
Very true!
Good luck trying to run the new Llama models on consumer hardware
Name change occurring soon “Papers in Minutessss”
Most ppl seem disappointed with Llama 4.
Me over here sitting on my Mac, just waiting for a model better than Gemma 12B that doesn't shatter my GPU lol
What’s wrong with Apple Intelligence?
If you watch this video at 2x speed it becomes a Two Minute Paper XD
I watch everything at 2x speed. Normal speed sounds like slow motion now.
The 10M token context limit is amazing, but Llama 4 is nowhere near as good at coding as its major competitors and generally feels like a WIP, or a generation behind the A list. What a time to… ditch Llama… in favour of Gemini 2.5, o3, GPT-4o, Claude 3.7, Grok 3, or DeepSeek V3.
We are so spoilt for choice!
The mixture-of-experts approach seems good for most problems. However, being an expert in (at least) two fields at once is something that, to some extent, only an AI can really do, at least in theory… It might struggle due to a lack of examples of people who are experts in, say, both cooking and music composition, if you want to know what piece of music to pair with your steak XD.
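For anyone curious how the routing works under the hood, here is a minimal mixture-of-experts sketch with toy sizes (not Llama 4's real dimensions): a small gating network scores every expert per token and only the top-k actually run, which is how Scout can have 109B total parameters but only about 17B active per token.

# Toy mixture-of-experts layer: route each token to the top-k experts.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 16, 2

experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
gate_w = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_layer(x):
    logits = x @ gate_w                # gating network scores each expert
    top = np.argsort(logits)[-top_k:]  # pick the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()           # softmax over the chosen experts only
    # Only the selected experts do any work; the rest are skipped entirely.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
print(moe_layer(token).shape)  # (64,)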
I use LLMs extensively in my studies, and Gemini 2.5 was a big step. You can send crash logs and troubleshoot your system, discover new topics, and practice them. There are many more possibilities yet to be discovered. We have personal assistants now. Well, these could be good times, until someday a stronger AI agent decides to take over the planet to fix a bug in some code.
“Open” has always been the way to go. I grew up with Mandrake and Mandriva.
The context window is ten million tokens wide? Really? I wonder how much RAM is required to run this beast.
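A back-of-the-envelope answer: at long context the dominant cost is the KV cache, which grows linearly with tokens. The architecture numbers below are illustrative assumptions, not confirmed Llama 4 serving parameters:

# Rough KV-cache size for a 10M-token context (assumed config).
n_layers   = 48          # assumed transformer depth
n_kv_heads = 8           # assumed grouped-query KV heads
head_dim   = 128         # assumed per-head dimension
bytes_each = 2           # bf16 per element
tokens     = 10_000_000

# 2x for keys and values
kv_bytes = 2 * n_layers * n_kv_heads * head_dim * bytes_each * tokens
print(f"{kv_bytes / 2**40:.1f} TiB")  # ~1.8 TiB for the cache alone, before weights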
1:55 “Scout and Maverick fit on a single graphics card” –> cannot be true
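For context, Meta's actual claim was Scout (109B total parameters) on a single H100 (80 GB) with Int4 quantization, and Maverick (400B total) on a single H100 host of 8 GPUs, not a single card. Quick weight-only arithmetic:

# GB needed for the weights alone, ignoring KV cache and activations.
def weight_gb(params_billion, bits):
    return params_billion * 1e9 * bits / 8 / 1e9

for name, params in [("Scout", 109), ("Maverick", 400)]:
    for bits in (16, 8, 4):
        print(f"{name} @ {bits}-bit: {weight_gb(params, bits):.0f} GB")
# Scout at 4-bit is ~55 GB, so it does fit on one 80 GB H100;
# Maverick at 4-bit is ~200 GB, so the single-card reading is indeed wrong.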