Meta’s LLAMA 4 AI In 4 Minutes!

❤️ Check out Lambda here and sign up for their GPU Cloud:

Guide for using DeepSeek on Lambda:

Or just run it with Ollama when the model appears:

📝 LLAMA 4:

📝 My paper on simulations that look almost like reality is available for free here:

Or this is the orig. Nature Physics link with clickable citations:

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Tedder, Owen Skarpness, Richard Sundvall, Steef, Taras Bobrovytsky, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi
If you wish to appear here or pick up other perks, click here:

My research:
X/Twitter:
Thumbnail design: Felícia Zsolnai-Fehér –

  • @Speak_Out_and_Remove_All_Doubt says:

    Is that not a 4 minute paper then?

  • @errorhostnotfound1165 says:

    oo new video 4 minutes ago 😀

  • @howtoguy17 says:

    4 minutes old!

  • @AdvantestInc says:

The 10M token context length is an incredible leap, but it also raises new questions about how we structure prompts, process responses, and prioritize relevance over redundancy. Scaling comprehension isn't just a hardware issue; it's a design challenge.

    • @quantuminfinity4260 says:

      @AdvantestInc It is huge, but the trouble is it doesn't seem very good at it. In third-party testing that isn't the explicit, very obvious needle-in-a-haystack type, it performs quite poorly compared to things like Google's 2 million token context (their 10 million not really being accessible). So I do think it's an important caveat.
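
      For context, a needle-in-a-haystack test hides one known fact (the "needle") deep inside a long filler document and asks the model to retrieve it; the harder third-party tests make the needle far less obvious than that. A minimal sketch of the basic version in Python, assuming a hypothetical query_model helper that wraps whichever long-context API is under test:

          # Needle-in-a-haystack sketch. `query_model` is a hypothetical
          # stand-in for any long-context LLM API call.
          NEEDLE = "The secret passphrase is 'plum-342'."
          FILLER = "Grass grows. Birds fly. The sun rises. "  # benign filler text

          def build_haystack(depth: float, n_repeats: int = 250_000) -> str:
              """Insert the needle at a relative depth (0.0 = start, 1.0 = end)."""
              hay = FILLER * n_repeats
              cut = int(len(hay) * depth)
              return hay[:cut] + NEEDLE + hay[cut:]

          def run_test(query_model, depths=(0.0, 0.25, 0.5, 0.75, 1.0)) -> dict:
              """Score retrieval at several insertion depths."""
              results = {}
              for depth in depths:
                  prompt = build_haystack(depth) + "\n\nWhat is the secret passphrase?"
                  answer = query_model(prompt)           # hypothetical API call
                  results[depth] = "plum-342" in answer  # exact-match scoring
              return results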

  • @nfaza80 says:

    Llama 4 is disappointing, though.

  • @andrejg3086 says:

    Llama 4 sucks at coding

  • @Nayo987 says:

    I love this channel for showing me that Llama 4 does have innovation even though it lacks in other areas. I didn't even bother looking at Llama 4 because of what I heard about it.

  • @BluishGreenPro says:

    I feel like they should have waited and released the 288B parameter model first. The 10M context window is impressive on paper, but useless in application if the model itself isn't very competent.
    Now people are just going to have a bad association with Llama 4, even if the later 288B parameter model is much better.

  • @hyphenpointhyphen says:

    Wondrous times indeed

  • @ThePhilosophiser says:

    Just want to acknowledge the great work of the whole AI community for releasing so many resources in an accessible format. I am old enough to remember the first NVIDIA Titan cards and the powerful demos of facial animation, but also the pain of not having accessible tools to utilise that power. Now the level of technical skill required for artists to use GPU power is much lower, thanks to the hard work of the community releasing tools with easy user interfaces. So very much appreciated. Thanks everyone. Love your work. 👍🙏

  • @mindful_clip says:

    Name change occurring soon: “Papers in Minutessss”

  • @pandoraeeris7860 says:

    Most people seem disappointed with Llama 4.

  • @NotNotGrumm says:

    Me over here sitting on my Mac, just waiting for a model better than Gemma 12B that doesn't shatter my GPU lol

  • @fsnuc says:

    If you watch this video at 2x speed it becomes a Two Minute Paper XD

  • @RickOShay says:

    The 10M token context limit is amazing, but Llama 4 is nowhere near as good at coding as its major competitors, and it generally feels like a WIP, a generation behind the A list. What a time to… ditch Llama… in favour of Gemini 3, o3, GPT-4o, Claude 3.7, Grok 3, or DeepSeek 3.

    We are so spoilt for choice!

  • @GameCarpenter says:

    The mixture-of-experts approach seems good for most problems. However, being an expert in (at least) two fields at once is something that, to some extent, only an AI can really do, at least in theory… It might struggle due to a lack of examples of people who are experts in, say, both cooking and music composition, if you want to know what piece of music to pair with your steak XD.
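
    For anyone unfamiliar with the technique: a mixture-of-experts layer trains several specialist sub-networks plus a small router that sends each token only to the top-scoring experts, so just a fraction of the parameters run per token. A minimal sketch of top-2 routing in Python with NumPy; the shapes and names here are illustrative, not Llama 4's actual implementation:

        import numpy as np

        def moe_forward(x, experts, router_w, top_k=2):
            """Route one token vector through its top-k experts.

            x: (d,) token activation; experts: list of (d, d) weight
            matrices; router_w: (n_experts, d) router weights.
            """
            logits = router_w @ x              # score every expert
            top = np.argsort(logits)[-top_k:]  # keep the k best experts
            gates = np.exp(logits[top])
            gates /= gates.sum()               # softmax over the chosen experts
            # Weighted sum of the chosen experts' outputs; the rest stay idle.
            return sum(g * (experts[i] @ x) for g, i in zip(gates, top))

        # Toy usage: 4 experts on an 8-dimensional token.
        rng = np.random.default_rng(0)
        experts = [rng.normal(size=(8, 8)) for _ in range(4)]
        router_w = rng.normal(size=(4, 8))
        y = moe_forward(rng.normal(size=8), experts, router_w)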

  • @donnyboi6061 says:

    I use LLMs extensively in my studies, and Gemini 2.5 was a big step. You can send crash logs and troubleshoot your system, discover new topics, and practice on them. Many more possibilities are yet to be discovered. We have personal assistants now. Well, these could be good times, until someday a stronger AI agent decides to take over the planet to fix a bug in some code.

  • @WizzCricket says:

    “Open” has always been the way to go. I grew up with Mandrake and Mandriva.

  • @Circuit_Woods_Tales says:

    The context window is ten million tokens wide? Really? I wonder how much RAM is required to run this beast.
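
    Back-of-the-envelope: at that length the dominant memory cost is the KV cache, roughly 2 × layers × KV heads × head dim × bytes per value, per token. A rough Python estimate; the model dimensions below are assumptions for illustration, not Llama 4's published specs:

        # Rough KV-cache size for a 10M-token context (assumed dimensions).
        n_layers, n_kv_heads, head_dim = 48, 8, 128
        bytes_per_value = 2                 # fp16/bf16
        tokens = 10_000_000

        per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value  # K and V
        total_gib = per_token * tokens / 2**30
        print(f"{per_token / 1024:.0f} KiB per token, {total_gib:,.0f} GiB total")
        # ~192 KiB per token -> ~1,831 GiB of KV cache alone at fp16,
        # before the weights; hence the accessibility caveat in the thread above.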

  • @jonathanozik5442 says:

    1:55 “Scout and Maverick fit on a single graphics card” → cannot be true
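
    For what it's worth, Meta's actual claim is narrower: Scout (~109B total parameters) fits on a single H100 with Int4 quantization, while Maverick (~400B total) needs a full multi-GPU H100 host. A quick weight-memory check in Python (parameter totals as reported by Meta; KV cache and activations ignored):

        def weight_gib(params_billion: float, bits: int) -> float:
            """Memory for the weights alone, in GiB."""
            return params_billion * 1e9 * bits / 8 / 2**30

        for name, params in [("Scout", 109), ("Maverick", 400)]:
            for bits in (16, 8, 4):
                print(f"{name} @ {bits}-bit: {weight_gib(params, bits):6.0f} GiB")
        # Scout at 4-bit is ~51 GiB, so it fits one 80 GB H100; at 16-bit it
        # does not, and Maverick doesn't fit a single card at any precision here.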
