NVIDIA Cosmos – A Video AI…For Free!

❤️ Check out Lambda here and sign up for their GPU Cloud:

Cosmos platform:

Hugging Face models:
More:

📝 The paper "Cosmos World Foundation Model Platform for Physical AI" is available here:

📝 My paper on simulations that look almost like reality is available for free here:

Or here is the original Nature Physics link with clickable citations:

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Sundvall, Taras Bobrovytsky, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here:

My research:
X/Twitter:
Thumbnail design: Felícia Zsolnai-Fehér –

#nvidia

Joe Lilli
 

  • @rovalle5967 says:

    Hold your papers!

  • @MrQuickLine says:

    The truck transporting traffic lights was the first time I really thought about people trying to mess up autonomous vehicles. What if someone has a vinyl sticker of a pedestrian on the back of their car? What if someone throws confetti out their window? There have to be a bunch of scenarios that not every manufacturer will have thought of.

    • @wobber17 says:

      I predict that’s what the anti-AI terrorists have in store for us in the future. They already tried to crash computers of people using image generative AI.
      We need to prepare to strike first, when the time comes.

    • @WallabyWinters says:

      That’s why we need lidar/radar and not only cameras (looking at you Elon…)

    • @AndyMcBlane says:

      Just need to make them smarter to understand. It’s easy for a human to see it’s an obvious trick, so therefore it can easily be done with cameras alone.

    • @naftaliten7989 says:

      There is a video of a dude wearing a shirt with a stop sign on it, and it stopped some Teslas.

    • @Omsip123 says:

      @@AndyMcBlane what you call “just” is the hard part, which some of us see as impossible (as of today) for certain scenarios

  • @TheAkdzyn says:

    It feels like we’re looking into the imagination of a robot to understand how it perceives the world. Incredible time to be alive.

  • @cyancoyote7366 says:

    Hey there Károly! Great videos, as usual! But… could you please give us more old-school type videos about simulations, light transport stuff? Kind of getting bored with all the AI stuff, and I think a lot of people share this opinion here. I understand it’s the craze right now, but there are other interesting stuff out there that isn’t AI. Thank you for reading and I hope you have a great day!

    • @can9660 says:

      Seems to me like he may just be really fascinated by AI, perhaps his interests changed and this is the topic he’d like to make videos about now?

    • @Kewl_Zomb says:

      I think with the recent advancements in AI, fewer papers about those topics are coming out or catching attention, and I think even simulations and light transport are utilizing more and more AI as well.

    • @cyancoyote7366 says:

      You two both bring up valid points, and I am in no position to tell him what kind of content he should be creating.

      And no doubt, AI can be fascinating and is an interesting and extensive topic. With all the new research, the amount of AI papers just trumps all the other things, and improvements in the field are truly astonishing.

      Not to discredit the researchers; they are doing fantastic work. Nor do I want to discredit Károly’s work, as his videos are always full of great information, presented fantastically.

      But I can’t help but shrug when I see a semi-realistic, temporally unstable picture, video, or word of what is, pretty much, a weighted average of reality.

      Maybe it’s just AI fatigue, maybe it’s just me. Just thought of voicing my opinion and what I’ve seen in the comments here and there.

      But it’s just that, an opinion. Whatever video Károly produces, I know it will be a good one that explains the topic well.

    • @TwoMinutePapers says:

      You are all too kind, thank you! Every now and then I try one of these simulation videos – they still exist! However, unfortunately very few of you Fellow Scholars are clicking them and Youtube does not recommend it to others. It has gotten to the point where I am not sure if we can keep on doing them. It breaks my heart and I really hope some kind of solution will present itself over time!

    • @cyancoyote7366 says:

      @@TwoMinutePapers Thank you for the explanatory response Károly!

      I suspected something like this could be going on behind the scenes, but was wondering about other reasons too.

      I guess the algorithm does what the algorithm does and that’s truly the simplest and most obvious explanation. It’s completely understandable given how much of a hot topic AI/ML has become recently.

      However, your videos keep me entertained, no matter the content, and it’s always pleasant to see a new upload notification pop up!

      Have a great day, and thank you once again!

  • @Wizartar says:

    AI plays first traffic light rhythm game! love it! 1:47

  • @tld8102 says:

    film making has changed forever. you can now have creativity unlocked.

    • @jonathaningram8157 says:

      But we lock our creativity more and more behind terms like « cultural appropriation », « stereotype » etc.

  • @AndyMcBlane says:

    Super cool and the same approach that Comma AI is taking for their self driving car / robotics agents

  • @jonmichaelgalindo says:

    The weights are really on HF!!! NVIDIA actually did something! 😆 Hunyuan looks better at a glance, but I’m excited to compare them. Cosmos is supposed to have better physics, so if nothing else, we should be able to do image -> Cosmos (physics) -> Hunyuan (vid2vid) for detail.

  • @I_am_who_I_am_who_I_am says:

    I have worked in self-driving related projects in automotive. I have remarked that we should improve the algorithm to recognize “fake” or “unreal” signs and traffic lights and my proposal was rejected because it was unrealistic.
    I ROFL-ed at the streaming traffic lights 😀 and that’s exactly what I was worried about.

  • @WinonaNagy says:

    Diving into the Cosmos platform. Physical AI just got real. Thanks for the share.

  • @DownwithEA1 says:

    Another way to look at this is to look at where we were 2 papers before this. This would’ve been unthinkable to attempt on consumer hardware.

  • @lexibyday9504 says:

    What we need for perfect video AI is for the objects in the scene to be tracked as sub-images or even 3D models instead of just as pixels. The first option prevents objects from disappearing for no reason and prevents them from merging together for no reason; the second option also prevents them from shapeshifting for no reason.

  • @coc1841 says:

    I’m looking forward to the day when an AI is released that can correct spoken text to make it sound like complete, well-formed sentences instead of just a series of words strung together.

  • @CharafEddineCHERAA says:

    “You have to wait for 5 minutes”
    With my hardware, I waited longer for a 640×480 photo. 🙂

  • @teorloges315 says:

    This is the better AI, it doesn’t steal independent artists’ work,
    but they must make laws about this for the future.

  • @morphentropic says:

    AI generated narration on this video?

  • @vng says:

    If AI-created data is being used to train other AI, what are the measures in place to prevent errors in AI-created data from being passed down and tainting the other AI that is being trained?

    • @MrTmansmooth says:

      The nvidia keynote explained it better but essentially there is a physics simulation engine under all this which grounds the model

  • @chetangiradkar says:

    people who fight for open sourcing things have my eternal respect

  • @WiLLiW_oficial says:

    this AI narrator is so “Wow” and “bang!”

  • @NakedSageAstrology says:

    What a time to be alive!
