NVIDIA Cosmos – A Video AI…For Free!
❤️ Check out Lambda here and sign up for their GPU Cloud:
Cosmos platform:
Hugging Face models:
More:
📝 The paper "Cosmos World Foundation Model Platform for Physical AI" is available here:
📝 My paper on simulations that look almost like reality is available for free here:
Or this is the orig. Nature Physics link with clickable citations:
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Sundvall, Taras Bobrovytsky, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here:
My research:
X/Twitter:
Thumbnail design: Felícia Zsolnai-Fehér –
#nvidia
Hold your papers!
🙌📜
More artificial intelligence news videos:
It’s time for AI to hold my papers for me
The truck transporting traffic lights was the first time I really thought about people trying to mess up autonomous vehicles. What if someone has a vinyl sticker of a pedestrian on the back of their car? What if someone throws confetti out their window? There have to be a bunch of scenarios that not every manufacturer will have thought of.
I predict that’s what the anti-AI terrorists have in store for us in the future. They already tried to crash computers of people using image generative AI.
We need to prepare to strike first, when the time comes.
That’s why we need lidar/radar and not only cameras (looking at you Elon…)
We just need to make them smarter. It’s easy for a human to see it’s an obvious trick, so it should be doable with cameras alone.
There is a video of a dude wearing a shirt with a stop sign on it, and it stopped some Teslas.
@AndyMcBlane What you call “just” is the hard part, which some of us see as impossible (as of today) for certain scenarios.
It feels like we’re looking into the imagination of a robot to understand how it perceives the world. Incredible time to be alive.
Hey there Károly! Great videos, as usual! But… could you please give us more old-school type videos about simulations and light transport stuff? I’m kind of getting bored with all the AI stuff, and I think a lot of people here share this opinion. I understand it’s the craze right now, but there is other interesting stuff out there that isn’t AI. Thank you for reading, and I hope you have a great day!
Seems to me like he may just be really fascinated by AI; perhaps his interests changed and this is the topic he’d like to make videos about now?
I think with the recent advancements in AI, fewer papers about those topics are coming out or catching attention, and I think even simulations and light transport are utilizing more and more AI as well.
You two both bring up valid points, and I am in no position to tell him what kind of content he should be creating.
And no doubt, AI can be fascinating and is an interesting and extensive topic. With all the new research, the amount of AI papers just trumps all the other things, and improvements in the field are truly astonishing.
Not to discredit the researchers; they are doing fantastic work. Nor do I want to discredit Károly’s work, as his videos are always full of great information, presented fantastically.
But I can’t help but shrug when I see a semi-realistic, temporally unstable picture, video, or piece of text that is, pretty much, a weighted average of reality.
Maybe it’s just AI fatigue, maybe it’s just me. Just thought of voicing my opinion and what I’ve seen in the comments here and there.
But it’s just that, an opinion. Whatever video Károly produces, I know it will be a good one that explains the topic well.
You are all too kind, thank you! Every now and then I try one of these simulation videos – they still exist! However, unfortunately very few of you Fellow Scholars are clicking on them, so YouTube does not recommend them to others. It has gotten to the point where I am not sure if we can keep on doing them. It breaks my heart, and I really hope some kind of solution will present itself over time!
@TwoMinutePapers Thank you for the explanatory response, Károly!
I have suspected something like this could be going on behind the scenes, but was wondering about other reasons too.
I guess the algorithm does what the algorithm does and that’s truly the simplest and most obvious explanation. It’s completely understandable given how much of a hot topic AI/ML has become recently.
However, your videos keep me entertained, no matter the content, and it’s always pleasant to see a new upload notification pop up!
Have a great day, and thank you once again!
AI plays the first traffic light rhythm game! Love it! 1:47
Filmmaking has changed forever. Creativity is now unlocked.
But we lock our creativity away more and more behind terms like “cultural appropriation”, “stereotype”, etc.
Super cool, and it’s the same approach that Comma AI is taking for their self-driving car / robotics agents.
The weights are really on HF!!! NVIDIA actually did something! 😆 Hunyuan looks better at a glance, but I’m excited to compare them. Cosmos is supposed to have better physics, so if nothing else, we should be able to do image -> Cosmos (physics) -> Hunyuan (vid2vid) for detail.
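For anyone who wants to try that comparison, here is a minimal sketch of pulling the Cosmos weights down with the huggingface_hub library. The repo id below is an assumption for illustration; check NVIDIA’s Hugging Face page for the exact model names, and accept the license on the model card first.

```python
# Minimal sketch, assuming the repo id below exists and your Hugging Face
# account has accepted the license on the model card. Check NVIDIA's page
# on Hugging Face for the exact model names before running this.
from huggingface_hub import snapshot_download

repo_id = "nvidia/Cosmos-1.0-Diffusion-7B-Text2World"  # example / assumed repo id

local_path = snapshot_download(
    repo_id=repo_id,
    local_dir="./cosmos-7b-text2world",  # where the checkpoint files land on disk
)
print(f"Weights downloaded to: {local_path}")
```

From there, follow whatever inference instructions the model card gives; the comparison with Hunyuan is then just a matter of running the same prompt or image through both.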
I have worked on self-driving-related projects in automotive. I remarked that we should improve the algorithms to recognize “fake” or “unreal” signs and traffic lights, and my proposal was rejected as unrealistic.
I ROFL-ed at the streaming traffic lights 😀 and that’s exactly what I was worried about.
Diving into the Cosmos platform. Physical AI just got real. Thanks for the share.
Another way to look at this is to look at where we were two papers before this. This would’ve been unthinkable to attempt on consumer hardware.
What we need for perfect video AI is for the objects in the scene to be tracked as sub-images or even 3D models instead of just as pixels. The first option prevents objects from disappearing or merging together for no reason; the second option also prevents them from shapeshifting for no reason. (A rough sketch of what I mean is below this thread.)
That’s literally what this is
@MrTmansmooth No it isn’t? I might’ve missed it, but nowhere in the video did it say it does that.
@ It literally is. The keynote is like 3 hours long, did you watch it?
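To make the object-level idea from the comment above concrete, here is a toy sketch (Python 3.10+). All names and fields are made up for illustration; this is not how Cosmos actually represents scenes.

```python
# Toy sketch: represent a scene as a list of tracked objects (stable id, class
# label, 3D pose, optional mesh) instead of raw pixels, so an object cannot
# vanish or merge between frames without an explicit event. All names and
# fields here are illustrative, not taken from the Cosmos paper.
from dataclasses import dataclass, field


@dataclass
class TrackedObject:
    object_id: int                        # stable identity across frames
    label: str                            # e.g. "car", "traffic light"
    position: tuple[float, float, float]  # world-space position
    rotation: tuple[float, float, float]  # orientation as Euler angles (radians)
    mesh_id: str | None = None            # optional link to a 3D asset


@dataclass
class SceneFrame:
    timestamp: float
    objects: list[TrackedObject] = field(default_factory=list)

    def vanished_since(self, previous: "SceneFrame") -> list[int]:
        """IDs present in the previous frame but missing now: the kind of
        consistency check that would catch objects disappearing for no reason."""
        current_ids = {obj.object_id for obj in self.objects}
        return [obj.object_id for obj in previous.objects
                if obj.object_id not in current_ids]
```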
I’m looking forward to the day when an AI is released that can correct spoken text to make it sound like complete, well-formed sentences instead of just a series of words strung together.
“You have to wait for 5 minutes”
With my hardware, I waited longer for a 640×480 photo. 🙂
This is the better AI; it doesn’t steal independent artists’ work.
But they must still make laws about them for the future.
AI-generated narration on this video?
If AI-created data is being used to train other AI, what are the measures in place to prevent errors in AI-created data from being passed down and tainting the other AI that is being trained?
The NVIDIA keynote explained it better, but essentially there is a physics simulation engine under all this which grounds the model.
people who fight for open sourcing things have my eternal respect
You can “run it at home for free”, except you need their closed-source hardware to run it at home.
this AI narrator is so “Wow” and “bang!”
What a time to be alive!