NVIDIA’s New AI: Stunning Voice Generator!
Check out Weights & Biases and sign up for a free demo here:
The blog post and paper are available here:
Voice isolation (with timestamp):
My paper on simulations that look almost like reality is available for free here:
Or this is the original Nature Physics link with clickable citations:
We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Sundvall, Taras Bobrovytsky, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here:
My research:
X/Twitter:
Thumbnail design: Felícia Zsolnai-Fehér –
#nvidia
What a time to be alive!
Truly one of the most amazing times in history.
someone has to post this comment every video
My knuckles are white from holding onto my papers
I hate his voice.
I honestly thought your sound was computer-generated because of the pause in almost every word. I learn something new every day!
This is just how he talks naturally
@elektra81516 but if he gets replaced by an AI we’ll never know
Let’s just hope that Nvidia has the balls to actually release this
open source it*/ release the weights
@jpgallegoar exactly this, open source alllllll the way
I’d guess that both Nvidia and Meta are going to sit on these things until they see how the copyright lawsuits against Suno, etc. pan out. Patents typically only stifle innovation for the benefit of the few “inventors” (who rarely make significant novel changes that would not have eventually been devised by others). In fact, “first to file” is proof that this is the case, as a number of inventors may be working on the same invention with the same ideas, but only the first to file the patent is credited and “owns” the idea despite the efforts of others who may have been days away from, or had already arrived at, the same solution. As much as I understand protecting artists somewhat, copyright isn’t significantly better, especially with how much their artistry tends to be appropriated as industry profits by corporations.
This isn’t up to Nvidia at all
@eugeneputin1858 how so? it’s their model
Create a sound of 100s of fellow scholars holding on to their papers.
The funny thing is, you definitely could replace yourself with audio generation (if you’re not already), and even if you got found out because it generates something funky, you can be like “woah, you found it out! Good work, I got away with skiving off on a beach for 6 months before you guys got suspicious! Isn’t that incredible!?”. No backlash from AI usage.
Basically beating the Adobe audio to audio paper
I’ve tried and paid for Adobe AI audio products. Not only are they far worse than free alternatives, Adobe’s upload and processing speeds are atrocious. It ain’t cheap either.
I wish I was younger to see another 25-30 years of development and progress… However, it’s still a good time to be alive.
In the next couple of years, I’m hoping for AI vector files that use AI for optimal node placement and node reduction, proper curve smoothing and control, accurate angles, and zero (stray node) artifacts. It should be doable right now, no!?
The demonstration of emotional nuance in synthesized speech is a game changer. Imagine the potential for storytelling and immersive experiences, AI is truly blurring boundaries here.
Or thousands of incels and gooners create an artificial girlfriend/lover and go deeper into delusional mindscape.
what would I use this for? uhm…. adding sound to the AI generated porn of course, what else?
“A person who thinks all the time has nothing to think about except thoughts”
You are too stupid to think about the potential of it
Did I miss the link in the video description he is talking about at 4 minute mark? The one about the voice Isolation…
Yes, you are indeed right! Thank you and apologies – fixed it in the description. Posting the link here too: https://youtu.be/qj1Sp8He6e4?si=ZtSesU1e7jeoN55U&t=63
It shouldn’t be too surprising that it might outperform specialist models on their specialty. In general I think the “scale is all you need” crowd has some blind spots, but I’d agree with them here. When it comes to instrumentation, music, isolation and denoising there are many ways to generate annotated synthetic audio data. I’m sure it cost an enormous amount to train this, but in terms of having a model grounded in music theory we’ve only just begun to scratch the surface.
This seems cool but without it being publicly accessible, I don’t really see the point.
2:10 Was that an error?
I would love to use it right away. It would be even better if it was free to try.
The introduction in the middle of the video always feels so out of place, I keep thinking the video reset or auto played the next one by mistake.
I can’t wait for 1,000’s of AI-Generated Song Slop in my feed!!!
you sound more ai generated than fugatto haha
I love thinking of the fun uses for this…. but that gets dashed thinking of how much more AI slop is going to be taking over YouTube and other platforms
Y’all realise AIs that do better music than this have been around for over a year
@clerothsun3933 “Better” is very subjective. I’d love to hear some of the songs that are your favorites, because sadly a lot of the AI ones feel really repetitive and then either super simple or way too complicated. Definitely willing to learn more! I also like the backstory to an artist’s musical journey, the inspiration behind the piece, and the passion that went into making it! AI has none of that (and never can) and will be taking so many opportunities away from budding artists who already have a hard time growing.
Many years ago I recall seeing a video taken at Skywalker Sound (or ILM) and someone there was demonstrating the sound of a flute mixed with a voice. It was magical. And this was easily more than 20 years ago.
There are services able to generate complete songs with lyrics and vocals singing those lyrics now. I am still looking for the top of my head.