NVIDIA’s New AI: Stunning Voice Generator!

❤️ Check out Weights & Biases and sign up for a free demo here:

📝 The blog post and paper are available here:

Voice isolation (with timestamp):

📝 My paper on simulations that look almost like reality is available for free here:

Or this is the original Nature Physics link with clickable citations:

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Sundvall, Taras Bobrovytsky, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here:

My research:
X/Twitter:
Thumbnail design: Felícia Zsolnai-Fehér –

#nvidia

Joe Lilli

@noisyninja_za6182 says:

November 26, 2024 at 11:03 am

What a time to be alive!

@Funkx2g says:

November 26, 2024 at 2:54 pm

Truly one of the most amazing times in history.

@joeeyaura says:

November 26, 2024 at 3:15 pm

someone has to post this comment every video

@mariolopez-oi2td says:

November 26, 2024 at 5:00 pm

My knuckles are white from holding onto my papers

@romanemul1 says:

November 26, 2024 at 5:07 pm

I hate his voice.

@NarrativeOfLifeM says:

November 26, 2024 at 3:00 pm

I honestly thought your sound was computer-generated because of the pause in almost every word. I learn something new every day! 😀

@elektra81516 says:

November 26, 2024 at 4:15 pm

This is just how he talks naturally

@Philosophaster says:

November 26, 2024 at 4:45 pm

@elektra81516 but if he gets replaced by an AI we’ll never know

@AGIzero00 says:

November 26, 2024 at 3:00 pm

Let’s just hope that Nvidia has the balls to actually release this

@jpgallegoar says:

November 26, 2024 at 3:06 pm

open source it / release the weights

@Matthew_Fog says:

November 26, 2024 at 3:12 pm

@jpgallegoar exactly this, open source alllllll the way

@Pygon2 says:

November 26, 2024 at 4:52 pm

I’d guess that both Nvidia and Meta are going to sit on these things until they see how the copyright lawsuits against Suno, etc. pan out. Patents typically only stifle innovation for the benefit of the few “inventors” (who rarely make significant novel changes that would not have eventually been devised by others). In fact, “first to file” is proof that this is the case, as a number of inventors may be working on the same invention with the same ideas, but only the first to file the patent is credited and “owns” the idea despite the efforts of others who may have been days away from, or had already arrived at, the same solution. As much as I understand protecting artists somewhat, copyright isn’t significantly better, especially with how much their artistry tends to be appropriated as industry profits by corporations.

@eugeneputin1858 says:

November 26, 2024 at 5:31 pm

This isn’t up to Nvidia at all

@jpgallegoar says:

November 26, 2024 at 5:33 pm

@eugeneputin1858 how so? it’s their model

@Loctorak says:

November 26, 2024 at 3:01 pm

Create a sound of 100s of fellow scholars holding on to their papers.

@davescott7680 says:

November 26, 2024 at 3:01 pm

The funny thing is, you definitely could replace yourself with audio generation (if you’re not already), and even if you got found out because it generates something funky, you can be like “woah, you found it out! Good work, I got away with skiving off on a beach for 6 months before you guys got suspicious! Isn’t that incredible!?”. No backlash from AI usage.

@kirangouds says:

November 26, 2024 at 3:05 pm

Basically beating the Adobe audio to audio paper

@DeltaNovum says:

November 26, 2024 at 4:39 pm

I’ve tried and paid for Adobe AI audio products. Not only are they far worse than free alternatives, Adobe’s upload and processing speeds are atrocious. It ain’t cheap either.

@RedRisotto says:

November 26, 2024 at 3:09 pm

I wish I was younger to see another 25-30 years of development and progress… However, it’s still a good time to be alive. 😉

In the next couple of years, I’m hoping for AI vector files that use AI for optimal node placement and node reduction, proper curve smoothing and control, accurate angles, and zero (stray node) artifacts. It should be doable right now, no!?

@AdvantestInc says:

November 26, 2024 at 3:13 pm

The demonstration of emotional nuance in synthesized speech is a game changer. Imagine the potential for storytelling and immersive experiences; AI is truly blurring boundaries here.

@thomaskrogh1244 says:

November 26, 2024 at 3:41 pm

Or thousands of incels and gooners create an artificial girlfriend/lover and go deeper into a delusional mindscape.

@OpreanMircea says:

November 26, 2024 at 3:23 pm

what would I use this for? uhm…. adding sound to the AI generated porn of course, what else?

@Adrian-ep4qm says:

November 26, 2024 at 4:45 pm

“A person who thinks all the time has nothing to think about except thoughts”

@anandchoure1343 says:

November 26, 2024 at 5:20 pm

You are too stupid to think about the potential of it

@claudiusbuser says:

November 26, 2024 at 3:30 pm

Did I miss the link in the video description he is talking about at the 4-minute mark? The one about the voice isolation…

@TwoMinutePapers says:

November 26, 2024 at 3:36 pm

Yes, you are indeed right! Thank you and apologies – fixed it in the description. Posting the link here too: https://youtu.be/qj1Sp8He6e4?si=ZtSesU1e7jeoN55U&t=63

@mshonle says:

November 26, 2024 at 3:36 pm

It shouldn’t be too surprising that it might outperform specialist models on their specialty. In general I think the “scale is all you need” crowd has some blind spots, but I’d agree with them here. When it comes to instrumentation, music, isolation and denoising there are many ways to generate annotated synthetic audio data. I’m sure it cost an enormous amount to train this, but in terms of having a model grounded in music theory we’ve only just begun to scratch the surface.

@alexdavies7112 says:

November 26, 2024 at 3:42 pm

This seems cool but without it being publicly accessible, I don’t really see the point.

@hippopotamus86 says:

November 26, 2024 at 3:44 pm

2:10 Was that an error?

@empatikokumalar8202 says:

November 26, 2024 at 3:45 pm

I would love to use it right away. It would be even better if it was free to try.

@Gcrowan says:

November 26, 2024 at 3:56 pm

The introduction in the middle of the video always feels so out of place; I keep thinking the video reset or autoplayed the next one by mistake.

@Asterrayx says:

November 26, 2024 at 4:26 pm

I can’t wait for 1,000’s of AI-Generated Song Slop in my feed!!!

@NoNameNeeded-u3r says:

November 26, 2024 at 4:43 pm

you sound more ai generated than fugatto haha

@SaigeSauce says:

November 26, 2024 at 4:52 pm

I love thinking of the fun uses for this…. but that gets dashed thinking of how much more AI slop is going to be taking over YouTube and other platforms 😭

@clerothsun3933 says:

November 26, 2024 at 5:15 pm

Y’all realise AIs that do better music than this have been around for over a year

@SaigeSauce says:

November 26, 2024 at 5:25 pm

@clerothsun3933 “Better” is very subjective. I’d love to hear some of the songs that are your favorites, because sadly a lot of the AI ones feel really repetitive and then either super simple or way too complicated. Definitely willing to learn more! I also like the backstory to an artist’s music journey, the inspiration behind the piece, and the passion that went into making it! AI has none of that (and never can) and will be taking so many opportunities away from budding artists who already have a hard time growing.

@TeddyLeppard says:

November 26, 2024 at 5:18 pm

Many years ago I recall seeing a video taken at Skywalker Sound (or ILM) and someone there was demonstrating the sound of a flute mixed with a voice. It was magical. And this was easily more than 20 years ago.

@DamianReloaded says:

November 26, 2024 at 5:37 pm

There are services able to generate complete songs with lyrics and vocals singing those lyrics now. I am still looking for the top of my head.
