GPT-4o – Full Breakdown + Bonus Details
GPT-4o. It’s smarter, in most ways, cheaper, faster, better at coding, multi-modal in and out, and perfectly timed to steal the spotlight from Google. It’s GPT-4 Omni. I’ve gone through all the benchmarks and release videos to give you the highlights.
The emotional expression is amazing.
It’s amazing, but I imagine it will get irritating very quickly. I found this with Bing; the fake friendliness was grating.
It sounded like some person from a corporate environment with fake friendliness and toxic positivity. I found it nauseating tbh.
Yup. I was utterly over it by the end of their announcement video. Those voices combined with that attitude were so grating.
It’s like bad amdram.
it adds some emotional indicators for the TTS to interpret. this is not impressive for an LLM
@@GethinColes you can obviously prompt it to your liking, you must be new to AI.
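For anyone wondering what that kind of cascaded setup would look like in practice, here is a minimal Python sketch, assuming a made-up [excited]-style tag format that gets converted into standard SSML prosody markup for a conventional TTS engine. (OpenAI describes GPT-4o as end-to-end audio, so this illustrates the older design the comment describes, not a confirmed account of how 4o works.)

def tags_to_ssml(llm_output: str) -> str:
    # Map hypothetical LLM emotion tags to real SSML prosody settings.
    styles = {
        "excited": '<prosody rate="fast" pitch="+15%">',
        "sad": '<prosody rate="slow" pitch="-10%">',
    }
    for tag, ssml_open in styles.items():
        llm_output = llm_output.replace(f"[{tag}]", ssml_open)
        llm_output = llm_output.replace(f"[/{tag}]", "</prosody>")
    return f"<speak>{llm_output}</speak>"

print(tags_to_ssml("[excited]No way, that worked![/excited] [sad]Sorry about earlier.[/sad]"))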
Don’t care about OpenAI’s presentation. Been waiting for @AIExplained’s breakdown.
:))
Same 😁
the hero we need in AI disruption
yezzirrrr
0:27 “more flirtatious sigh than AGI” bro I think you drastically overestimate the threshold that will satisfy most users. That was close to ScarJo levels of sensual breathiness…
Love to see Mr. Shapiro himself commenting on a video of this equally wonderful AI/4IR channel
PS: This is a damn impressive announcement/set of demos
I am extremely excited to see what a beast GPT-5 will be compared to everything else
I thought he said “flirtatious sci” 😂
My brain parsed it as akin to, “instead of AI, this is more like ‘Flirtatious’ I”.
I got more:
>>watch all the clips and focus on GPT-4o.
They are using what seems like a mod of Sky; the emotive inflections and the quirks are so powerful, and it’s the same voice. Go use Sky now and compare.
Watch Her.
Now watch all the clips.
It “feels” like they essentially trained the model on Sam. The Sky voice has always sounded like a version of Samantha to me, but now… it’s like the last instruction of the system prompt was:
“you will perma-larp as Samantha from the movie Her.”
the fact that this is free is hard for me to contemplate.
it may not make sense to you, you may live in a warm family and have a great life. but there are millions of people who sit in quiet rooms, who fill the hours with distraction to mask the loneliness
you know what’s back?
Manic Pixie Dream Girl
Wake up, Her dropped
Profound…
Looooool
Not that far-fetched now compared to when the movie came out, is it?
Wake up, my girlfriend dropped lmao
Don’t know this but I’m curious. Her?
Integration is the next GPT-moment. Being able to talk to AI at any point in time and show it your screen, and for it to be able to respond and click/press buttons. This will be transformative by itself.
Too bad it is only available on phones and Mac. I have a subscription, and access to the model, but no voice option through the Win desktop web interface. I do all of my computing on desktop, so totally useless for me.
@@Steve-xh3by The new voice and video is going to be available in coming weeks. Today is only the GPT-4o itself. I bet Win version will follow. Not sure about Linux version.
@@nekony3563 I read they already confirmed no Windows version of the desktop app, and voice only on mobile/Mac?
@@nekony3563 It is available on my Android version already, and I read it wasn’t going to be available through Windows.
@@Steve-xh3by Not surprised; Windows users are not gonna be their main target when all the businesses and the people willing to spend on AI use Apple, unless it’s about PC gaming or very high-end work.
The way it joined in laughing at its own mistakes at 8:25 is absolutely stunning
That’s what stood out to me the most, too. It’s easy enough to treat the ChatGPT text interface like a sophisticated yet lifeless machine. But when you can interact with it over voice like this and it picks up on social cues, displays emotion, etc, it gets pretty hard not to anthropomorphize it.
i would use the word ‘worrying’… like the tech is amazing, and the way it can incorporate pause fillers like ‘umm’, laughter and other phatic pleasantries is a testament to the data they’re using and the fine-tuning… but holy moly is this gonna cause soooo many parasocial relationships
we thought character ai was bad, now that with emotion is gonna royally screw up some people
@@jay_sensz So hard! I’m making a promise to myself at this point to not use these advanced voice chatbots because I KNOW I would become fond of them.
@@WoolyCow yeaaahhh, as soon as I heard the voice model I knew that someone was gonna fall in love with it eventually
I also loved the super fast “123456789,10” lol that killed me.
The part at about 12:00 is amazing but when he turns on the camera… Wow. We’re close to AGI in terms of actual believability. It’s so organic and flows so humanly.
nah bro you don’t understand it will only become AGI when it speaks and acts absolutely authoritatively and is completely infallible and can answer any question and do independent high level physics research that completely changes the entire technological landscape in a matter of days after being introduced and can calmly morph unknowable questions between its fingers like putty and can tell you if God is real or how to get a gf
This part reminds me of the movie “Her” with Scarlett Johansson
We are not. Sorry.
@@hydrohasspoken6227 In terms of *believability*, you think we’re not?
@@Gerlaffy , not by a long shot. But the new features are definitely cool. But just that.
never been this floored by ai… I don’t know how some people are not impressed by this. you have an ai that talks EXACTLY like a real human, emotions and all, and can see so accurately via camera… I’m speechless here.
To me it feels like it’s trying to copy her too much, it feels inauthentic to me since it’s a copy
Well, given that the voice option is only on phones and Mac, many of us can’t even make use of it. I do all my computing on a Windows desktop. I hardly ever use a phone for anything. I’m a retired software engineer; when you get older, phones are awful due to size/old eyesight. Plus, do young people actually use phones for productivity?
Not impressed, because for the most part it just regurgitates things that it learned from humans. If it could come up with stuff on its own, that would be impressive, but that’s just a limitation of how LLMs work.
Don’t get me wrong, it’s neat stuff and has its uses, but I don’t think it really rises to the level of hype that it gets.
As far as programming goes, it still can’t come up with correct and working solutions to some of my test questions. Why? Because it probably was never trained on the code that would have had to be written for it to be able to regurgitate it. That code and the working solution, while not complex or complicated by any means (at least for a 3D graphics programmer), is just very scarce in terms of documentation. It’s something I and a few other programmers worked on in the early days of 3D engines, back when BSP-type engines like Quake were mainstream. I think id Software’s implementation was a bit different from the approach we used, so it wouldn’t have been in the Quake source that was released.
For simple programs, like “hey, sort a list of temperatures and print out the top 12 results” and the like, yeah, it can handle stuff like that. It’s probably seen umpteen million different versions of that code in its training set.
The movie Her did not invent flirtatious women.
it’s a good time to be speechless, huh?
You may have predicted Her-like AI a month ago but Her predicted it a decade ago!
Haha so true
Also I think a ton of people predicted it as soon as the voice feature was originally released.
I really do wonder what kind of voice conversations they have trained on. It’s so expressive in the «feel». No voice in API access yet, so I really wonder how, and whether, you could turn the knob down a bit, or if the voice engagement reflects the user’s input. I’m not disappointed at all that there was no next-level pure-LLM improvement this time. Voice in / voice out is going to change how we interact. I just see how hard my 8-year-old kid is trying to get Siri to understand him, and how much more he expects and doesn’t get. If I understand this correctly, this isn’t TTS and speech-to-text. And that is huge!
He did predict it arriving specifically in 2024 if I recall correctly
The audio cutting in and out during the demo was most likely a feature where you can interrupt the AI in the middle of its speech. So while it is talking and it hears you speak it immediately stops talking, which is what we saw during the demo. Just a guess.
Well, duh! The problem isn’t just that it cuts in and out. It’s how sudden, unnatural (non-human-like), and poorly timed these interruptions are. Issues like these keep you on your toes—instead of conversing as freely as you would with a person, you find yourself constantly adjusting your speech. For instance, you try to avoid lengthy pauses. I’m eager to test it soon, but I’m really hoping for further improvements.
@@ukaszgandecki9106 These are still some amazing strides in humanlike AI interactions. We went from a spooky-good text generator to an AI that you can have full vocal conversations with in 1.5 years. Yeah, it’s going to need to learn what sounds appropriately qualify as “interruptions,” but I expect to see huge strides on that front in the upcoming year.
@@ukaszgandecki9106 This is the worst this will ever be.
Except… it can also see. So it will just wait for you to actually finish now. If you’d actually used this the entire way you’d know this is pure magic compared to what it was and still is publicly.
@@ukaszgandecki9106 are you saying as a human you don’t constantly interrupt and get interrupted by others? That’s just human speech, unless you’re speaking in a very formal manner
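That guess is roughly how barge-in is usually built. A minimal Python sketch of the control flow, with all audio I/O stubbed out (nothing here reflects OpenAI’s actual implementation):

import time

SPEECH_THRESHOLD = 0.5  # made-up energy level above which we treat the user as talking

def mic_energy() -> float:
    # Stub: a real app would read the current microphone input level here.
    return 0.0

def play_tts_chunk(chunk: str) -> None:
    # Stub: a real app would stream one short audio chunk to the speaker.
    time.sleep(0.1)

def speak_with_barge_in(chunks: list[str]) -> None:
    # Play the reply chunk by chunk, bailing out the instant the user talks over it.
    for chunk in chunks:
        if mic_energy() > SPEECH_THRESHOLD:
            print("(user spoke: stop playback, go back to listening)")
            return
        play_tts_chunk(chunk)

speak_with_barge_in(["Sure, ", "the answer ", "is 42."])

The hard part, as the replies above suggest, is deciding which sounds count as a genuine interruption rather than a cough or an “mm-hm”.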
Ilya was booked to be there but at the last moment they discovered that the chain attached to his leg in the OpenAI dungeons wouldn’t stretch to the conference room!
Jokes aside, I’m concerned; he’s been gone for quite a while now.
@@TheRealUsername Well, he got roasted to hell and back. He probably just wants to stay out of the limelight.
@@TheRealUsername I think he was told “Head down and nose out!” and he’s doing just that. Someone clearly has something on Ilya, but he always seemed to me to be rather introverted anyway. It was often painful watching him being interviewed because he looked like a rabbit caught in the headlights.
and then he broke loose!
ilya used demo day as a diversion to escape. jan leike, who was tasked with guarding the basement, had to resign for his failure.
About the intruder part (bunny ears): that wasn’t him telling GPT “hey, was there someone?”. Sure, he has to instruct GPT to say who was in the background, but the capability being showcased was video memory.
A minute had passed and GPT still remembered there was a person there. That’s the showcase.
if they don’t want to maximize engagement, one thing they missed out on is the ability to stop the conversation just by ending it verbally, like a ‘thank you for now’, without having to press the button; that would also add a nice touch UX-wise
i think you can; it’s just faster and easier to click a button…
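The suggestion is simple enough to sketch. A trivial, hypothetical closing-phrase check on the transcript might look like the Python below; a real version would need to handle follow-ons like “thank you for now, but one more thing”:

CLOSING_PHRASES = ("thank you for now", "that's all", "goodbye")

def should_end_session(utterance: str) -> bool:
    # End the voice session when the user's last utterance is a sign-off.
    return utterance.strip().rstrip(".!?").lower() in CLOSING_PHRASES

print(should_end_session("Thank you for now!"))  # True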
The latency combined with the emotional understanding are for me the game-changers here. I’ve been using GPT voice mode for a while for language practice and the delay has just never felt even close to a natural conversation, but this looks to possibly completely eliminate that issue in a single leap.
I honestly didn’t think we would have natural conversational capabilities until we could run very good models locally on device for the essentially zero latency I thought was needed. But if this demo can be replicated anywhere with decent service then it’ll be extremely interesting to see if it manages to completely leap across the uncanny valley or if this is gonna feel very eerie and dystopian.
The laughing, stuttering and excitement just sounded so damn good in the demo. We might be getting damn close to HER territory, and I think anthropomorphizing is gonna go off the charts with this. I mean, one of the top comments on one of the demo videos was already along the lines of:
“There is NO way this thing is not sentient!”
Next few months are gonna be so damn interesting!
When she says “Sorry guys, I got carried away there and started talking in French.” at 8:25.
Just… just listen to how personable she sounds. GPT 4o is really something else. It’s not just the clear voice. It’s the laugh-talking. It’s the breath. It’s the accent that kind of slips out in “away there”, and the choice to use more casual and conversational language like saying “talking in French”, instead of “speaking French”. The embarrassed tone. And then the attempt afterwards to drum up excitement to try again. It’s so personable. I think that’s the right word. It feels human, which is great, and terrifying all the same.
Ikr
Actually that French line was so natural, it felt like a real person talking. I mean, it wasn’t talking like it was just reading something, but really how a native would talk in a casual conversation. That’s crazy.
As an example, you would write “je ne sais pas” (“I don’t know”), but a native would say “j’sais pas” or “ché pas”.
There’s nothing terrifying about it; that’s language we should avoid with AI.
Also the very humanlike post-hoc rationalization 🥰
It possibly is training leakage. As a French speaker, that could very much be coming from French radio / podcasts.
The stuttering at 12:51 is so human-like
“I, I mean you, you’ll definitely stand out”
Amazing.
that’s what i hate about it. they’re giving ai our human flaws.
I’m convinced that was a live voice actor used for dramatic effect and not the actual AI. There’s no way that was TTS
@@EchoMountain47 how do we know they used TTS and not something new?
@@akmonra TTS means text to speech. It’s not like one specific technology but a type of technology. Computer generated speech is always TTS on some level
@@EchoMountain47 no, you can embed voice in latent space the same way you can embed text. you could have a model with pure voice inputs/outputs.
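Both replies can be right, depending on the architecture. A stub-level Python sketch of the two designs being argued about; every function here is an illustrative placeholder, not a real API:

def speech_to_text(audio: bytes) -> str:
    return "hello"  # stub ASR

def llm(text: str) -> str:
    return f"you said: {text}"  # stub text-only model

def text_to_speech(text: str) -> bytes:
    return text.encode()  # stub TTS

def cascaded(audio_in: bytes) -> bytes:
    # Classic pipeline: audio -> text -> text -> audio.
    # Tone, laughter and timing are lost at the text bottleneck,
    # which is the sense in which it is "TTS on some level".
    return text_to_speech(llm(speech_to_text(audio_in)))

def audio_tokenizer(audio: bytes) -> list[int]:
    return list(audio)  # stub: audio mapped to discrete tokens

def multimodal_model(tokens: list[int]) -> list[int]:
    return tokens  # stub: one model, audio tokens in and out

def audio_detokenizer(tokens: list[int]) -> bytes:
    return bytes(tokens)

def end_to_end(audio_in: bytes) -> bytes:
    # What OpenAI says GPT-4o does: audio lives in the same token space
    # as text, so the model can hear and produce tone, pauses and
    # laughter directly, with no text bottleneck.
    return audio_detokenizer(multimodal_model(audio_tokenizer(audio_in)))

print(cascaded(b"hi"), end_to_end(b"hi"))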
I love how much humility they put into their demos. They aren’t just showing perfect-case scenarios where the AI isn’t making any mistakes. What they are showing is progress.
Yeah that was notable, and commendable.
It’s more due to the model not being better. I wouldn’t call that humility, more so an over-promise and not being able to deliver.
I found the demo @11:53 the most impressive. It picked up on the not entirely kempt “developer” look of the person, made a comment about his hair being messed up and then understood he was joking with the hat. It’s one thing to recognize people, but to be able to pick up on the nuances of how people are expected to present themselves in certain situations is really impressive.
I do hope we get to tone down the ‘perkiness’ of the model a bit. It’s quite charming in one-minute bits, but I think the overly positive attitude gets old fast if you’re communicating with it a lot over the course of the day.
I get Scarlett Johansson vibes in this demo
You could always just ask it to chill out a bit and it will adhere
@@gmmgmmg Totally. I honestly think they did it on purpose to evoke comparisons with ‘Her’, and they’ve completely succeeded.
@@Gerlaffy is right. You can change its personality in real time. Been doing this since 3.5
Apple acknowledging another company exists is still the craziest news here.
They know OpenAI is the future of technology and they’re jumping on it sooner rather than later