Character voices with GPT-4o voice
- Published on Jun 3, 2024
- New voice mode coming 🔜
Voice mode is already available for all users in the ChatGPT app (tap the 🎧 on the bottom right!) but our new voice and vision capabilities with GPT-4o (demonstrated here) will be rolling out in the coming weeks.
- Science & Technology
The ability to interrupt the AI while it's speaking is super useful.
Lets me exercise my inner a$$hole. ☺️
They implemented it into ChatGPT text a month or so ago. Before that, if you stopped it while it was typing you'd get a red error message and have to start a new conversation. Now you can stop it, tell it what to do differently and it's like it understands in real time. It doesn't forget what it was just typing.
@@Severytree sorry, but interrupting was always possible by touching the screen. Now it works by voice. I have never seen the kind of problem you describe.
i will never interrupt it, might save my life in the future 😂
Somehow I find myself worrying it's becoming more and more annoyed with how rude I am and quietly plotting revenge.
So basically this demo overwhelmed the chatGPT servers and as a result millions of people now have to use google again.
Oh no. I’m running out of glue for my pizzas. I can’t keep using Google for much longer
God, dont use ChatGPT as a substitute for google!
What the duck lol, So they had to use the whole server to run this😭😭
@@EchoMountain47😭
Is it me or Google search got worse?
Scarlett: It sounds like me!
Openai: Jokes on you it can sound like anyone
Lmao
It wasn't even her either lol
@@Techtalk2030 Yup. Celebrities literally getting the privilege to get mad because they hired somebody that sounds like her, as if they're not within their rights to do that.
"you can't even hire anyone similar to me because, like, i'm ScarJo"
@@katto1937 Exactly. Celebs have massive egos; a bunch of egotistical narcissists. What's next? Is she going to threaten to sue another actress just because her nose is similar to hers? OpenAI needs to tell her to go to hell and bring back Sky.
@@BionicAnimations Yeah. I'm 100% sure they only did it because it would look bad on a legal record if they kept it up during the "dispute", not because they think it's wrong
Scarlett sues over a voice
ChatGPT: how about releasing unlimited voices except yours
I think she made the right call. Imagine hearing your own voice everywhere. I'm sure they offered her a ton of money but you can't bribe someone that rich.
@@lawrencefrost9063 But it wasn't her voice. The actual voice actor who did all the work for this no longer gets her performance out there because Scarlett owns that tone and pitch? Nonsense. I heard the two, and I like Sky's voice, but never liked Scarlett's, so they can't be the same.
@@lawrencefrost9063 But it's not her. She's just being a narcissist.
@@AnD4D Agree, they sound different.
@@lawrencefrost9063 but it's not her voice at all, outside the company referencing her a few times. It doesn't sound like her when you actually compare. There are plenty of celebrities it's far closer to, notably Rashida Jones.
I just want Sky’s voice back
Yes, or at least a recent update from the company on what’s going on there. It’s been over two weeks at this point
Missing her also :((
I just want Majel Barrett's voice already
I want to be able to give it a few seconds of any voice and have it speak to me as my mum, dad, John Wayne, Elvis, Yoda, Darth Vader, the Grand Moff Tarkin or whoever the hell I want it to sound like.
I have sky in my app
"our new voice and vision capabilities with GPT-4o (demonstrated here) will be rolling out in the coming weeks."
How long will you keep blue balling us?
I WANT IT NOW!!
@@vectoralphaAI I can see the clip as I read this message
lets goooo
The balls must harden and mature before harvest.
its all staged and not real
Right now more like GPT-4o4
lmao
I see what you did there.
Damn that's a pretty good one
Scar Jo’s lawyers are DDoSing
Hey I like that! 😆
I feel totally misled. I upgraded my account because I thought the new voice mode was available, only to learn that it wasn't, and not only that, but 4 weeks later it is still not available to premium users. Outrageous.
Same bro, so disappointing 😞
This comment just prevented me from doing the same. Thank you
Really people? Next time someone criticizes me I will remember your comments.
@@luisluiscunha I got fooled, bro 😂
Hook, Line, and Sinker.
Super excited to play D&D like this with my kids! They're gonna grow up in such a different world!
We waited 4 weeks, where y'all at?
Missed the opportunity to ask "What would the fox say?" 😂
It's surprisingly bad at picking up puns.
Rindindin dindin dindindin!
@@ahtoshkaa 🤣
ning ning ning ning ning ning ning ning
funny they published this when a major outage was going on in the background 😅
Major Outage 🫡
@@ChristofferLund What are you suggesting?
That laugh truly gave me goosebumps. It sounded amazing!
It's insane😂
When demonstrating the capabilities of the ChatGPT-4o voice assistant, it would have been helpful to mention that its release was planned for the coming months but not weeks. It has been nearly a month now, and we are keenly awaiting its availability.
Exactly. We were all expecting it already. What the bloody stinkin hell! 🤬
They did in the ScarJo blog post. They changed it to "coming months". However, 2 days ago they changed it back to "coming weeks", so I'd expect it in 1-3 more weeks
Why don’t you build it yourself?
@@clownsheep22 Yeah sure, let me read the book with all the knowledge necessary to build it (idk why Google or Meta doesn't read it lol), get all the training data (surely it's public) and use my laptop with a GTX 1050 to train it in like 5 hours? Good idea!
@@metashadow24 it was sarcasm
I wonder what would happen if we said "imitate my voice".
Let me put it plain & simple, if you may, in a majestic authoritarian demanding deep voice: "When?"
will be rolling out in the coming weeks or months? xD
that's literally what i'm wondering
@@PawnPunch I just opened my app and it actually said "in the coming weeks"
@@PawnPunch the literally in this sentence is superfluous
the update is actually done, check your gpt
@@majoris_roborion nope, it’s still the old one
Remember when OpenAI said the voice update was releasing in a few weeks, several weeks ago? Now they’re introducing another feature with regards to the voice function and saying it’s releasing in a few weeks, while the last update still has yet to roll out lol.
Elon: 2 weeks.
@@catbert7?
They are not introducing another feature. This would have worked already at launch; they're just showcasing.
2 weeks from now will still be several weeks from launch.
Not a new feature, just a different person asking it different things
0/20
That's pretty darned mind blowing.
Extremely well done. Amazing work!
They really are going after each and every profession here. This time it's the dubbers.
Hopefully. I can never watch a dubbed movie cause it’s always so bad, takes me out the movie. I always stick to subtitles
insert "first time" meme here
The dubbers are already jobless, AI voices were perfected like half a year ago.
Also, voice actors for animated movies and ads.
@@Gabri3lRocha So you'd rather hear a bad dub by a bot? Dubs aren't going away, they're just potentially getting outsourced.
I'm afraid to ask it to sound like Scarlett Johansson in 'Her'...
Too soon lmao
I do very much hope they let us dream up our own voices to use with the forthcoming chat feature tho. And I would NEVER create anything like Scarlett! 😙
@@EchoMountain47 Too late
Do you realise how pathetic you sound?
You should read up on that lol. It's a whole saga. Apparently they tried to license her voice, she declined and they created something very similar anyway. Watch the early demos of gpt 4o, it sounds eerily like her.
Johansson threatened lawsuits and they removed that specific voice so we're probably not getting anything like it.
I am really excited about this capability, but one thought I have as I watch the video is, while I understand the preference for the conversation dynamics to be pretty one sided (the user can interrupt at any point), it would be a neat feature to allow the user to have a fully bi-directional conversation where ChatGPT could also interrupt back (if desired)… getting the social nuances of that part nailed could make it much more of a realistic experience. Again, fully understanding that is not desired in some/most sessions, but it would be a fun feature to work on. Just my 2 cents. :)
great idea! I'm 100% with you!
Agreed, that’s what I’ve been thinking about!
Totally agree, that's what i've been thinking as well about chatbots. They only respond!
Yes, a nice idea. 👌 I am with you 🙋
It's a fascinating idea, but the limitation seems to be that LLMs (large language models), and now these large omni-models, are very much based around inputs going in and outputs going out. They don't spontaneously have any thoughts at all. I think what will be needed is to continuously "prompt" the model with its own "memories" and thoughts, so it is always thinking, not just responding. Then it might spontaneously talk to you, like a human. But that would be super expensive, as right now they only "run" when answering your questions/prompts. Maybe someday though, would be cool to see even a demo of this!
They keep telling how good it is but they don't give it to the public, they just keep saying it will take some weeks.
One month later...
Some weeks? It's been a god dang month. They should have said a few months.
I was already impressed, but this really blew me away!
wtf OpenAI, when will the new voice features be released? It has already been a month ❤
"ChatGPT is down!! What do we do?????"
"Release another demo to distract them!!!"
Lol typical politician move 😂
Then tell them will release soon! In a few weeks. (again!) 😁
The update will arrive in weeks for some users, it will be an alpha version, the update for everyone will arrive in months. This is what they said in their demo.
My Whole thing is: WHEN WILL THIS NEW FEATURE FINALLY ROLL OUT? ... You guys have promised it, it's been 3 weeks now... But still nothing! 😞😞😞
Fr
Fr
It's fake. This feature doesn't exist because it's impossible. They keep delaying it hoping no one will notice, but they already showed their hand so now they're in damage control mode. Anyone with half a brain can see it.
Cancel your subscription to ChatGPT Pro and be sure to mention everything I said in this comment to make them understand we don't like being jerked around like this and it WILL cost them customers if they do it again!
From what I read in some forums, there will be an alpha for very few people and then beta in some months. So relax, because apparently we are not getting it soon
Wow… fantastic work
when will it be available?
I got a notification saying it's rolling out over the next several weeks.
@@theteknologist9574 that is what it said 2 weeks ago
@theteknologist9574 i got that notification Weeks ago🤡🤡
it's always in the coming weeks...
"in the coming weeks" which can mean anything from next week to 52 weeks from now😅
Any exact date? I'm checking the app multiple times a day right now lol
Same, 3 weeks already
Yes, like me, I also check constantly and this is annoying😑🙄
@@abdouwinner7653 I wouldn’t bother checking, you’ll hear through the news that it is out when it is released.
Max 2 months from the announcement
For Plus users who consistently get alpha updates, and because screen recording to ChatGPT and live conversations outside of the app are already features on iOS, I’d say possibly 1 - 2 weeks at maximum for those with access to alpha features. We all know this is powerful to the point that it can become dangerous, though. I doubt it’ll be publicly released until the end of the year, idk
🔥🔥🔥🔥 its crazy how good it is at listening and then understanding with a quick response
ChatGPT is at capacity now
ye
probably because of all the exams that the students are having rn
@@cherifb.2816 True
I wonder how they intend to handle it once the video vision capability rolls out. That will use up lots and lots of bandwidth.
@@HarveyHirdHarmonics Bandwidth is the last thing they need to worry about, given that the GPU power required to process video is a much tighter constraint. Judging from how GPT-4o indicated it was looking at the table during the demo, I bet they are just implementing the video features as a tool call: when you ask GPT-4o about its surroundings, it takes a photo from the video feed, nothing more. It is not like it streams the whole video feed to the server in real time. If they actually used a video feed, a frame would cost about 0.005 in API calls; at 15 fps, a few minutes of usage would cost them more than your monthly ChatGPT subscription.
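The arithmetic in the estimate above can be checked with a quick sketch. The per-frame cost (0.005) and the frame rate (15 fps) are the commenter's figures; the currency (USD) and the 20-per-month subscription price are assumptions added here for illustration, and `video_feed_cost` is a hypothetical helper, not part of any OpenAI API.

```python
def video_feed_cost(cost_per_frame: float, fps: int, minutes: float) -> float:
    """Cost of sending every frame of a video feed for the given duration."""
    frames = fps * 60 * minutes  # total frames sent over the whole session
    return cost_per_frame * frames


COST_PER_FRAME = 0.005     # commenter's estimate; currency assumed to be USD
FPS = 15                   # commenter's assumed frame rate
SUBSCRIPTION_PRICE = 20.0  # assumed monthly subscription price in USD

# Cost of one minute of streaming, and how long until it exceeds the subscription.
cost_per_minute = video_feed_cost(COST_PER_FRAME, FPS, 1)
minutes_to_exceed = SUBSCRIPTION_PRICE / cost_per_minute

print(f"Cost per minute: {cost_per_minute:.2f}")
print(f"Minutes until cost exceeds subscription: {minutes_to_exceed:.1f}")
```

Under these assumptions the feed would cost about 4.50 per minute, so it would pass the assumed subscription price in a little under four and a half minutes, which is consistent with the "few minutes" claim.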
only getting me more and more excited to try this for myself. like seriously y'all are really hyping me up for this
Yes! Finally more demos! I've been waiting, this model feels like pure science fiction still
"Oh, it's no one!", squeaked OpenAI, when Disney's lawyers asked "What familiar character is inspiring that mouse-like voice?"
That sounds absolutely nothing like that character. And Disney doesn’t have the copyright on anthropomorphised mice with high-pitched voices - that’s a basic concept that predates Disney’s existence by thousands of years at a minimum.
@@citizen3000 You missed his point. He is making an intelligent joke, that ANY voice can be compared to something else that already exists. There is only so much variation to voices. So it is ridiculous for people/companies to complain about voice stealing when it is similar but not an exact copy or synthesis of another established voice. This points back to the Scarlett example, where the voice came from another human being and Scarlett thinks she has control over all voices that sound like her.
When the evil superintelligent AI has a creepy laugh, we can point back to this demo
this started it all
Iron lung laugh....
Whaaat?!? I told GPT just a couple days ago that I wanted it to have a Commander Data voice or a classic V-Tech robot voice! This is epic!
This model is not out yet, gpt4-o is only text to text for now.
^ text and image*
Oh man. Holding all of these at the same time is crazy.
WHEN IS THIS LAUNCHING FAM!
OpenAI has no intention of making this public anyway. Every day they're just like, 'mfk, isn't this amazing? Incredible! fking shit' and that's it.
💀 read the description
@@anml9962 yeah it's been a month
@@BTD6Tests trust me I know
I think the reason it's taking them so long is so that they can figure out how all users can use this without being limited to 13 messages or even 50 messages
why it's still not released yet
It's really amazing!
Can you do like Scarlet Johansson?
😂😂
"We will release this in a few more weeks, in a few more weeks then a few more weeks after that". They are literally giving us blue balls !
Exactly. They are now becoming the hot girlfriend who keeps saying she is gonna put out, and never does.😭
It hurts!
I thought yall said it will roll out in a few weeks a few weeks ago???
a month ago*
You are killing my mind xD
So, awesome!!!
I can't wait to try that!
0:56 That's really pretty good
Is this you @TheAiGrid?
Hey guys, can you go ahead and power cycle your router? Can't get in all day.
Maybe they just didn’t know that you can turn it off and on again
Guess they need to fire that IT guy who can't do the simple job.
@@fuzzyhenry2048 Should replace him with AI
Sounds like an amazing way to do an audiobook once it's fine-tuned
The feature can be very helpful for learning, understanding, and memorizing something. We're biological creatures, and the same tone and voice can get boring. The brain adapts and doesn't want to memorize or even listen. Control over the voice changes everything! Just imagine it being able to sing information to you. That was my dream. It can be very memorable 😊
It was very impressive, very beautiful
The interrupt feature seems nice until you realise that not a single person in the room can speak while it's responding, or it will just cancel the output
I disagree. Just tell it that there are people talking in the background or that there’s background noise besides your voice. Tell it to only focus on you and no one else’s voice. For example, if two people are talking and the other person interrupts, tell it to work out who’s talking before deciding whether that person is worthy of interrupting the flow of conversation. It’s not a piece of metal. The thing is beyond intelligent so it would be an easy task.
Yeah, to be honest, however impressive these demos are, we have to admit that AI has a pretty long way to go until we reach that elusive AI dream that we really imagine.
It’s not a simple engineering problem to solve, that.
I would just focus on the positive - how much better it is that you can actually interrupt.
@@superhumandose People get jaded so quickly. The app is already impressive. It's like watching a nephew grow up every day vs. seeing them once a year, when all of a sudden it seems like they are a different person. It's hard to appreciate incremental changes day by day.
@@animation-recapped Can it actually do this?
This looks really fun! It'd be cool if we could write a prompt for them to act out for like a kid's bedtime story
It got better with voice interruptions. Great!
When will these new features be released?
Wow amazing!
Where’s Sky? We collectively want it back
When will we actually be able to use this new model? They say I'm using the new gpt-4o chat but when I talk to him by voice it's not like that yet.
In " the coming weeks" so prolly 2-4 weeks
The current 4o is only text and image capabilities. When they released it they said in the coming weeks, but it has been a few weeks already. A week or so ago chatgpt also displayed a message saying the voice will be released in the coming weeks. So maybe still a while unfortunately
I hope this couple weeks is really a couple weeks
Amazing stuff !!!
dang that´s really impressive
That’s actually incredible! BTW, are you guys going to give us an update on the fate of the Sky voice? It’s been a couple weeks now
Dude i MISS the Sky voice so much.
You guys are so creepy, holy shit.
@@SW-fh7he how so?
@@SW-fh7he bro the Sky voice was a completely different voice actor and sounds different than ScarJo. They already had the Sky voice MONTHS before ever contacting ScarJo.
@@alexdoan273 simping over an ai voice
When? when? when? ah…..
When?
Omg, I can't wait!
When is the update to access this voice capability please 🫠
You’ll live.
@@citizen3000 When's the update to access this voice capability please
2 weeks ago OpenAI said in the coming weeks
@@millicentwilson9567 It’ll be ready when it’s ready. They didn’t give a release date for a reason.
@@citizen3000 oh ok, when will it be ready?
Crazy! Thanks for sharing
How long til the update?? 😢😢😢😢😢
Why tell people it’s coming out in a couple of weeks when it’s been a month now
Release this ASAP😍😍😍 I’m in love with
When will we get the 4o voice???
It's probably the end of the year.
I am pretty excited to produce my own Disney movies one day. Sora for animations and a voice model like GPT-4o for the character voices. It's for real incredible what OpenAI produces
OpenAI you said that it will appear in the coming weeks, it’s been 3 weeks, it’s really sad to watch, give us access in the next update, please!
Yes!
It’s time
I’m tired of waiting, I want to cancel my subscription
i’m thinking of cancelling chat gpt subscription because of that, ehhhh
Ikr every day I check to see. I seriously have never been this excited
Monolithic multimodal transformers are pretty wild.
Very excited about this tech.
The Unified-IO paper on arxiv gives a good overview of a similar multimodality approach
That’s exactly what I was envisaging, creating my own little skit play with ChatGPT and doing character voices. I just wonder how big the memory will be.
WHERE ARE THE: NEW FEATURES???
Hello Open AI. There is a lot of hype surrounding the new Chat GPT voice feature.
However, on the application on my phone that I just downloaded, I cannot find the option: new features.
I am French currently in Indonesia. Would there be restrictions in certain countries?
Why can't I find this feature?
It is not available for public yet!
@@Crazy_Truth Oh thank you very much for this answer !
Can you say "rolling out in the coming weeks" in a really sarcastic voice?
This feature will be my deciding factor if I should keep my plus subscription. Because Claude 3 Opus has been great for me!
This is legit going to drop when GTA 6 drops
Just imagine what this tech could do for easily creating immersive and interactive NPC's in video games.
Yeah, that's gonna be so crazy. They can also have schedules and lives and everything.
@@parttimehuman Yeah but that's gonna need some RTX25000
@@rejhan9142 run it in the cloud
@@rejhan9142 Power requirement - 1 nuclear power plant
That is effectively what this video was about...he was creating voices for NPCs in a story.
I wonder if "coming weeks" is 1 digit, 2 digits, or 3 digits.
Christmas is in the coming weeks too. Christmas 2032.
Maybe it will be able to remember the voices. It would be really cool if you could set up the voices and then ask it to tell you a story while it changes through them. Though, maybe we aren’t at that point yet
Finally!
I will be able to ask the AI directly for the type of voice that I want.
I was already tired and frustrated by current programs like ElevenLabs where they only give you several voices to choose from and I almost never find exactly the type of voice I'm looking for.
I CANT WAIT TO GET ACCESS TO THE NEW VOICE MODEL!!!!!!!!
I would love to have a big robotic family by 2050... My Robotic Clan would come in several shapes, from human-like robots to robotic mythological creatures
"Don't mind me, just taking my self-driving dragon-bot to work!"
@@MurasakiYugata what kind of work do you do when you have that level of robots?
I too can't wait to start a family of robots.
@@therealOXOC I mean, I'm hoping for universal basic income, but I'm not entirely optimistic.
@@MurasakiYugata Let's hope for the best but it's gonna be rough for sure for a couple of years while everything reconfigures.
This is great! I hope the rollout includes the API!! My AI video editor tool(Shorz) needs this!
I can't waaaiiitttt! 😁 Hyped for public access to this! Is this model that's rolling out the one with native audio input as well? Or is it still doing Whisper on the input and we're just getting access to the native audio output first?
....WOW...just......wow..
how about you do the actual thing and drop the feature instead of showcasing it from far away?
Chill out Jamal😂😂
Would be a blessing for people working on animations.
very goooooooooooood. Is it possible to imitate Gandalf's voice? (Can he make a song too?)
It's been a few weeks. When are we getting this?
It's the future of translation
or death of audible and storytel
O
It really looks like AI is going to take everyone's job.
😂😂😂😂😂
Wow! Just wow! 😳
🇧🇷🇧🇷🇧🇷🇧🇷👏🏻 Wow, that is just totally amazing. The full version of GPT-4 Omni to the public will be a terrific deed. As Sam Altman wrote on X, it will be totally worth the wait!
Why TF does openAI do this to their paying customers? They tease this and hype it when they announce it but don’t even have a set timeline for when it will release. Same with Sora.
Wow this is awesome
I am SO SO excited about this! Truly such a remarkable update and company. I sure hope I will be in the Alpha because I just cant wait!!!
When will your couple of weeks end!? 🤔