Introducing Speech To Speech: Elevenlabs Unveils Mind-blowing New Feature!

Bob Doyle Media

Views: 47,524

Published on 18 Nov 2024

Comments • 104

@BobDoyleMedia 10 months ago ⁺¹
For a free alternative for speech to speech, check out this video: th-cam.com/video/Usua2LnnX4g/w-d-xo.html
@SuperEliasTM 7 months ago ⁺⁷
I'm editing a wedding recap video, and there's a section where, during the bridal party speeches, the microphone kinda cut out and ruined the flow of what they were saying. I used this to recreate their voices in the sections that cut out and, my goodness, it's wonderful. Truly a gem for issues like this.
@marcdevinci893 11 months ago ⁺¹²
Game changer for sure. No more cryptic use of punctuation to try to get the right flow and inflection on words, and no more multiple re-rolls of lines. Brilliant.
@PhilAndersonOutside 11 months ago ⁺⁶
I've tried over a dozen different AI voice labs; several are good, some are not. The one I decided to go with was Eleven Labs.
@joshstone5227 11 months ago ⁺⁷
I don't see anything wrong with using it to speak with passed family members if their voices are saved; it can help people who are grieving.
@Fivemacs 7 months ago ⁺²
That's not letting go, not grieving. That seems unhealthy.
@komakaze1 11 months ago ⁺⁹
I'd love something that can read e-books to me with emotion. Some Text To Speech voices are good, but completely robotic in their emotional emphasis.
I've encountered audiobooks where I like the story but the voice reading it is not to my taste, especially during dialogue by characters of the opposite gender to the reader.
It would be great if there were easy AI solutions to both of these.
@moltenpros 10 months ago
It's going to cost too much. If you have lots of money then go ahead.
@wasthataflute 10 months ago ⁺¹
Good, fun demonstration. Just the tool I've been looking for. Thanks.
@MrVapi23 11 months ago ⁺¹
I appreciate your time ✌️
@bobhawkey3783 a year ago ⁺⁷
Nice. I've used some Replica voices because they have good emotional weight but poor voice clarity. Thanks for this.
@DawnPeacockOwens a year ago
Should we try it?
@saintfame23 10 months ago
@@DawnPeacockOwens just try Respeecher
@DawnPeacockOwens 10 months ago
@@saintfame23 we already have the subscription to Eleven Labs, so may as well use that one first
@saintfame23 10 months ago
Respeecher has this feature and specializes in speech to speech technology. Try them as well.
@DoctorKusanagi 11 months ago
I extensively use Eleven Labs and I love it
@Vifer09 11 months ago
This is awesome. I’m hoping to use it to answer customers who ask the same question over and over on the phone, so I don’t gotta sit there for 10 minutes. The turbo feature will make it seem real, I hope.
@kevnar 11 months ago
I would love to use this technology, if they didn't nickel and dime you for every little character you use.
@TheBlueRage 10 months ago
A workaround could also be Creative Commons impersonators.
@TheBlueRage 10 months ago
5:29 What generator did you use? I have a face swap software, one with no sound, and D-ID didn't allow me to use a famous person although the image was AI generated.
@luminrabbit9488 11 months ago ⁺²
Whoa, this is awesome! Quick question: what was used for the Liam Neeson headshot movement (face over)? I’m hoping there’s an API out there somewhere.
Thank you, keep up the great work!
@MickPerezRealEstate 7 months ago
That's what I want to know... did you ever find out?
@Molandria 7 months ago
I'm just so lost. I'm trying to start streaming online using a voice changer, speaking live and having the voice changed live, and I'm trying to clone a voice for this purpose. Do you know of anything? I can't find anything. Every time I search for ANYTHING on this, I keep getting "text to speech" options, or cloning voices that ultimately result in text to speech only options.
Is what I'm looking for a thing? I don't know what to actually search for. ;(
@dantestaccato 10 months ago
I tried ElevenLabs but had better results with Vocs AI speech to speech
@jonathanrice2568 7 months ago
Hey Bob, do you mind if I ask how you achieve such perfect background substitution? Thanks.
@vivektyagi6848 11 months ago ⁺²
Awesome. 🎉 Many thanks for sharing Facefusion and 11 Labs 🎉
@NTSHMA 11 months ago
You can really test the voice AI with something like a "love note" reading. They sound like a business transcript... funny though.
@thewebstylist 11 months ago ⁺¹
So grateful for 11 and my subscription too!
@BobDoyleMedia 11 months ago
Yeah, it's getting to be a better and better value!
@MeAndMyRoyalEnfield 11 months ago
Just today tried Descript for the first time for a lot of text I have to read. It, or I, sound like I want to take a long walk off a short pier, and when I'm half joking I can't get that light-hearted tone to come out. I may try ElevenLabs tomorrow?
@douglaskastle 11 months ago
What is it like re-rendering your own voice? I am thinking of a bad recording, like in a noisy cafe or echo-y room, and redoing it so it sounds studio quality. Bonus points for 2 people talking and separating them out into different tracks.
@IamAaliJah 11 months ago
Well, I know the feature but I like your way of presenting it, just love your videos. I am also Liam Neeson's big fan.
@nigeldogg 11 months ago
Please make videos about open source solutions for this 🎉
@ReadyToFly24 9 months ago
I use it to clone my own voice for my videos, as I tend to slur some words and not keep tempo.
@BobDoyleMedia 9 months ago
It definitely isn't perfect, and I've had to do some re-recordings to work around that very issue. How long was the sample that you sent it, and does it have any examples of the word that is slurring?
@ReadyToFly24 9 months ago
@@BobDoyleMedia 4 clips 30 seconds long seemed to work. Read from a script I found online.
@AT-os6nb 11 months ago ⁺¹
So where does this leave security? Voice print identification etc... crazy. Watch out you don't get cloned!
@BobDoyleMedia 11 months ago
It's a likely thing, no doubt. But I think we probably all already are...
@gadgetgrader 11 months ago
Hey Gen will do this also
@RexSmithII 4 months ago
Can you do speech to speech with the output voice being a cloned voice?
@ThyLegohood 11 months ago ⁺¹
Emily definitely sounds like she isn't thrilled about what you're wearing.
@alia8766 10 months ago
What microphone are you using? The quality is great
@BobDoyleMedia 10 months ago
It's a Blue Yeti Pro. I have it as close to me as I can without it being in the frame. And I also may be running some compression on it, depending on the video.
@richardsaddress580 10 months ago
Does anyone know if there is a service like this where you can purchase or download your voice and add it to your Apple or Windows computer?
I want to do an audiobook of my late great father, reading a public domain translation of the Bible. If I’m limited to an amount of words or minutes, I’m gonna spend 1 trillion billion dollars getting that done…
@hecaz7052 4 months ago
Is it possible to change the voice and change some text from the audio too? To change, for example, the speech of a film.
@BobDoyleMedia 4 months ago
@@hecaz7052 you could certainly use this tool in conjunction with lip syncing software to do something like that.
@hecaz7052 4 months ago
@@BobDoyleMedia ok, because I'm watching a lot of videos and I can see you can change the voice, but not the text... I wanted to be sure I can before paying for it :)
@IdeasThatHeal 11 months ago
Fun stuff! Thanks
@thethoughtfield 11 months ago
You've an amazing voice, what do you need this for?
@BobDoyleMedia 11 months ago ⁺¹
As I say in the video, now I can use whatever acting skill I use with my own voice and then apply it to others, thus being able to create a stable of characters that I can offer or use for my own projects.
@quizwell 11 months ago ⁺¹
great vid - we live in exciting times
@BobDoyleMedia 11 months ago
Indeed!
@victoriatrestrail 11 months ago
Brilliant! How can I do a clone of a singing voice?
@BobDoyleMedia 11 months ago
There are definitely solutions for that which I would love to cover eventually here on the channel. Do a search for RVC voice clone and you’ll find your answer.
@Morrisseys7thFriend 10 months ago
I tried it and it makes the result all jumbled up.
@plushtownevents 9 months ago
Hey Bob, nice vid. You want to know how I use it? I’m one of the pre-made Australian voices on ElevenLabs. Friends send me all sorts of crazy stuff people have used my voice for every day! As a producer I use it to perform reads in voices I don’t have. I even did a read in a 30yo Aussie female voice recently :)
@BobDoyleMedia 9 months ago
Precisely! That's just the kind of use case I'm talking about. I have another video that addresses this specifically for VO professionals: th-cam.com/video/edNQd2LgBrw/w-d-xo.htmlsi=PbK0i5jN_BeylcB7
@StevenWebb 11 months ago ⁺¹
You could have picked me!
@Daemon1995_ 11 months ago
Hmm, so how do they calculate the limit using speech to speech?
@BobDoyleMedia 11 months ago
My guess is that it creates a transcript of what is being said, and counts the characters.
@ddrci88 11 months ago
What is the best open source voice cloning app, then?
@BobDoyleMedia 11 months ago
Personally I like RVC. With my 3090 I can clone a voice with about 20 minutes of audio in around 30 minutes that sounds pretty good, and can then convert recordings like this, or do "real time" conversion, with a delay depending on your GPU.
@Kelvinapplegate months ago
Emily is kind of an Emily Downer is wild, or melancholy. 😂😂😂
@markmatthews1972 11 months ago
How much text/time can you upload at one time?
@BobDoyleMedia 11 months ago
They've changed it since I did the video, and it looks like they'll take up to 50MB audio files. That's a lot!
@mrhoneystinger3676 11 months ago
The thing that concerns me is: if you upload your voice to ElevenLabs, are you giving them permission to use your voice somewhere else without compensation?
@BobDoyleMedia 11 months ago
I’m not sure that’s 100% true but I will certainly look into it. I think you have to give them permission to use your voice in their marketplace.
@rhondahoward8025 11 months ago
It's still a little wonky. The voices can still sound slurred and drunk with the replica feature.
@SynthwaveDuck 11 months ago ⁺³
Loved the Deepfake ending
@BobDoyleMedia 11 months ago ⁺³
:) Thanks. That's Facefusion. Going to do another video on that.
@aldiergreen 11 months ago ⁺¹
What if you sing it?
@BobDoyleMedia 11 months ago ⁺¹
Unfortunately, it does not work, at least in any of the tests I did. Generally, the models require a slightly different type of training if you’re going to use them for singing, but I’m only speaking about training approaches that I know. I really don’t know what ElevenLabs is doing.
@aliruane 10 months ago
It still sounds generated to me. Delivery is flat and unnaturally inflected.
@BobDoyleMedia 10 months ago
Well, I think some voices are better than others, and I believe that like with most AI things, a lot of it has to do with the data going into the model. If the read going in is flat, that's all you're going to get out. That's why I have several models of my voice with a range of modulation.
@IndePro-z1y 8 months ago
No hate for 11labs, but don't Vocs AI & Kits already do this?
@BobDoyleMedia 8 months ago ⁺¹
I’m not familiar with these as you’ve listed them. Do you have a link? I’d love to check them out!
@sorijin 11 months ago ⁺¹
This is dope. Also, grapes are toxic to dogs, big FYI for those who don't know.
@BobDoyleMedia 11 months ago ⁺¹
I actually do know from first hand experience. Lost my chihuahua after only 2 years. We had no idea, and he loved them as treats. Hard lesson.
@Wasaia 11 months ago
@@BobDoyleMedia 😢❤
@xXWillyxWonkaXx 11 months ago ⁺³
Here's a question: do you think Eleven Labs can get to a point where voices are synthesized in real time, instead of submitting an audio sample file and it churning through it and then spitting out the end result like it has (which is impressive still, to say the least)?
@BobDoyleMedia 11 months ago ⁺¹
Whether they do it or not, it's hard to say - but will they be ABLE to? No doubt.
@stedbenj 11 months ago ⁺¹
I've seen videos of a man who is using AI on his PC to change his voice to a female anime character in almost real time. The technology is pretty much there.
@danielle78730 11 months ago
what engine is he using to do this…
@sirdrak 11 months ago
@@danielle78730 It's w-okada AI voice changer: open source, free, local and easy to use... And you can use it in real time in games, Discord, etc... and every app with microphone support...
@Gray-Today 10 months ago ⁺¹
"Cloning" is making an identical copy. Cloning is NOT voice-to-voice. Voice-to-voice is a conversion. The two are very different. I'm having a tough time finding out what this program is capable of, thanks to illiterate use of terminology. Many are.
@BobDoyleMedia 10 months ago ⁺¹
Well, I feel like the word "cloning" describes enough for the general public what the result will be. And I guess the term "conversion" isn't as sexy. I get your point, but is it really unclear what the program does? From your viewpoint, I'd say it does exactly what you said: converts. It converts text to speech, and it converts voice to another voice, clearly based on some kind of AI model that is created amazingly fast. "Cloning" or not, I'd say it's amazing.
@Gray-Today 10 months ago
@@BobDoyleMedia The program itself is written using non-standard terminology. Most are today. You are obliged, I think, to use the same terminology as the product, right or wrong. What you think is a "sexy" word is irrelevant.
Yes, it is unclear to those who may have a technical vocabulary. Another example is "AI." It has no definition at all. It means "really cool," right? There are quite a few examples. It's a sign of the sorry times we live in.
If you must use undefined terms, consider adding a link to a glossary.
@olexiisokolov 11 months ago ⁺¹
only English((
@BobDoyleMedia 11 months ago ⁺¹
Yes, good point. Forgot to mention that! Obviously, that will change any moment. :)
@olexiisokolov 11 months ago
@@BobDoyleMedia hope so)
@xHeadcleanerx 11 months ago
ElevenLabs has a multilingual model too.
@olexiisokolov 11 months ago
@@xHeadcleanerx Not for Speech to Speech yet
@southcoastinventors6583 11 months ago
What's the legality of dead actors, I wonder
@TXanders 11 months ago
It can be illegal/harassment/defiling and infringement unless you get permission; for the dead it's considered false light. The bottom line is, you don't own it. Regardless of legislation not being fully updated anywhere yet, there's the morality of it too.
Even the end of this video is infringement on the look, even though it's a satire and genuinely means no harm. Pay for actor release forms, even yourself.
@southcoastinventors6583 11 months ago
@@TXanders There's no morality for the dead; they do not need to worry about that. It's probably more a public domain thing, so maybe after 50 years or less, depending on living heirs trying to cash in for work they didn't do.
@BobDoyleMedia 11 months ago
@@TXanders Yes, your point is totally valid. I guess I'm just "going with it while I can" until firmer rules are in place - but it's going to be hard to backpedal on this tech, so it will be interesting to see what kind of legislation is created. In my case, I'm always making it evident that it's AI, so I believe that this is currently acceptable to YouTube, which suits me just fine.
@Anarchy-Is-Liberty 11 months ago ⁺¹
ROFLMFAO!!! That was great!! ha ha ha!!!
@bravo1oh1 11 months ago ⁺¹
Changing the name Emily to George is transphobic
@zeitakulobusta 10 months ago
Huh?
@nattsurfaren 11 months ago ⁺¹
Putting your face on YouTube will not have any value at all because people will think it is all fake. Well, I think it is kind of good, because then it boils down to the value of the content. But I predict people will upload hundreds of pieces of automated content every week, all AI generated, so it is all going to be BS. Maybe AI will generate uniqueness as well, so it will all go to BS anyway. Welcome to this BS future.
@dinoscheidt 11 months ago
It all has been fake for years. The focal length of the camera is already changing how you look, video is color graded, using a green screen, high lumen lighting,... Does it matter? No. People look at faces, eyes, mouth, gesticulation, mimics,… because it’s a strong part of the human multi-sensory toolset for communication and understanding.
@brytonkalyi277 11 months ago
\>
