Smart use of the AI technology in your tutorial to prove how effective it is. You got a new sub!
Thanks, I really appreciate your comment!
While it's not perfect as you said, it's getting closer and closer. I can't wait for F5 to support more languages. Thanks for this really useful tutorial!
Lmao with that last audio 😂😂😂...
Found this video in my feed and subscribed after a few minutes, great content
Thanks I'm glad you like it!
Thank you for the UVR mention. I've been cleaning audio manually and it can be such an involved process.
Yeah it's a really useful piece of software. Glad you find it helpful too
Great video with lots of advice for beginners like me. You got a new sub!
Can't wait to see what you will publish next.
Thanks! I'm glad you like it and thanks for the sub!
Cool and helpful! Thank you and you got a new sub!
Thanks! Glad it's helpful.
Amazing. Now, I just need a great home workstation to put all of these AI tools to good use.
Yeah it's crazy what you can do on your own machine with ai these days.
thank you so much man i needed help with the emotion setting part this is great
Cool I'm glad it was helpful
With this F5 TTS you can program emotions into the text. I just had trouble finding versions of the same voice: sad, angry, happy, etc.
Yeah it took me a while to find a movie with someone speaking in different emotions. But I think if we just pay some attention now, we will come across more sources.
Glad I found your channel. I'm trying to set up a 2-voice podcast, and it sounds like this is the setup (Pinokio and F5). I use Udio, and you can prompt it for FREAKY spoken voices, especially when you do not provide a script: it speaks in its own gibberish language! 😮
Just a quick heads up: The podcast feature was removed from the latest version of E2 F5. So if you plan to keep using it, best not to update E2 F5 until they bring it back.
@RedCloudServices What version of F5 - TTS, with the podcast function, are you using?
Great content mate, to the point. Subbed.
Thanks man! Glad you like it.
E2 TTS requires a lot of GPU VRAM, a minimum of 16 GB according to a Google doc I found. That is so high that only cards like the 4070 Ti 16GB or the RTX 4090 24GB can run it.
4080.. 3090..
If you’re on a budget, a used 3090 might be a good idea
I run E2 just fine and fast on a 3080 10gb. Both Pinokio and a full local git clone and install. XTTS similarly runs just fine, and my results sound better than this video lmfao.
Even doing RVC, it’s pretty good going. Training is a little slower, but eh. RVC doesn't give me the best results despite being Voice to Voice.
lmao with that final audio .. still good to use with some kinda horror movies with a weird cult doing magic spells lol .. Thanks for this priceless info
Haha now that you say it, I can hear it too!
great content. You got a new sub.
Awesome, thank you!
❤❤❤❤ awesome
Thank you!
11:36 That's why it IS FREE. If it were perfect, it would not be free any longer, tsk tsk
Well this will come in handy when I write music lyrics
Is there any method to use these voices with any open-source model? For example, I have the voice of Jarvis or Friday or someone else, and I want the same voice to read the output my Ollama model generates in real time. How can we do that?
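One possible approach to the real-time question above, as a rough sketch only: collect the text the language model streams out, cut it into sentences, and hand each finished sentence to the voice-cloning TTS so audio can start before the full answer is written. The names `sentences_from_stream`, `speak_stream`, and the `synthesize` callback are hypothetical and not part of Ollama or F5-TTS; you would plug in your own functions for reading the model output and for generating speech.

```python
import re
from typing import Callable, Iterable, Iterator

# A sentence ends at ".", "!" or "?" followed by whitespace.
# This is a simple heuristic and will also split after abbreviations like "Mr.".
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences from text that arrives in arbitrary pieces."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # All parts except the last one are finished sentences.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        # The last part may still be incomplete, so keep it for later.
        buffer = parts[-1]
    # Flush whatever is left once the stream ends.
    if buffer.strip():
        yield buffer.strip()


def speak_stream(chunks: Iterable[str], synthesize: Callable[[str], None]) -> int:
    """Send each sentence to a (hypothetical) TTS callback; return the count."""
    count = 0
    for sentence in sentences_from_stream(chunks):
        synthesize(sentence)  # e.g. your own wrapper around the cloned voice
        count += 1
    return count
```

How close this gets to real time still depends on how fast your GPU can generate each sentence.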
Did they remove the podcast feature? I don't see it after installing through pinokio. I checked to make sure my installation was up to date.
Yeah you are right. It was removed (replaced with another feature) but I can't find any information why they removed it. Can only hope that it will return in a future update. I guess the only way to use the podcast function now is to install an older version via git. Maybe I should include manual git installs in my videos. Thanks for letting me know
I second that. It's gone, along with what seemed like emotion control with a single voice sample
@@AiVOICETUTOR Please, I also need the podcast function...
What are "manual git installs" and how do I do one? Or is there a way to install older versions?
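For anyone asking the same thing, a manual git install of an older version roughly means cloning the repository and checking out an earlier commit or tag. Below is a minimal sketch, assuming git is installed and using the F5-TTS repository mentioned in this thread. `OLD_REF` is a placeholder, since nobody here has said which commit still contains the podcast feature; the helper names are made up for illustration. After the checkout you would still need to follow the README of that version to install its dependencies.

```python
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/SWivid/F5-TTS"  # repository linked in this thread
OLD_REF = "<commit-or-tag>"  # placeholder: the older version you want (unknown here)


def build_commands(target_dir: Path, ref: str) -> list[list[str]]:
    """Return the git commands that clone the repo and switch to an older version."""
    return [
        ["git", "clone", REPO_URL, str(target_dir)],
        ["git", "-C", str(target_dir), "checkout", ref],
    ]


def install_old_version(target_dir: Path, ref: str, dry_run: bool = True) -> list[list[str]]:
    """Print the commands (dry run) or actually execute them with git."""
    commands = build_commands(target_dir, ref)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # raises if git reports an error
    return commands


if __name__ == "__main__":
    # Dry run by default: only shows what would be executed.
    install_old_version(Path("F5-TTS-old"), OLD_REF)
```

Cloning into a separate folder such as `F5-TTS-old` keeps the older copy apart from the Pinokio install, so a Pinokio update should not overwrite it.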
I think there are models that can be downloaded when you stop the TTS. Like when you want to update, there is an option to download other people's models for other languages
Yes you are right. The tool now supports custom models and therefore multiple languages. I'm making a video about it at the moment (which was supposed to be out last week already but I got distracted).
Why is my audio output muted?
please help me, the audio and spectrum are both nothing just an empty audio file
If this is real I'm going to subscribe, meanwhile you've got my like, thanks.
Well it depends on what you mean by real. The voices aren't real but the methods I used to generate them are very real :)
whats the gpu requirement
Why is my output sound empty?
Can you tell me if this software can speak Vietnamese?
Are E2 and F5 monetisable if I clone my own voice?
Great video here, which tool did you use to generate the short video clip used for Face Fusion and lipsync?
Thanks! Do you mean the TTS button and the microphone? I created the images with Flux and animated them with Cogstudio.
The videos are real clips altered with face fusion. I wonder what's the best tool at this date to create a speaking head, maybe just from a single input image. Some online video creation tools can do this, I think. But didn't even AnimateDiff have the possibility to use audio with it?!
@fixelheimer3726 Check out Echomimic2 for that purpose. I hope to have a video about it ready next week.
The voices were great but I don't like the quality of those lip syncs. Thanks for sharing!
I'm trying this with Docker and I can't run it. Is Pinokio much better to run it?
I can't tell if it's better or not but it should definitely be easier to install via Pinokio.
How can we use this for other languages? Should we download the cloned voice and open it in another program? For example, is there a way to do this or something similar for Turkish? Also, the cloning process takes more than half an hour; even after the voice is cloned, do we need to wait this long for every new text we write? Does the cloning process start from scratch for each text?
For now, only English and Chinese are supported. And the time for the audio generation depends on your GPU. You can't speed it up, as the tool needs to do everything again each time you generate a text.
Thanks, would’ve been nice to have seen a female sample, maybe next time
I managed to find a female voice that works well for me in F5 and you can hear her in my latest video: th-cam.com/video/-5fWC7RYXYk/w-d-xo.html
@@AiVOICETUTOR Great!
Anyone know why the podcast feature isn't included in the latest version of F5-TTS?
The functionality is still there. You can find more info here: github.com/SWivid/F5-TTS/issues/285
how come the podcast feature is gone
The functionality is still there. You can find more info here: github.com/SWivid/F5-TTS/issues/285
@@AiVOICETUTOR That makes more sense. You can just add more "voices", didn't know it could be different people lol, not the same person.
what about the languages?
There you go: th-cam.com/video/m_pucT9xqHo/w-d-xo.html
And for other languages? For example, I can't use it in Spanish or Russian because it sounds like it has an American accent
How do I configure it with an AMD GPU?
Is there way to use it on mobile? Please I need an answer as soon as possible ❤
yes you can.
@@PeterNwawuba thanks
support Vietnamese?
My audio takes forever to synthesize. Is there any fix for this?
I'm afraid the only way to increase the speed is to run it on a better GPU.
When I run FaceFusion, I don't get the option of choosing a GPU, just a CPU, even though I have an NVIDIA GPU
Are you running it on a laptop? Which GPU do you have?
@@AiVOICETUTOR A desktop with an NVIDIA RTX GPU with 8 GB. I think most of us with an 8 GB GPU probably need to upgrade to a 16 GB or higher GPU at this point. Most A.I. apps work slowly or don't work at all. Flux works but it's slow. If I add anything like a LoRA, it takes forever.
Is this F5 the best free Local AI TTS?
I have tried many of the free AI TTS tools and I get the best results with F5.
@@AiVOICETUTOR Unfortunately, it does not support the Indonesian language. Do you have any suggestions?
I just made a video about TTS in more languages (th-cam.com/video/m_pucT9xqHo/w-d-xo.html) but I don't see Indonesian available yet.
I sometimes get weird 'ghost' voices mid-sentence, freaks me out. it also repeats part of the sentence, I've been using 30 second reference clips and so I'll try cutting them down to 15 seconds, thanks for that, that might be the cause.
Yeah if you follow some of the tips in the video, I'm sure you'll be able to improve that
How about an old laptop, like from 2019? Is it possible? Thanks
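If you want to try the shorter reference clip mentioned above, here is a small sketch that cuts a WAV file down to its first 15 seconds using only Python's standard `wave` module. It assumes the reference clip is an uncompressed PCM WAV file; the function name `trim_wav` and the file names are just examples, and whether a shorter clip really removes the artifacts is something you would have to test.

```python
import wave


def trim_wav(src_path: str, dst_path: str, max_seconds: float = 15.0) -> float:
    """Copy at most the first max_seconds of a PCM WAV file; return the new duration."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        # Number of audio frames that fit into the requested duration.
        frames_to_keep = min(src.getnframes(), int(max_seconds * src.getframerate()))
        frames = src.readframes(frames_to_keep)

    with wave.open(dst_path, "wb") as dst:
        # Reuse channels, sample width and sample rate from the source file.
        dst.setparams(params)
        # The wave module corrects the frame count in the header on close.
        dst.writeframes(frames)

    return frames_to_keep / params.framerate


if __name__ == "__main__":
    # Example file names, replace them with your own clip.
    print(trim_wav("reference_30s.wav", "reference_15s.wav", 15.0))
```

This simply keeps the start of the clip. If the best part of your sample is somewhere in the middle, an audio editor gives you more control.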
It depends a bit on the hardware but since it's not too old I'd say give it a try.
How do you download to another drive? My C drive is full 😢
You can perfectly well move other folders to another drive. For example, I have my downloads folder and all my games running from an external SSD drive
@solutionalwebdesignantwerp6122 These files are usually huge. It's a shame you can't redirect to another drive.
So once downloaded I just find the folder in my Pinokio C drive and move it. Thanks, I'll give that a blast.
The Rock's gonna come looking for you...
can it make a spanish TTS ?
Not yet. I will make a video update if/when there is support for more languages other than English/Chinese.
My question is: Will my hardware run the face fusion AI smoothly with these specs?
Hardware Specs:
---------------
CPU: Intel i7 10th Generation
RAM: 32GB
GPU: NVIDIA GTX 1660 (6GB VRAM)
Storage: SSD
Operating System: Windows 11
Smoothly is relative but you can definitely run it. Check this list to see where your GPU ranks in terms of performance: docs.facefusion.io/knowledgebase/gpu-capabilities
You tried it yet?
@@nikyabodigital No, I hate going down rabbit holes and failing.
Didn't sound like Dwayne, and the lip sync is not correct either
It actually did
F5TTS should've been called Tortoise TTS. So damn slow!
I have an RTX 4070 Ti Super and it’s crazy fast. Update your GPU.
I'm waiting for the German language :(
Yeah I really hope that we will get more languages soon.
These people have copyrighted their images and voices in all states. Meaning, when we use their property we lose everything we have to the courts.