I watch several AI channels, but only yours gives the best application advice.
We need more content on Applio. Please make a video on the new version of Applio.
Does not work; it complains "Miniconda installation failed." even though it is installed.
Such a wonderful step-by-step tutorial 💙
Is there a way to: start training, pause and later on resume the training?
Thanks for your great explanations, found you by accident. Q: I want to sample a vocalist's voice, singing rather than spoken word. What AI app would you recommend, please? I don't think anyone covers this need. Thanks for any tips.
Would love to see a TTS demo rather than STS. Does it handle nuance and variety very well?
THANKS a lot for all the research and effort, appreciate it.
Thanks for the video. does applio run on AMD gpus too? if yes, do you need to do anything special to get it to work or is it just like in this video?
This video makes it very clear that only Nvidia supports the app.
Hey, I have problems with the Polish language because the generator reads "a" instead of the Polish "ą". How can I solve this problem? What does it depend on? Did I train the model wrong?
Your Spongebob impression was on point! Great video man, you made it fun.
Thanks for this training. At the top left at 00:43 I see Applio Premium on the site. If I download it, I don't have that. Do you have more options?
This is pretty rad. Wish I knew Applio existed back when you released this video, because I've been fumbling with various TTS webuis since last year and none have been the right "it just works" fit. Until now. Thanks, very helpful vid.
Mine doesn't make the inference file, also known as the .pth file. Any clue on a fix, guys???
Hello my friend! Seems like you've read my mind this morning. I was specifically looking for free TTS platforms for the last 2 hrs with no luck, praying you'll come up with the solution in one of your videos, and there you are right on time. Thank you so much for your content!
Great to hear!
the models tab just doesn't exist. Can you link it in the comments or description cuz it's impossible to find
I LOVE FREE APPLICATIONS, YOU DON'T KNOW HOW MUCH
I'm right there with you!
@@BobDoyleMedia But it's crap, a total fail, IF it only works for ENGLISH speakers or the English language. Will it work with the URDU language, with a local accent?
@@AaliDGr8 do not redeem!
@@summ.3433 ??
Thank you for this guide! So far I've trained 3 voice models, using it. The first 2 took 2-3 hours, but I've got a fairly slow system. The third one has all the exact same settings as the first 2, but a HEAVY British accent, and a 30-minute audio sample. 11 hours later, it's only on the 21st epoch, and almost 1800 steps. Should I go ahead and stop it? Why in the world is it taking so much longer than the others, does the size of the audio sample make that much of a difference?
I just found out the reason for this, and it's not really either of those things. First of all I didn't realize that training settings were not saved, so I tried to train a new model without "cache dataset in RAM" selected, which made a big difference. Then I realized I had accidentally put 2 MP3's in my dataset folders, instead of WAV files. MP3's do not process anywhere near as quickly, as you might well imagine, due to them being compressed! I had tried to replace one of them and actually put it back into the training folder WITH the MP3! NO WONDER it took so long!
I also no longer had "Overtraining" protection turned on, so those were likely 12 wasted hours anyway.
I've already started on a new model from the ashes of that one, and already it's going 75% faster.
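Since stray MP3s in the dataset folder were part of the slowdown described above, a quick check before training can catch them. This is a minimal sketch, assuming the dataset is a plain folder of audio files; the helper name and the folder name in the usage note are hypothetical, and the script is not part of Applio.

```python
from pathlib import Path


def find_non_wav_files(dataset_dir):
    """Return every file under dataset_dir whose extension is not .wav."""
    root = Path(dataset_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() != ".wav"
    )
```

Calling `find_non_wav_files("my_dataset_folder")` (a placeholder path) returns a list of paths; an empty list means only WAV files were found.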
Great stuff, but any idea why I'm getting a red error message when I try to do any action? And how do I download voice models? My list is empty; only the TTS tab shows a big list.
@@paulunreal8526 What error message are you getting, and at what point are you getting it?
OK, I managed to train my model and am using it :) I wish the results were better :)
same
I have a question. Can you delete the rest of the trained voice files, excluding of course the pth and index files?
Great videos, man. I want to know, is it safe to run? When I run the .exe file, it shows a message stating that I shouldn't run this. How do I run it?
Thanx for the tutorial, me and my family got a lot of fun with it :)
You know you need to update your video, because they changed the page and what I ended up trying to click on is not working.
I think there is a version which uses the CPU only, although you can't use the training feature with that version. Tested and working fine on my AMD laptop.
Where is the version that supports AMD GPUs? Can't find it anywhere. I only just found out I can't train with the CPU, which sucks!
Thanks a lot sir ! Very appreciable tutorial, and very easy to understand (for a dummy man like me) Greetings from France
Hello, I'm searching for the answer and I can't find it. I wonder if there is a way to scale the gain value of the RVC model when converting a recorded voice (inference tab), so that the model changes the original voice by, for example, 50%?
finally! This is the exact tool I need! I'm trying to recreate some decades old forgotten/lost media that will be gone forever unless someone saves it, and this is exactly the tool I was looking for. I have been trying voice clones with text to speech free online with mixed results. I am all for ethical use of AI and will exploit the heck out of it for good ends
Hey bob, when I am in the text to speech mode, my index file contains nothing while all the rest of the drop downs are populated. Would you have any idea why this is happening?
10:14 Love it, thank you! Can you please tell us the name of the video where you talk about it?
Thank you, finally one video that explains things in layman's terms. Most of the videos I have been watching about this text-to-voice software end up explaining it in a way that assumes everyone in the world is a programmer. Thanks a lot for the video and the software. I'm no layman when it comes to computers and graphics, but not every designer is a programmer. This video helped a lot.
So happy to hear all of that. Thanks so much for watching!
I'm using v3 and I don't have any of the options that you have in Download. Yours looks way different. Are you using PRO?
How did you generate the Liam Neeson part? The lip syncing was excellent!!!
That's a combination of (at the time) ElevenLabs speech to speech (which you can do with Applio, covered in this video) and then a program called "FaceFusion" for the faceswap. I've covered how it's used in other videos, but haven't yet done a dedicated FaceFusion video - so I guess it's about time to do that. It's also a free solution. th-cam.com/video/20o-5orWrHU/w-d-xo.htmlsi=Tqu_fDvOz43r35tF&t=1036
Can you then make him read text as well? Sort of like tts?
Does any of this stuff run on an Apple Mac Studio?
I first got to know Applio when searching for voice synthesis. Then I realized Applio is using edge_tts internally, so I ended up using edge_tts directly, which is much more lightweight. This time round, I hit Applio again while searching for voice cloning. Does anyone know what underlying library Applio uses to clone voices? I'll eventually dig into it myself if I do not get any replies soon. First time watching Bob's video. I feel it's well organized and condensed with the information viewers want. Excellent work!
The TTS in Applio is NOT running locally. The "predefined" voices are loaded through a Microsoft API and the TTS is performed by them (it uses edge_tts). Keep that in mind when you upload the minutes of your top-secret meeting or when you make Squidward say really nasty things.
yeah this is a huge dealbreaker, are there any other TTS alternatives that use the same model but don't require me to connect to edgetts??
But is the rest of Applio still able to run offline?
Can you please fill us in and tell us if they are still doing this? Please.
@@blake-ow9mv Yes they are.
Unfortunately....yes ....had already downloaded and run it too
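For anyone who wants to try edge_tts directly, as mentioned earlier in this thread, here is a minimal sketch. It assumes the third-party `edge-tts` package is installed (`pip install edge-tts`); the function name `synthesize` and the default voice `en-US-AriaNeural` are illustrative choices, not anything Applio prescribes. Note that this still sends your text to Microsoft's online service, so it does not solve the offline concern raised above.

```python
import asyncio


async def synthesize(text, out_path, voice="en-US-AriaNeural"):
    """Send text to the online TTS service via edge_tts and save the audio file."""
    # Imported lazily so the module loads even if edge-tts is not installed.
    import edge_tts

    communicate = edge_tts.Communicate(text, voice)
    await communicate.save(out_path)
    return out_path
```

A typical call would be `asyncio.run(synthesize("Hello from edge_tts", "hello.mp3"))`, which needs an internet connection. The resulting file could then be used as the input for voice conversion.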
After the countless YouTube videos I've watched, this one, This One!, is the one that truly worked. Thank you so much.
SO happy to hear that!
Hello, can the audios generated with cloned voices in this case be longer than 10 seconds, without quality problems?
"Failed to train index: need at least one array to concatenate" I got this message after training my model. Does anyone know what's wrong?
I haven't seen that one before. It would be great if someone could assist!
@@BobDoyleMedia solved that problem by clicking the dataset creator tab to load dataset
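The wording of that error matches what NumPy raises when it is asked to concatenate an empty list, which would fit the fix above: if no dataset was loaded, there is nothing to build the index from. This is a minimal sketch of that reading. It is an assumption, not confirmed from Applio's source, that index training concatenates per-file feature arrays; the function name is hypothetical.

```python
def reproduce_index_error():
    """Return NumPy's message for an empty concatenate, or None if NumPy is missing."""
    try:
        import numpy as np
    except ImportError:
        return None

    feature_arrays = []  # stands in for "no extracted feature files were found"
    try:
        np.concatenate(feature_arrays)
    except ValueError as exc:
        return str(exc)
    return ""
```

If this reading is right, the thing to check is whether the dataset and its extracted features actually exist for the model name being trained, rather than the GPU.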
PLEASE REPLY: will it work without heavy gpu or with nvidia geforce 8400 gs ??
Is there a way to make the tts voice before the rvc better?
The TTS voice is completely dependent on the generated file that is created before the conversion takes place. I would go through and test a lot of those and see if any of them sound real. Otherwise, I would just record my own voice, saying what I wanted to say how I wanted to say it and then convert that.
Which TTS voice model sounds more natural, not synthetic?
Thanks for the video, it was very helpful. I managed to train according to the video and created a model. I also created the index file, but unfortunately I couldn't export it because it doesn't see it. I don't see the index file in TTS either. Where can I find the path to the index file? I'd like to ask for help.
Hi Bob! I just recently became aware of this tech, and I decided to use it to create a model of my own voice. I literally came across this video when trying to figure out how to make RVC work, and I immediately installed Applio and damn it works so fing well!!!
The reason I wanna create a model of my own voice is so that I can streamline content creation for short video format to post online. I did the math and having to record audio for every video would be a pain in the butt, so I figured why not make a voice model and use TTS to easily narrate my scripts. As I'm writing this I'm looking at my other screen and it seems like the thing is being trained, so I cross my fingers. Thank you so much for the help with this, it was super easy to follow!
Do you think TTS is a reliable way of generating realistic narrations? Because so far it sounds a bit robotic. Doing Audio to Audio is not really an option in my case.
I HAVE RVC INSTALLED ON MY LAPTOP. I CLONED MY VOICE BECAUSE I WANT TO MAKE AUDIOBOOKS WITH IT. BUT I CAN'T FIND WHERE TO ENTER THE TEXT OF THE BOOKS SO THAT THE AUDIO IS GENERATED WITH MY CLONED VOICE. CAN YOU TELL ME WHERE OR HOW I CAN DO THAT?
HELP! HOW CAN I USE APPLIO OUTSIDE OF THE GUI! I Need to load models and generate tts via text on some sort of programming language. I cant work with the gui.
Hi Bob. My connection keeps erroring out. Any fixes?
Heck yeah Bob! Been waiting for this!
I had to really scrape youtube to find this video, but MAN am I glad I did. Thank you!!!
It shows an error in the command line while installing: module not found "torch"
Thanks for producing these videos. I do have a question. These voice clone services are pretty amazing, but do they all work based on English (or other major known languages) only? As of now, is there a tool available that would basically just mimic a reference vocal track (regardless of the language) and reproduce it with another voice? For example: You did a video where you sang a melody and then replaced it with some male/female AI voices. Does that only work if you sing in English or other officially trained languages ? In other words, does the AI look at the lyrics to reproduce the voice? If that's the case, then for the time being we can only use them for certain languages, right?
Thank you Bob for your very interesting and well-documented videos. Whenever you can, could you make a video to apply this technology to music? Is it possible to use this system to create a model to be used as a singer's voice?
Actually, that’s exactly what most of this is used for. I have several other videos on this. Check my last few posts and you’ll see that they’re very music centric.
@@BobDoyleMedia Thanks Bob, but I wanted to understand if Applio can export a model to use with Replay to replace its own voice in a song created, for example, with Udio. If so, could you briefly explain how I can do it?
I got an error when training: ValueError: not enough values to unpack (expected 5, got 3). The .pth file wasn't generated
Could you please show me how to install this on Debian 12 Bookworm?
@BobDoyleMedia
I would like to say thank you so much for making this video and showcasing this software / AI tool. For the last two weeks, I have been exploring different AI tools to clone voices for a persona project, and each AI tool I was experimenting with wasn't working correctly or was simply broken. Applio is one of the most stable tools I've seen during my two weeks and I didn't need to install it via Git installation or other complex means.
The training segment isn't working, sadly, but I can use RVC itself to train the data and input the final voice model into Applio, and it works... Finally!
Again, thank you so much for your support and for sharing this information... I look forward to seeing your future videos. Take care!
I always love a free local model. Gonna convert all of the John Scalzi audiobooks that I purchased from Wil Wheaton to either Stephen Fry or John Lee.
You are a Legend for creating this! Also you remind me a bit of a favorite actor of mine, Ed Harris!
Hi, will this work for Android phone??
This is a wonderful free piece of software. I tried it. Does anybody know how to add pauses in TTS?
I tried it but when it finished installing Applio the voice model and file index sections were empty while yours wasn't. What's wrong?
When you say "empty" what do you mean?
Good one, but I can't find the option to download the Windows-based application; only the API is available on the site.
@@CRSLifeLens I don’t think that’s accurate. I just followed the link myself and see the download button, which takes you to the GitHub page, which has the instructions.
Please reply: what do I have to do to make the AI voice less AI-sounding?
I tried to d/l for Windows and it kept displaying a 404 error.
Can we add emotions to all the voices like happy,sad, angry?
Amazing, I've been looking for this since GPT-3 was born :)
Do other languages work as well?
Sir, the voice model and index sections don't show me anything, although I downloaded some voices.
I would love you use this. But I can't get it to install on any of my PCs.
Excellent video! Thanks for the upload. I figured there was a way to do this without a subscription:)
What languages does it support?
As of now it doesn't work; the model download page on the author's own site returns (not found 404).
Works well for me. I love it. Thanks for sharing
Fantastic tutorial and software.
@@ExUnoInAeternum Thank you!
Was trying to do this on my old Dell Laptop and after the first epoch (of 500) it showed 10:25:00 time passed.... haha
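To put that number in perspective, here is the arithmetic as a small sketch. It assumes every epoch takes as long as the first one, which may overstate things since a first epoch can include one-time startup work; the helper name is hypothetical.

```python
from datetime import timedelta


def estimate_total(time_per_epoch, total_epochs):
    """Naive projection: total time if every epoch takes the same time."""
    return time_per_epoch * total_epochs


first_epoch = timedelta(hours=10, minutes=25)
total = estimate_total(first_epoch, 500)
print(total)  # 217 days, 0:20:00
```

At that rate, 500 epochs would take roughly seven months, which is why training on an old laptop without a supported GPU is not practical.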
Great tutorial Bob! Have you tried the Voice Training feature (currently in Beta) recently included within the latest version of the Replay AI software? I thought it'll be great material for you to make a video of. Looking forward to watch that tutorial and hear your opinion on its quality if you end up making one. Thank you for this one! Cheers!
I’m actually going to be doing a deep dive on the Replay software with somebody from Replay very soon. We’ll go over every feature.
@@BobDoyleMedia That is super cool Bob! You are the best! Thank you man and cheers to the success of your channel! Looking forward to your next videos!
@@BenjaminTemplar Thanks very much!
My GPU has only 2GB. Can I still use RVC?
is this available for mac?
I'm getting the error "Failed to train index: need at least one array to concatenate". I'm not a tech guy and can't find any explanations that I can understand. I have an NVIDIA GeForce RTX 3050 6GB. Can anybody simplify it for me?
I'm getting that too. no idea what to do about it.
Same here
is there a word/character limit?
The URL does not work. I signed up and can not login.
I love your channel! Keep rockin', please!
Thank you!
Can you make the voice sing with this app, please?
So, it runs on windows with GPU but no other OS?!
I have other AI apps installed using Python, Conda, Miniconda etc. This one will not install using the .bat file. Miniconda is in the system environment and this app cannot locate it.
@@davidpayne5964 It’s been a really long time since I installed it and I’m imagining they’ve probably done some updates, so I don’t have firsthand knowledge of your issue, though I wish I could offer some assistance.
How do I use voice training with TTS?
Creating your own RVC on a Mac seems so complex still to this day. Why is Nvidia needed?
Hardware acceleration. GPU does it much faster because of the type of processing it requires. That's a very high level explanation.
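If you are unsure what acceleration your machine offers, a short check like the one below can help. This is a minimal sketch that only reports what PyTorch can see, assuming PyTorch is installed; it says nothing about which backends Applio itself supports. The function name is hypothetical.

```python
def describe_device():
    """Report which compute device PyTorch can see on this machine."""
    try:
        import torch
    except ImportError:
        return "torch not installed"

    if torch.cuda.is_available():
        # An NVIDIA GPU with a working CUDA setup was found.
        return "cuda: " + torch.cuda.get_device_name(0)

    # Apple Silicon backend, present only in newer PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"
```

A result of "cpu" means training would fall back to the processor, which is where the very long training times reported in these comments come from.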
can i add my custom trained model in the TTS tool?
Yes, as long as it’s an RVC model.
@@BobDoyleMedia But I will not have full control of it, right? Like pauses, emotions, etc.
Great video. although, can anyone confirm?: I want to take my voice record it enough times to where it becomes a voice my kids can use to have me read them stories when I am not home. Is this possible? Seems like this app might do this but couldn't figure it out from watching 75% of this video. I thought it did but now I am not sure. I mean, I see apps where you can choose any voice to read text to speech for me but I want to train my voice so my kids can use my voice. How awesome would that be to have your dad read stories to your kids or grandkids long after he is gone!
Yes, in theory you can do that with this, though the text to speech with this tech is not going to have the same level of conversational realism as something like 11 labs. I also just did a video for something called ChatTTS that you might check out: th-cam.com/video/3speJ3ThW7I/w-d-xo.html
I'm not too confident in that Overtrain feature.
great videos. would you please consider creating a video on how to use the downloads section of the UI to download a model of 'Stephen Fry's' voice from huggingface, then use this to read a text file?
I kept getting an error code? Have given up as, “Too good to be true!”
I'd love a ComfyUI workflow and solution. Anyone have an idea how to get openvoice by hay and if it's any good? I almost made it run but I couldn't get melo tts to install.
Connection errors out, how do I fix this?
How to switch from male to female in applio?
There is no download link.
Incredible, thanks for this
Pls, can you provide the training audio file (the 16 min audio file)? It would make it easy to train.
Could someone please tell me which lines to change to remove the TTS limit? There is a 15 min limit ;/ I would be very grateful.
thanks again i recently downloaded this but was a lil bit let down by some models, but the voice to voice mode is so damn useful! thanks! maybe another video about the best voice models? greatly appreciate your vids!
@@Oxes thanks so much for watching!
Good Video! Thank you for sharing!
Does it work in Spanish?