(COLAB PRO ONLY) AI Voice Cloning with RVC in GOOGLE COLAB - Guide and Setup

Jarods Journey

Views: 147,388

Published on Dec 10, 2024

Comments • 352

@Jarods_Journey 1 year ago ⁺⁶¹
CURRENT Issue 9-15-2023: I was made aware that Google has started banning RVC usage on free accounts, similar to what it did to Stable Diffusion. There is no fix for this ATM other than to get the PRO version of Colab.
IMPORTANT: You MUST click "Train feature index" at 12:07 in order to get the IVF index file you'll need later. As noted by another comment, this can be done before or after training.
Sorry about that guys!
@dylonkejhu 1 year ago
Thanks!
@jessie24031 1 year ago
what is the difference between the two? is there anything different about it?
@ALFTHADRADDAD 1 year ago
good lookin out
@NamDinh-b3u 1 year ago
what is the sequence? Is it train model -> train feature index -> one-click training?
@forest1605 1 year ago
@@NamDinh-b3u whats the diff between them
@mikecameron2327 1 year ago ⁺¹⁶
Thanks for your videos on RVC, they were very helpful to me to get started with this. One important detail, in the video you put up a graphic telling viewers to click the train button, not one click training. This was good because I think one click training is slightly bugged, but you forgot to mention that if you don't use one click training you MUST also run feature training (2nd big button) or you won't have an .index file.
@Jarods_Journey 1 year ago ⁺⁴
Appreciate it! Man, I had to do a double take real quick, but I do say to click it at 8:16 😅. I realized after editing that one-click training does all of the previous steps before, so it was redundant to click one-click training. If you run all the previous buttons before and click one-click training, it just redoes all of the previous steps.
Edit: misread the comment, looks like an oversight and a missed step on my end!
@producer8587 1 year ago
It’s now banned tho 😢
@realjgerard 1 year ago ⁺⁴⁹
I just want to say publicly that I appreciate you Jarod for creating all of these guides. I can tell that you’re not doing this for views and that you truly have a generous spirit. These videos will create businesses and generate revenue. I implore anyone that does so to pay it forward or pay it back. I know I will… ☯️🥂🚀💯
@Jarods_Journey 1 year ago ⁺⁸
Thought I had left a comment on this one, but going back through comments again, really appreciate it Gerard 🙏!
@hanynagy8969 1 year ago ⁺¹
@@Jarods_Journey I need your help please. I just want to update my trained model. I mean, if I want to add more data (audio files) to the model, is it possible? Because every time I have new data I make a new model from 0, so I'm tired of that. Thanks in advance!
@zazyczech 1 year ago ⁺¹
My last programming was in 1997 with basic (i was 12). This is whole new universe. Thank you!
@Gratencya 1 year ago ⁺²
I've watched a different tutorial which didn't help me at all. The sound was robotic, and there generally wasn't much explanation.
But this one helped me, and the trained voice sounds perfect! I am extremely thankful for this.
You are the best! :D
@Jarods_Journey 1 year ago ⁺¹
Appreciate it 🙏!
@idk7440 1 year ago ⁺³
thanks for the video bro, i've just spent 5 hours on a model of my friend and it works relatively well without much training, totally worth it
@WIDOMU 1 year ago ⁺⁶
I just want to say thank you so much Jarod for making this video. I can feel the passion and kindness, and that you were doing it to help people, not for the money. The Colab really worked for me; I was dancing like crazy when it worked. I was so happy. Thank you, I subscribed!
@Jarods_Journey 1 year ago ⁺¹
Haha appreciate it, glad I was able to help you get it all up and running :D!
@WIDOMU 1 year ago
@@Jarods_Journey Thank you. All the best. I will be supporting your future videos! Thank you for replying too. It feels great when the author of the video replies to their fans! :D
@klaurcschwackerberg1880 1 year ago ⁺⁵
You did humanity a huge favor by making this tutorial, many thanks! So much detail and well explained! Liked and subscribed!
@Jarods_Journey 1 year ago
Appreciate it!
@ИванАленин-и6о 1 year ago
Man, thank you so much for such detailed guidance! I've watched about 10 videos on the same topic but did not understand the process. You really helped me, thanks!
@PrivatePaul 1 year ago
10:38 you say (and do) "click one-click training" but you display "do train model". so what is it now? which one is the right one?
@Jarods_Journey 1 year ago
Do train model, then after training, click train feature index
@PrivatePaul 1 year ago
@@Jarods_Journey isn't that what one-click training does? (training + feature extraction with one click)
@SantoValentino 1 year ago ⁺⁵
If anyone is looking to use RVC locally, it’s worth it. Saved me hours of training and it sounds better. Haven’t touched sovits since I installed Mangio-RVC.
I had troubles with installation but after 2 days it was worth it. The CHAT channel in the ai discord helped
@Jarods_Journey 1 year ago ⁺²
Interesting fork! RVC is definitely worth it and in my experience, it just trains faster and produces better results. I gotta take a look at the architecture to see why lol.
@Cameron787 1 year ago ⁺¹
Thanks! Trying to decide if I should go local or Colab. Will the local speed depend on the graphics card? I only have an RTX 2070 on my Razer Pro from 3 years back. OK, it's not that old, but might it be slower than Colab? What are you running it on?
@SantoValentino 1 year ago ⁺¹
@@Cameron787 shouldn’t be bad at all. My 3060 ran fine. Even if it takes a few minutes this is crazy technology either way lol
@beatzoid 1 year ago ⁺¹
Trained IVF file didn't appear in log -> me folder. How could I solve this?
@marjanamaan2109 1 year ago
thank you so much for the fully detailed tutorial. after watching so many videos i finally found your video with full detail, step by step. thank you so much 🙏
@Jarods_Journey 1 year ago
Appreciate it, glad I could help :D!
@Somebodythatoverthinks 1 year ago
This dude is a legend for these videos.
@DuskyRick 1 year ago
Thanks for this tutorial! I tried doing the RVC on my laptop locally, but it seems like my 1650 ti gpu is not as strong as I thought. Good thing I found this tutorial!
@Jarods_Journey 1 year ago ⁺¹
Appreciate it! All these "AI" training tools are VRAM hogs, so colab is a great alternative.
@ohmfnx2 1 year ago ⁺¹
tip: i use google colab for "Train" but "Model inference" & "Accompaniment and vocal separation" i use on gtx 1650
@lakshit._.sharma._ 1 year ago ⁺¹
You earned a new subscriber ❤
@Jarods_Journey 1 year ago
And hopefully I bestowed some new knowledge :)
@dylonkejhu 1 year ago ⁺²
Hey, at 15:28 i can't find the file in the model folder. Can u help me why
@naturalbest617 1 year ago
I have the same problem... any solution?
@mikecameron2327 1 year ago ⁺²
If you clicked the "Train" button and not the "One click training" button you have to also click the middle button "train feature index", that's what makes the .index file you need. You can do it before clicking training or after, it doesn't matter.
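A quick way to confirm which of the two files is missing is to list them from a notebook cell. This is a minimal sketch, not part of the RVC notebook: the function names are made up for illustration, and the folder layout (an `.index` file under `logs/<model name>`, a `.pth` file under the weights folder) is taken from the comments in this thread, so adjust the paths to whatever your copy of the repo uses.

```python
from pathlib import Path


def find_model_artifacts(logs_dir, weights_dir, model_name):
    """Return (index_files, weight_files) found for one model name.

    Assumption: index files live in <logs_dir>/<model_name>/ and the trained
    voice model is a .pth file in <weights_dir> whose name starts with the
    model name. Both are guesses based on the thread, not a documented layout.
    """
    model_logs = Path(logs_dir) / model_name
    index_files = sorted(model_logs.glob("*.index")) if model_logs.is_dir() else []
    weights = Path(weights_dir)
    weight_files = sorted(weights.glob(f"{model_name}*.pth")) if weights.is_dir() else []
    return index_files, weight_files


def describe_missing(index_files, weight_files):
    """Turn empty results into the fixes suggested in the comments."""
    problems = []
    if not index_files:
        problems.append("no .index file: click 'Train feature index'")
    if not weight_files:
        problems.append("no .pth file: training did not finish, run it again")
    return problems
```

In a notebook you could call `describe_missing(*find_model_artifacts("logs", "weights", "me"))` from inside the repo folder; an empty list means both files were found.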
@dylonkejhu 1 year ago
@@mikecameron2327 thanks :D
@BeatsAudios1988 1 year ago ⁺³
Sir, this is the error i am getting from content data set
Could not parse variable and value from ""/content/drive/MyDrive/dataset/AKALEYO_NEE_vocals.zip"". Expected the line to start with a variable assignment. please help me
@Skylar-333 1 year ago
I am having this same issue, I've tried everything, including just naming my zip file what the program seems to be looking for. No idea what went wrong. I don't even have the same opportunity under the "Dataset location" to edit the path as seen in the video. It is just all red saying "Could not parse variable and value from ""/content/drive/MyDrive/dataset/lulu20230327_32k.zip"". Expected the line to start with a variable assignment" and the edit symbols are greyed out and inaccessible. Not sure what has gone wrong, I followed everything precisely. Would love some help! Glad to see I'm not the only one!
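One plausible cause, offered as a guess rather than a confirmed diagnosis: the message reads like Colab's form parser complaining that the line carrying the `#@param` annotation is no longer a plain `name = value` assignment, for example because the variable name was deleted or extra quotes were pasted in. The sketch below shows what a well-formed form line looks like. `DATASET`, the `me.zip` file name, and the helper function are illustrative assumptions, not necessarily what the notebook uses.

```python
# Hypothetical form cell. Colab reads the annotation after the value, so the
# line has to stay a plain `name = value` assignment with one pair of quotes.
DATASET = "/content/drive/MyDrive/dataset/me.zip"  #@param {type:"string"}


def looks_like_zip_path(path):
    """Cheap sanity check (an assumption, not an RVC rule) before unpacking."""
    return path.startswith("/content/") and path.endswith(".zip") and " " not in path


print(looks_like_zip_path(DATASET))
```

If the form field itself is greyed out, double-clicking the cell to show its code and repairing the assignment line by hand may bring the field back.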
@Vateir 1 year ago ⁺²
Constant connection errors on every step in the public web RVC, for a second they seem to work and then give error messages. I managed to get to feature extraction but each time it just halts the process, says there is a connection error and sometime after colab disconnects
@bluebrun0287 1 year ago ⁺¹
Hey! I've been following your tutorials for quite a while, and I must say - they are helping me A LOT. But in this one, I need a little help!
When I run the "start web" code that you show on 6:37 I get the error message saying:
/content/Retrieval-based-Voice-Conversion-WebUI
Traceback (most recent call last):
File "/content/Retrieval-based-Voice-Conversion-WebUI/infer-web.py", line 7, in <module>
import faiss
ModuleNotFoundError: No module named 'faiss'
Have you seen something like this and how we can fix it?
@Jarods_Journey 1 year ago ⁺¹
Try re-running the cells in the Colab, faiss may not have installed fully or correctly
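Before launching the web UI again, it can help to check whether the install cell actually left the packages importable. A minimal sketch, assuming the module names that show up in the errors in this thread (`faiss`, `pyworld`, `numpy`, `gradio`); the function name is made up, and the real dependency list of the notebook may be longer.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names taken from the tracebacks quoted in the comments.
print(missing_modules(["faiss", "pyworld", "numpy", "gradio"]))
```

Anything printed in the list did not install, so re-running the dependency cell (or restarting the runtime and running it again) is the next thing to try.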
@KenDoStudios 1 year ago
7:21 when i get to this step after following the rest correctly there are no files here for me to train
@Jarods_Journey 1 year ago
Check again at around 5:00 to make sure you mounted your drive and all of the file paths are edited correctly. If you're not seeing any data, then for some reason either the cell didn't finish correctly or there's some type of file path error.
@KenDoStudios 1 year ago
@@Jarods_Journey yeah i did... the zip is there and the guy who's helping me is confused about this too
@RobertJene 1 year ago
I don't care much for Google Colab, but I'm here for the like, comment, and the view.
@Jarods_Journey 1 year ago ⁺¹
Appreciate it 🤟
@65536thRoundTable 1 year ago ⁺³
weird. my comment got deleted. Anyways , put this right at the start of Install dependencies if you keep failing to build pyworld and/or not be able to find faiss
!pip install pyworld==0.3.2
!pip install numpy==1.23.5
@nonnegative7063 1 year ago
My comment was deleted too, I sent whole installation line which should've fixed that
@Jarods_Journey 1 year ago
YT doesn't like full code comments or links, unfortunately, so it seems like it filters those out.
Appreciate the fix, but we'll have to wait until it's pull-requested over on the repo or updated 🤟
@theAIsearch 1 year ago
Very helpful - thanks! Whats the difference bw pm, harvest, and dio?
@Jarods_Journey 1 year ago
Not too sure, but harvest produces the best results in my testing.
There is a difference, I just haven't looked it up extensively lol.
@theAIsearch 1 year ago
@@Jarods_Journey No worries, thanks! Do you happen to know what "Search feature ratio" does? I tried setting it all the way left & right, without much difference
@Jarods_Journey 1 year ago
@@theAIsearch I think that affects the accent of the voice, so at 1 it may retain more of the accent
@suga_candy_g7338 1 year ago ⁺²
Thank you so much for this helpful tutorial! It's very detailed and easy to follow along with. I trained my model with the default total training epochs(7), save frequency(5), and batch_size for every GPU(7). The result has some static. I plan to add more samples to train with. Can you recommend the number of epochs I should do for the best results? Or is it like the higher number the better type of thing?
@Jarods_Journey 1 year ago ⁺¹
Appreciate it! It's all data centered and a case by case basis, so really hard to generalize
A good rule of thumb though is quality over quantity. If you can get 1 hour of high quality audio samples vs 10 hours of mixed quality, I would just save the time and do the 1 hour of samples (faster and higher quality). Then you can run it for more epochs if you wanna try and smoothen out the noise, or add more high quality samples (though a lot will be experimentation and seeing what works best for your voice!)
@suga_candy_g7338 1 year ago ⁺¹
@@Jarods_Journey Thank you for your help. I will give it a shot :)
@theentirecircus6623 1 year ago ⁺¹
Great Tutorial. I'm having some problems with the last part, even after using a 2 min. short audio (inference) I'm getting the timeouts and there is also no folder with the name Gradio in the TEMP folder. There are only couple of INFO messages in colab and nothing else.
@Jarods_Journey 1 year ago
Hmm, not sure what's happening here, it might not actually be processing then. Have you tried restarting runtime?
@theentirecircus6623 1 year ago ⁺¹
@@Jarods_Journey actually not, I only tried rerunning the cell. I'll try restarting the runtime as well, then post the results here
@obeyoutube 1 year ago
@@theentirecircus6623 hi ! I have the same issue. I've restarted Runtime a few times but it doesn't help. My TEMP folder is empty. Did you solve the problem?
@theentirecircus6623 1 year ago ⁺¹
@@obeyoutube I've just tried again with another notebook and it worked (I've waited couple of minutes after getting the timeout error). @Jarods_Journey I can link the notebook if it's okay, but it's not from the original repo
@hedwig7s 1 year ago
@@theentirecircus6623 Link it please
@8gntt 1 year ago ⁺¹
I had an interesting error come up when I started feature extraction.
This is what I was met with at the end
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py", line 309, in _conv_forward
return F.conv1d(input, weight, bias, self.stride,
RuntimeError: Calculated padded input size per channel: (1). Kernel size: (2). Kernel size can't be greater than actual input size
all-feature-done
@Jarods_Journey 1 year ago
I believe I saw this error somewhere in the GitHub page, you might wanna check there on the RVC issues tab
@kazamify 1 year ago ⁺¹
I don't see anything under the inferencing voice tab. I have refreshed the voice list. I do see the index logs file under the "path to the .index" file tab, however. Process data step ended with "end preprocess" and the feature extraction step ended with "all-feature-done". Can anyone help?
Thanks for your content, Jarod.
@Jarods_Journey 1 year ago
I might need a bit of clarification, but I believe you have everything finished that you need to train. Once the training is done, in the inference tab, you gotta click the refresh timbre button.
One additional thing is check to see if the training outputted a .pth file in the weights folder, if there's nothing there, the training didn't finish correctly!
@kazamify 1 year ago
@@Jarods_Journey Yeah, when I check the weights folder there isn't any .pth file. It is weird.
Anyway, I watched your Google Colab tutorial for SVC and I managed to make it work! Thank you so much, mate. Keep it up.
@CosmotheDane 1 year ago ⁺¹
Always appreciate your videos. Question: is it possible to implement additional rounds of training on previously trained models? I know you run another round of training on the model you create, but how would you get it to recognize a model that's already been created after having completed and closed out the process? Would you just re-insert the .pth and index files back into the weights/other folder and run a new training process on a new dataset?
@Jarods_Journey 1 year ago ⁺¹
Hmm, are you talking about adding additional training to another model, say you downloaded it off the web and you want to train it with other data? Unfortunately, I think you would run into some issues with this, but it's 100% possible as the pth files are just parameters or weights for that voice. However, even if you re-insert the .pth files in their respective folders, I believe you also need all of the other parts of the training process as well, which includes the previously processed logs and all those other things.
Never hurts to try though! I've never done it myself for rvc xd.
@helmutroll4773 1 year ago ⁺¹
@@Jarods_Journey I guess he means if you have already started a training session with let's say 20 epochs. And then you realize you want to have actually 100 epochs. Can you somehow continue training on the previous model and then just add the missing epochs from 20-100 or do you have to start all over from 0 again? Thanks for this incredibly helpful video!
@CosmotheDane 1 year ago
@@Jarods_Journey yeah, so to simplify, can I have multiple rounds of distinct trainings (i.e., dataset A, dataset B, dataset C) and have the same model retain what it learns from one to the other, or can you only train a model on one dataset, even though there may be multiple rounds (i.e, dataset A1, A2, etc...).im just trying to make the most banging model I can, if ya feel me
@HR-zg9ci 1 year ago ⁺¹
Can you explain what the "dataset" at the beginning is for? Do I need this to define in the google colab section or is it just the same like defining the folder in the training tab under step2a > "input training folder path"? Thanks for the great content, it's a big help!
@Jarods_Journey 1 year ago
dataset contains all of the data to be copied over to the colab repo, to then get trained on. You'll just set all of the things as shown in the vid for paths.
@_nothinghappens1548 1 year ago
Hey! At 14:05 when I click on refresh the vocals, mine doesnt get refreshed. More files arent appearing like in the tutorial and im also getting another error besides the me_zip. I also get a "/content/drive/MyDrive/dataset/.ipynb_checkpoints: Is a directory" error. Can someone please help me?
@Jarods_Journey 1 year ago
Do you have any files that are .ipynb_checkpoints in your datasets folder? Not too sure unfortunately if files aren't appearing, could be possible some parts of the process never finished.
@OmishaJain-u8j 1 year ago ⁺³
hi the video is so straight but i have an issue i dont have the "trained_IVF201_Flat....." but i have train.log what should i do?
@nightknight8651 10 months ago
I have to say your tutorial is very useful, so thank you for it, but I have a question
the colab notebook seems to run an older version of RVC where there was no rmvpe
so how to run the latest version on colab?
@kingpinoftherails926 1 year ago ⁺¹
Great video, thanks for that.
I saw in the description of the Voice Conversion WebUI that there is a link to hugging face space.
Will you also make a video on how to get this to work there? Many thanks in advance.
@Jarods_Journey 1 year ago ⁺¹
Appreciate it! Unfortunately, it's not hosted on hugging face and when you click the button, it just brings you to the page that you would need in order to manually install the models needed to run it locally
@SKYGGEMUSIC 1 year ago
@Jarods_Journey I have a question about the dataset that you used to train the main model. Is it big? How big? I train french voices and they sing with an english accent! Cute, but sometimes weird!! I was wondering how to have a big french model and how many data was needed to do so. BTW huge congrats, your work is amazing. Love it.
@Jarods_Journey 1 year ago ⁺¹
Appreciate it! The datasets I use range from 10 minutes to 3 hours, but my average is around an hour of audio. In my experience, the index file controls the "accent" so you can try adjusting it to 1 and seeing if it results in a better accent
@SKYGGEMUSIC 1 year ago
@@Jarods_Journey that's correct, english accent has almost gone. thx
@KeizerSinbad 1 year ago ⁺¹
I understand what everything here is except for what the MP4 file is, and is for. Can you elaborate on that?
@KeizerSinbad 1 year ago
Ah nevermind. I understand. I was wanting to use this to do the realtime with my microphone. You wouldn't need the MP4 file for that I guess.
@Sahgee 1 year ago
Your video was so helpful!!! I wanted to ask if we come back another day to use our model, after loading her up on colab, do we have to train her again or can we just jump into the model interface step?
@Jarods_Journey 1 year ago
As long as it's trained, just inference :)
@JosGandos685 1 year ago
Thank you so much mate. But I've got a situation here. after accompaniment and vocal separation and press CONVERT, I always got Connection errored Out. and can't convert. what is that? what i gonna do?
@jaripeltola 1 year ago
The step 2a returns an error in loading audio in the webUI. The setup and folder paths are correct.
@jaripeltola 1 year ago
The same audio files load correctly in the offline version, but there I cannot train a model without GPU.
@Jarods_Journey 1 year ago
Make sure there are no spaces in the path. There may be other stuff but this is usually the fix
@Sahgee 1 year ago
[Edit: I searched the comments and found your response to someone else! Youre the best. Patiently waiting for my epochs to finish :) first time ever doing anything like this and im honored to have found/used your help]
Original: This was such a wonderful video. My issue with the training is that it gives me the error message "filenotfound error: [erno2] no such file or directory: pretrained/f0G40k.pth" any help with this step will be greatly appreciated. It is the default load pre trained model option in step 3.
@kurotesuta 1 year ago ⁺¹
Is there TTS for RVC?
@EdoVro 1 year ago
This is really cool. Tho one thing, trying to download the multiple audio files at once completely crashed my Chrome. Luckily the Colab session was still going so I didn't lose anything.
@Jarods_Journey 1 year ago
Yup, it'll stay active for 12 hours as that's the runtime length unless you deactivate it!
@Prodigy-Chaos 1 year ago
So do you train like for example 1 1/2 hour of your own voice with the harvard sentences(or maybe something better I'm unaware of) and then use 1 1/2 hour of a mp4 vocals or whatever voice you're trying to clone to go into the realtime AI voice changer?
@Jarods_Journey 1 year ago
You can do that, but the voice samples don't matter as long as they're complete words and there's no background sound. Once an RVC model is trained, that is what goes into the voice changer.
@RobertJene 1 year ago
18:00 - that would be a good glitch voice for different Sci-Fi effects!
@Jarods_Journey 1 year ago ⁺¹
Most definitely 😂, I'm sure its highly possible to find some use cases for it
@KarthiKeeran 7 months ago
Hey, everyone. I've trained a model with 400 epochs with voice samples of 10 min audio. When i'm doing the voice conversion, the words are not even pronounced correctly. The sound looks more like humming instead of speaking. What am i doing wrong? Appreciate your help.
@Hibabiii 1 year ago
how to continue training my model ... I had trained it for 200 epochs ... but it still not that good ... Is there any way to continue training it ... Or should I train a new model ???
@tomtornados6236 1 year ago
The python console in the Colab is throwing me errors about non-existent modules whenever I click on the "start web" cell. Why is this?
@eventfakt 1 year ago
Hello, when I use Colab, the most time-consuming connection is suddenly disconnected, and when I try to connect again, it does not work at all. This issue has been happening to me for several days, please give me a solution so that I can use it again.
@lanhoyc4435 4 months ago
Hi, i see error " can't open file '/content/Retrieval-based-Voice-Conversion-WebUI/infer-web.py': [Errno 2] No such file or directory" at start the web step. How can I fix it?
@randylanphear 3 months ago
i get the same error... have you found any fix?
@singlewave805 1 year ago
I dont have the window for dataset location...
@next3108 1 year ago
hey i got error on the latest step model inference, first fail in console say RuntimeError: Failed to load audio: ffmpeg error (see stderr output for detail), and second this AttributeError: 'NoneType' object has no attribute 'dtype'
@Jarods_Journey 1 year ago
Make sure there are no spaces in your folders or anywhere in your folder path
@next3108 1 year ago
@@Jarods_Journey there are no spaces, again error
@obeyoutube 1 year ago
Hi! Thanks for the video. It's very useful. I faced an issue during the process. I trained a model and it appeared in the model inference. However, when I wanted to copy the path to my index file, it wasn't there. On which step does this file have to appear in the folder logs/model_name? I didn't extract vocals from a video as I already had an audio file. What should I do, if I have the trained model and there is no index file? Please help me with this issue. Thank you.
@Jarods_Journey 1 year ago ⁺¹
Check out the pinned comment!
@obeyoutube 1 year ago
@@Jarods_Journey thank you for peeling my eyes =)
@cryptidpet4325 1 year ago
what happens if the load package dataset doesn't work? it didn't work for me and im unsure why
@cryptidpet4325 1 year ago
NO I FIGURED IT OUT, I DID NOT NAME MY DRIVE FOLDER DATASET
@deepsacheti742 1 year ago
Hi man! Just a query! I have trained my voice model for narration yesterday. Now, I would like to convert a TTS voice to my voice. I followed your instructions restored the path before going to web interface. Now, as you mentioned we have to put the path address of IVF file in database file path but now, I only see G and D file of my model when I am going to RVC - logs - my project. Can you please help on where to find the ivf file when we are coming on the next day.
@Jarods_Journey 1 year ago
You want to grab the weights file, inside of assets/weights. That is your voice model. Inside of logs/ is your index file
@animeshindia 1 year ago
Please Make a tutorial on mixing two rvc models using ckpt.
@sonidojamon 1 year ago
Which one do you think is better for singing voice, so-vits-svc (and fork version) or RVC?
@Jarods_Journey 1 year ago
RVC
@sonidojamon 1 year ago
@@Jarods_Journey Nice! have you tried the mangio-RVC-fork?
@megagamer2874 1 year ago
Is it possible to do this on the windows terminal because whenever I try to do it, I keep losing connection so is there a more permanent solution?
@rzxv3 1 year ago
You should cover the new mangio-crepe fork, it’s way better at making models and has RVC v2
@ridzverse 1 year ago
i've trained a model and it didn't give me the index file, how to fix it
@Jarods_Journey 1 year ago
Click the train feature index button
@pujabanchu8239 1 year ago
All process of training complete but in Weights i can not see anything (.pth file)....why?
@Jarods_Journey 1 year ago
That means there was some type of error in the training process and it never got to outputting a final trained model. In this case, you'll have to run training again for the model
@nick22552 1 year ago
Traceback (most recent call last):
File "/content/-EVC-/extract_feature_print.py", line 13, in <module>
version = sys.argv[6]
IndexError: list index out of range
['extract_f0_print.py', '/content/-EVC-/logs/Cristi', '2', 'crepe', '115']
how do i fix this
@dusandss 8 months ago
Hi Jarods, I am using Colab Pro, and didnt have issue at first, but now everytime I want to train model, it always shows me error, because two log folders are missing: 2a_f0 and 2b-f0nsf. If I add them manually, training will proceed to the end, but they will remain empty and my model won't change voice on cover song successfully.
I'm trying to solve the issue past three days, but without success.
Can you help me with that? Why are these folders missing now, and not before, and how this can be solved? Am I doing something wrong?
Thanks in advance!
@MistahJ100 1 year ago
I install all of the cells and when i click on the web Portion it says Python 3 cant open file or directory, What am i doing wrong?
@everythingisgame47 1 year ago
When i want to make a model crepe is Best or harvest?
@MrH4nky 1 year ago
So, theoretically, if it disconnects because of 12 hours period, how am I supposed to finish extraction and training faster? Or there're some checkpoints for data, so I can keep going from the previous one?
('Cause it threw me error at first try and after that collab didn't want to start again)
@ShortStories-el5sw 1 year ago
how do we train more than 1000 epochs for the model?
@reruarikushiteru 1 year ago
06:45
Instead of the link I get an error
Traceback (most recent call last):
File "/content/Retrieval-based-Voice-Conversion-WebUI/infer-web.py", line 87, in <module>
class ToolButton(gr.Button, gr.components.FormComponent):
AttributeError: module 'gradio.components' has no attribute 'FormComponent'. Did you mean: 'IOComponent'?
So I guess that's where my attempt ends
@Jarods_Journey 1 year ago
There's an issue with the latest GitHub repository, seems like it broke the colab version.
You might wanna check the fix that people said about here: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues/549
@reruarikushiteru 1 year ago
@@Jarods_Journey Thanks, it worked!
@echofloripa 1 year ago
I'm running on pro version, I'm getting the following error:
I added a single .wav file to the /content/dataset-2/ folder inside my Colab
start preprocess
['trainset_preprocess_pipeline_print.py', '/content/dataset-2', '40000', '12', '/content/Retrieval-based-Voice-Conversion-WebUI/logs/me', 'False']
/content/dataset-2/.ipynb_checkpoints->Traceback (most recent call last):
File "/content/Retrieval-based-Voice-Conversion-WebUI/my_utils.py", line 14, in load_audio
ffmpeg.input(file, threads=0)
File "/usr/local/lib/python3.10/dist-packages/ffmpeg/_run.py", line 325, in run
raise Error('ffmpeg', out, err)
ffmpeg._run.Error: ffmpeg error (see stderr output for detail)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/content/Retrieval-based-Voice-Conversion-WebUI/trainset_preprocess_pipeline_print.py", line 75, in pipeline
audio = load_audio(path, self.sr)
File "/content/Retrieval-based-Voice-Conversion-WebUI/my_utils.py", line 19, in load_audio
raise RuntimeError(f"Failed to load audio: {e}")
RuntimeError: Failed to load audio: ffmpeg error (see stderr output for detail)
/content/dataset-2/record.wav->Suc.
end preprocess
@Jarods_Journey 1 year ago
The preprocess seems to have worked, but make sure there are no other files in there and no spaces in your path. This is usually the fix that I see for people.
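The two usual culprits named here (stray entries such as `.ipynb_checkpoints`, and spaces in the path) can be spotted before preprocessing. A minimal sketch under stated assumptions: the function name is made up, and the set of audio extensions is a guess rather than the list RVC actually accepts.

```python
from pathlib import Path

# Assumption: a plausible set of audio extensions, not RVC's official list.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".m4a"}


def check_dataset_folder(folder):
    """Return human-readable warnings about a dataset folder."""
    folder = Path(folder)
    warnings = []
    if " " in str(folder):
        warnings.append(f"path contains a space: {folder}")
    for entry in sorted(folder.iterdir()):
        if entry.is_dir():
            # e.g. the .ipynb_checkpoints folder Colab or Jupyter may create
            warnings.append(f"unexpected directory: {entry.name}")
        elif entry.suffix.lower() not in AUDIO_EXTENSIONS:
            warnings.append(f"not an audio file: {entry.name}")
        elif " " in entry.name:
            warnings.append(f"file name contains a space: {entry.name}")
    return warnings
```

For the log above, `check_dataset_folder("/content/dataset-2")` would flag the `.ipynb_checkpoints` directory, which is the entry that triggered the ffmpeg error while `record.wav` itself loaded fine.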
@BynxFPS 1 year ago
Do you have to train it on multiple voice sample files, or if I had say a 5 min compilation of voice samples in 1 file would that work as well?
@Jarods_Journey 1 year ago
You gotta split long audio up as 5 minutes will cause out of memory issue for any consumer level hardware.
@Quisspo 1 year ago
for me it worked well even with one single 15min file, so dont worry
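For anyone who does run into memory errors with one long recording, splitting it into shorter WAV files is straightforward with Python's standard `wave` module. This is a rough sketch, not what RVC does internally: the function name and the 10-second default are arbitrary assumptions, it only handles uncompressed WAV input, and it cuts at fixed times, so a word may be split across two chunks.

```python
import wave
from pathlib import Path


def split_wav(src, out_dir, chunk_seconds=10):
    """Split one WAV file into consecutive chunks; return the new paths."""
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        frames_per_chunk = int(reader.getframerate() * chunk_seconds)
        if frames_per_chunk < 1:
            raise ValueError("chunk_seconds is too small")
        index = 0
        while True:
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break  # end of the source file
            target = out_dir / f"{src.stem}_{index:04d}.wav"
            with wave.open(str(target), "wb") as writer:
                # Same channels, sample width and rate as the source; the
                # frame count in the header is corrected when the file closes.
                writer.setparams(params)
                writer.writeframes(frames)
            written.append(target)
            index += 1
    return written
```

Since the samples should be complete words, cutting on silences by hand or with an audio editor is likely to give cleaner chunks than fixed-length cuts.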
@alialqarni9733 1 year ago
I have a fully completed model with 270 epoch but couldn't build on that to have more epochs ?
If you could help me how can I do that ?
@Jarods_Journey 1 year ago ⁺¹
Sorry looks like my original shorts response was for local installation. For colab, I'll have to get back to you on it.
@alialqarni9733 1 year ago
@@Jarods_Journey thanks in advance ..
@naminhtien862 1 year ago
What is the difference between 2 audio.wav file in TEMP folder and why we need to wait both of them?
@Jarods_Journey 1 year ago
Those are the outputted files from the webpage. If you don't download them, they get deleted.
@The_Quipt 1 year ago ⁺¹
Does anyone know how to stop the voice changer from saying what I say twice?
@Nabuuug 1 year ago
Wait, just to be sure, is the "vocals" part for singing or something like that ? If I'm only interested in voices that talk, can I skip this ? Because i don't have singing recordings of the voice i want to clone, i only have the voice samples in the zip file where the voice is talking.... I'm confused
EDIT : AH OKAY, yeah so the video file is not a voice sample to be cloned, it's a voice that we want to replace with the voice from our trained model!! Ok I get it now, so yeah because I want to do the microphone real time thing of the other video, I don't care about inferencing my trained model on this additional video file like you do. Good.
@helmutroll4773 1 year ago
Question when executing "restore pth from google drive": After re-importing from GDrive to Google Colab, the *.index and the *.npy files are uploaded into the "content" folder of Google Colab. But is this the correct folder where those files should be in the end? Because afterwards when I am in the "model inference" tab, I have to choose then the "Feature search database file". I should then link to the *.index file which is now lying under "content", is that correct?
and: how many epochs do you think are perfect? is it possible to overtrain and get bad results when putting the epoch too high? let's say eg. 200?
Thank youuuu!
@Jarods_Journey ปีที่แล้ว ⁺¹
Yes, when you reupload back to colab, you have to copy all of the correct file paths to the spots where they belong at. There is no perfect epoch count as it is dependent on data, I haven't seen any instance of overtraining and it's not that severe when training on these small datasets.
@ettiennelane9173 5 months ago
Will it work with the Google Colab Pay As You Go option?
@lanhoyc4435 1 year ago
I've done exactly as in your guide, but when I listen to the output, it's still the same vocal as the input. Can you help me out here?
@Jarods_Journey 1 year ago
I haven't seen this issue before, does it happen on all voices? If so, it could be an RVC bug that you might want to report on their GitHub page.
@lanhoyc4435 1 year ago
@@Jarods_Journey Thank you, I've checked it again and again, and then I found my mistake. Also, I want to ask, how can I train the bot more if I come back on another day? How can I use the last trained version instead of starting training all over again?
@Jinx_806 1 year ago
Hey man, help, I'm stuck on start web...
Can you help with this error?
/content/Retrieval-based-Voice-Conversion-WebUI
python3: can't open file '/content/Retrieval-based-Voice-Conversion-WebUI/infer-web.py': [Errno 2] No such file or directory
@Jinx_806 1 year ago
Got it
@supportteam8263 1 year ago
How?
@Jinx_806 1 year ago
@@supportteam8263 Instead of using it on a local machine, use Google Colab.
@WS48L 1 year ago
When I tried opening the 'temp' folder, nothing was showing.
@Jarods_Journey 1 year ago
It may mean that it never finished. Could be a recently introduced bug, or something changed if this is the case.
@WS48L 1 year ago
@@Jarods_Journey alright, I'll try it again later, thanks
@krishnavamshithumma7377 8 months ago
@Jarods_Journey While I'm in Accompaniment and vocal separation, I am not able to get any model to select (like the H5 model). Can anybody please help me find my error?
@TheDenVx 1 year ago
Is there any way to move the GPU work for the AI voice changer into Google Colab? Cuz I'm getting 50-60% usage on an RTX 3070 with Voice Changer 😂
@daniellewis6228 1 year ago
14:44 - inference
@gsharks3333 1 year ago
I'm receiving this error:
ERROR: Failed building wheel for pyworld
Failed to build pyworld
ERROR: Could not build wheels for pyworld, which is required to install pyproject.toml-based projects
any idea why?
@cryptidpet4325 1 year ago
OKAY, what if there's NOTHING in your TEMP folder when getting an error on Gradio?
@Jarods_Journey 1 year ago
Not sure, some error occurred on the web interface. Might wanna check the GitHub issues to see if anyone else got this.
@forest1605 1 year ago
I've got the same issue. Did you let Google Colab separate the vocals from the audio, or did you use another site to separate the vocals?
@cryptidpet4325 1 year ago
@@forest1605 no, I separated my vocals from the audio on a different site
@forest1605 1 year ago
@@cryptidpet4325 that might be why. I'm not entirely sure, but when I did it with vocals separated using another site, it didn't work. But when I just used the vocals that weren't separated and let Google Colab do it, then it worked.
@RakeshKumar-hg1ln 1 year ago
I am training for 300 epochs, but after 20 epochs it shows a "connection errored out" error.
What should I do?
@Jarods_Journey 1 year ago
You would have to restart everything. However, I think Google is starting to cut down on free usage of Colab...
@AIAsiaSinger 1 year ago
I successfully trained a model with 20 epochs. When I tried to train it another day with 200 epochs, right before it generated the index file, the whole Colab page crashed and the Gradio page showed “Error: Connection errored out”.
Why did that happen :(? Thanks in advance!
@Jarods_Journey 1 year ago ⁺¹
Not sure what might be occurring here, but some error must be happening that causes the console to stop. You might have to rerun, or show what error occurred in the cell.
@AIAsiaSinger 1 year ago
@@Jarods_Journey thanks for the reply, I think it's because some data samples were too big. Solved.
@BeatsAudios1988 1 year ago
In the final stage it's showing a "connection errored out" popup message, bro. Been waiting more than half an hour.
@eliasdaviddiazfrancisco5341 1 year ago
Nice, but the index file does not appear in logs. It's the only problem I have had.
@Jarods_Journey 1 year ago ⁺¹
Try clicking the middle button that trains the feature index.
@senerio2124 1 year ago
Is there somewhere we can download voices that others have already trained?
@zacharyreid7557 1 year ago
I was getting lots of bugs on my Windows laptop with AMD hardware, so this is a good alternative.
@VinixTKOC 1 year ago ⁺¹
Until yesterday it was working, now it isn't working anymore. The main errors that I noticed:
"error: subprocess-exited-with-error"
"Building wheel for pyworld (pyproject.toml) did not run successfully."
"Building wheel for pyworld (pyproject.toml) ... error"
"ERROR: Failed building wheel for pyworld"
"Failed to build pyworld"
"ERROR: Could not build wheels for pyworld, which is required to install pyproject.toml-based projects"
"ModuleNotFoundError: No module named 'faiss'"
@TheBestKindOfFailure 1 year ago
That issue arose for me around the exact same time. Cannot get it to build that wheel all of a sudden, but it's not a "ModuleNotFoundError" with me.
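For anyone trying to narrow down errors like the ones quoted above, here is a minimal sketch (not a fix) that reports which modules are missing from the current Python environment. It only uses the standard library; the module names `pyworld` and `faiss` are taken from the error messages in this thread, and `missing_modules` is a hypothetical helper name chosen for illustration.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling `missing_modules(["pyworld", "faiss"])` in a Colab cell would show which of the two is actually absent, which helps tell a failed wheel build apart from a package that was never installed.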
@Somebodythatoverthinks 1 year ago
How long should one wav voice sample be for training? And should it be isolated vocals?
@Jarods_Journey 1 year ago
Isolate vocals and each audio sample needs to be 10 seconds or less.
@forest1605 1 year ago
@@Jarods_Journey do you mean 10 seconds or more?
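As an illustration of the "10 seconds or less" advice, here is a rough sketch that cuts one long WAV file into consecutive chunks using only Python's standard `wave` module. The function name `split_wav` and the output naming scheme are assumptions made for this example, not part of RVC. It cuts at fixed intervals rather than at silences, so words can be split at chunk boundaries.

```python
import wave
from pathlib import Path


def split_wav(src, out_dir, max_seconds=10.0):
    """Split a PCM WAV file into consecutive chunks of at most max_seconds."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        frames_per_chunk = int(params.framerate * max_seconds)
        index = 0
        while True:
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break  # end of the source file
            chunk_path = out_dir / f"{Path(src).stem}_{index:04d}.wav"
            with wave.open(str(chunk_path), "wb") as writer:
                # Reuse the source format; the frame count is corrected on close.
                writer.setparams(params)
                writer.writeframes(frames)
            written.append(chunk_path)
            index += 1
    return written
```

For example, `split_wav("voice.wav", "dataset")` would write `dataset/voice_0000.wav`, `dataset/voice_0001.wav`, and so on, with the last chunk holding whatever audio remains.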
@naveenkumar2234 1 year ago
Can I use this on my Mac running macOS Mojave?
I'm using an iMac.
@riniraw 1 year ago
First of all, thank you for the video.
Some questions:
1. I can't download files from Google Colab, and I have no such icon in my Chrome browser.
2. When I try to restore the paths of my models, it says that the directory couldn't be found... I'm not sure why. Everything seems to be correct.
3. How do I take the model that I've trained in Google Colab and run it locally? Where do the model files need to go?
Thanks again :)
@Jarods_Journey 1 year ago
1. Not sure what's happening, but right-clicking on the file and downloading can work in this case.
2. Make sure the path is entered correctly and that there are files in the Google Drive folder. If they did not export, this may be the cause.
3. To run local models, you might wanna see the local version of this tutorial to get insight on that process.
@gmod92 1 year ago
Is there a way to use the model you train here in a normal text to speech engine? If yes, can you point me in the right direction for that?
@Jarods_Journey 1 year ago
Sorry, unfortunately not. You'll wanna check out my video on the Tortoise TTS to RVC pipeline.
@trush1090 11 months ago
I trained it for 200 epochs. How do I reuse that model to train it to 400 without having to start from 0 epochs again?
@elntslents6529 1 year ago
Why can't I find the trained IVF 201? 11:24
@Jarods_Journey 1 year ago ⁺¹
This is missed in the video, but check the pinned comment of that video!
@elntslents6529 1 year ago
@@Jarods_Journey okay, but what if I already separated the vocals with a website called vocal remover? When I hit refresh it gives me an error. 14:50
@Not_AnubhavXD 1 year ago
Hello, it says file not found error at pretained / f04D0k.pth, please solve my problem 😢
@1000trilliondollars 1 year ago
Hello. Thanks for sharing this very helpful video. I followed every step carefully. I tested the training with a prepared dataset consisting of two ".wav" files (human voice A). The total length is 8 minutes. The total training time I observed on Colab was only 10 minutes, which is very fast. After converting, I see the best results are from female to female (about 70% similar to voice A). From male to female, it is very bad. I want to ask: how many minutes should the total length of the dataset be for the best effect? Thank you.
@Jarods_Journey 1 year ago
You may want to train longer. 10 minutes of high-quality data is good to start with, but you might want to add more if you're seeing inadequate results.
@gkiss2030 1 year ago
@@Jarods_Journey I think it makes no sense to record only about 1 minute for the dataset and then copy that 10 times over, right? :)
@ZaiMatoro 1 year ago
Hi, your tutorial helped me a lot (I just need to find a way to get the perfect model training to avoid all that "glitch" stuff; I actually tried setting the epochs to 200).
The thing is, can I download the voice model (or is it the files ado-japanese_D_935.pth, ado_japanese_G_935.pth and ado-japanese935.pth + the added_IVF file and total_fea.npy) to then use the model in the RVC web UI (the .bat method), and also be able to use the voice model on other websites / software?
@Jarods_Journey 1 year ago
Appreciate it! I'm slightly confused by your question, but if you want to be able to use these models anywhere, you need the one that's in the weights section and the trained index file in the logs section. You don't actually need the G/D ones in RVC except for training, including resuming training at a later time.
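To make the "weights plus index" advice concrete, here is a small sketch that looks up the two files needed for inference. The folder layout (`weights/<name>.pth` and `logs/<name>/added_*.index`) is an assumption based on the names mentioned in this thread, and `inference_files` is a hypothetical helper; adjust the paths to match your own Colab session.

```python
from pathlib import Path


def inference_files(rvc_root, model_name):
    """Locate the weight file and feature index for a trained model."""
    root = Path(rvc_root)
    weight = root / "weights" / f"{model_name}.pth"
    log_dir = root / "logs" / model_name
    # The index is only present if "Train feature index" was run.
    index_files = sorted(log_dir.glob("added_*.index")) if log_dir.is_dir() else []
    return {
        "weight": weight if weight.is_file() else None,
        "index": index_files[0] if index_files else None,
    }
```

A `None` value for `"index"` would suggest that the feature index step was skipped, which is the situation described in the pinned comment.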
@NamDinh-b3u 1 year ago
Why did I set the epoch to just 100, but the G and D files go up to a few thousand? Which one is better?
@Jarods_Journey 1 year ago ⁺¹
The RVC epoch is actually weird in the way they named it, 100 actually means 10k epochs. The G and D files have a few thousand due to this and are named for the step count it takes to go through the data. If you're familiar with sovitssvc, this might make a little more sense. You use the G for any type of inference.
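As a rough illustration of why a step count can be much larger than the epoch setting, the sketch below uses the usual relationship between epochs, dataset size, and batch size. It assumes the number in the G/D filenames is an optimizer step count, as the reply suggests; the clip count and batch size are made-up example values, not RVC defaults.

```python
import math


def total_steps(epochs, num_clips, batch_size):
    """Number of optimizer steps after training for the given epochs."""
    # One epoch is one pass over the data, i.e. ceil(clips / batch) steps.
    steps_per_epoch = math.ceil(num_clips / batch_size)
    return epochs * steps_per_epoch
```

With these hypothetical numbers, `total_steps(100, 230, 8)` gives 2900, so a checkpoint number in the thousands after 100 epochs would not be surprising.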
