I dunno if you can add subtitles or notes to a video that already exists but...
KimVocal2 + VR DeReverb + VR DeEcho Aggressive + VR 5HP Karaoke work amazingly to isolate the main vocals.
I can't seem to use ensemble to do all 4 simultaneously to get one output (I can do 2 at a time, and I get multiple outputs rather than an output that used all 4 of the above effects)
A good tip I don't think you mentioned that is awesome is that you can load as many audio files as you want, so if you're taking a voice from an album you can load the entire album in and apply kim vocal 2 to all songs, then apply dereverb to all the kimvocaloutputs, etc.
(I'm using UVR5 separately, it doesn't seem to work in my RVC gradio)
best i can do is pin this comment!!
Hey, would you mind sharing what aggressiveness did you find to work the best on the VR models?
@@bogobogobogo542 it's not one size fits all apparently. To streamline it do like 10 songs at a time for testing
Speed info:
3090 24gb - batch size 36
allocates 22gb - around 17 seconds per epoch
hey sorry for replying 4 months later, but do you perhaps remember how many sound files you were using when you did this? I used around 370 (less than 10 sec) clips of me singing and set epochs to 1000. Left it for the whole night and more and it's still "processing" (9+ hours). That's normal right?
@@lostdreamer50 I'm late, but generally you don't need much above 200 epochs. Also, the console window should show which epoch it's currently processing
There is no .pth file after training. I see there are index files only, no ckpt either
when I press train model, no .pth file shows up. Do you know what I might be doing wrong or where they could have been exported?
Bro.... any solution? I have the same problem
no one's answered so far
@@magmad1ver yea bro, I completely gave up on the AI voice model stuff a while ago. Too much bs to work with and no one giving answers on how to make it work 🤷🏻♂️
@@slevpy the .pth is in the weights folder
Even after completing all the steps, I can't see any .pth file for the vocal I trained... I can see only the index file.. help me with that
same issue here
Fantastic tutorial, straight to the point with things. Now I'm just waiting for the training to end. 15 minutes of cleaned up voice, 500 epochs been 24+ hours 😅😅
Massive EDIT (If you are facing some issues like thinking the training is taking too long/stuck/suspended)
At first I thought the training just needed a long ass time. I left it to "work" for over 48 hours, only then did I check the command line, where it clearly stated that it was still at epoch 13, after this long, the program must have bugged out or something.
What I am doing now is saving a checkpoint every 5 epochs instead of 50. Since my first attempt did not even reach 50, it did not save, and I had to restart my model from scratch.
The point of this post is that YOU CAN RESUME training a model anytime you choose, as long as you have those checkpoints. (I did not know this, which made me hesitate to turn off my PC or check the command line before)
e.g. I want to train my model for 300 epochs, and it bugs out / stops for whatever reason at epoch 273. Stop the program, and restart it using the same parameters you input when first training the model. It will load up the checkpoint at epoch 270, since I am saving a checkpoint every 5 epochs, so the last checkpoint would be at 270.
I noticed it would bug if I use too many threads and a large batch size.
For my specs, 9 threads and a batch size of 5 are optimal. 1 epoch takes around 4.5 minutes
Trained a model for 300 epochs using around 50 minutes of sliced up audio. It took around 23 hours.
Specs for those who are curious:
i7-8700
GTX 1080
Using 8 threads to process
A little late, but thank you for your service sir o7. Us low-spec users thank you.
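Since resuming comes up a lot in this thread: before restarting a run, it helps to check what actually got saved. Here's a rough Python sketch for that - the logs/<experiment> layout with G_*.pth / D_*.pth checkpoints and a train.log is an assumption based on what commenters report seeing in their logs folder, and "myvoice" is a placeholder experiment name.

    from pathlib import Path

    exp = Path("logs") / "myvoice"   # placeholder experiment name

    # list whichever generator/discriminator checkpoints have actually been written
    for ckpt in sorted(exp.glob("[GD]_*.pth")):
        print(ckpt.name, f"{ckpt.stat().st_size / 1e6:.0f} MB")

    # show the last few log lines that mention an epoch, if train.log uses that wording
    log = exp / "train.log"
    if log.exists():
        lines = [l for l in log.read_text(errors="ignore").splitlines() if "epoch" in l.lower()]
        print(*lines[-3:], sep="\n")

If the newest checkpoint is far behind where you expected, restarting with the same settings (as described above) beats letting it keep "working" for another day.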
A note for everyone using this and getting an error, make sure there are NO SPACES in the names of your speaker, folder, etc! I don't know why it worked for p3tro and not for me, but this stumped me for a sec
Can you make a video on how to continue training the model. Covering things like adding more audio files to the dataset, and running more epochs on the model. Thanks!!
Currently in the process of training my first model, thanks to you. Thanks P3tro
for those of you having trouble, he skipped over a bunch of important shit and got some things wrong. First off, you can't load those .pth files (he shows that it generates in that folder) in a voice changer client, you have to load the one in the "weights" folder.. you'll notice it's actually named what you named the experiment... second, when you click "train model," that doesn't generate the index files. You can either click "One-click training", or train it, then click "train feature index." Some more tips: I've had great quality using 700-1000 epochs and 10 minutes of clean audio separated into 10 second chunks. Don't use version 1 like he does, use version 2. Using more than 10 minutes of audio seems to slow the training by a lot and doesn't help with quality much. You can't use too many epochs though, it just won't get any better past a certain point. An RTX 4070 with 12 gigs of VRAM took about an hour and a half to do 700 epochs.
thanks for the tips man
@@thegtlab np
I can't seem to get the index file even doing your method.
@@Thyronix Hi, so after training we use the .pth file in the weights folder? And for the index you said to click train feature index to get the index file? Where would this file be located once completed?
@@PACKAGEMODS the model once completed saves to the weights folder, the index file saves in logs/project name
now that Google is cracking down on RVC, this is a good alternative that they CAN'T take from us.
Thank you so much for this video! I spent days researching on TH-cam on figuring out how to generate and train AI voices because I wanted to create my custom made voice, but this one had the most clear and straightforward instructions that finally got my project to work. I almost gave up until I saw this video ngl lol
Hi P3tro... I was wondering if it is possible for us to train a model, then stop... then continue the next day... I was aiming for 1000 or more epochs. I guess my question is, does training have to be done in one go? Or can you stop and continue? Thanks P3tro!
"Hello. Today I created my first model thanks to this tutorial. I used a 5900x processor, an RTX 3060 12GB graphics card - as training material, I used 25 minutes of an interview, resulting in around 150 samples of about 10 seconds each. The model took about 7 hours to train with 500 epochs. Unfortunately, I didn't notice an option to stop the training anywhere."
Hey guys, I'm running Step2a of training in the RVC webpage under "train" and I keep getting the error "FileNotFoundError: [Errno 2] No such file or directory: 'C:\\RVC-beta\\RVC-beta0717/logs/kara/preprocess.log'" - I copied and pasted the path of my vocal sample (folder is called "KaraVoice") and this error keeps coming up. I move the folder and copy the new path and the error still comes up. Do I need to do some sort of preprocessing?
Could anyone out there help me please? When I click on "Train Model", this message appears: 训练结束, 您可查看控制台训练日志或实验文件夹下的train.log
(Translation: Training finished. You can view the console training log or the train.log in the experiment folder.) Does this mean it is training the model? Because nothing appears in the console.
u know what... best YTchannel ever!
Is there a way to do this on CPU-only (AMD device)? I tried setting the GPU argument to blank, but it's just frozen in the console on the python runtime arg
I've been watching your tutorials, you're a god! Thanks a lot.
Glad I could help!! :)
With the dataset, what's the rule on the clips and the ten seconds? Is it strictly 10 seconds? Should I go 8 seconds if the last two seconds are scream where the first 8 were sung.. Should I aim to divide up the different vocal tones or ways of singing the singer can/does sing from clip to clip, or should I just cut a studio vox into 10 sec clips til the song is over..
Looks like it's kinda answered for me, UP TO ten seconds, but still keen to know if there is a benefit to cutting and sorting the vocals based on tone, sing, scream, that kinda thing..
Dunno why but it keeps saying my Nvidia drivers are old, even though I updated everything 20 mins ago... help
I didn't get any index files and I got two weirdly named .pth files, D_2333333 and G_2333333
When I go to the logs folder, I don't have all of those audio files, I just get a preprocess.log file, even though I got all "success" in the command prompt and "end preprocess" lol.
i got the same problem, did you find a solution?
@@matttdk4888 Yes, one of my files, for example "audio samples", had a space between the words, so I just changed it to something like "audiosamples", but I think "audio_samples" would work too, just no spaces. I don't know the rules for spaces but generally I try to avoid them now.
I have a folder with a single .mp3 that I wanna train my model from but when I copy the file path and click process I get some error about "\RVC/preprocess.log'" doesn't exist (which it doesn't. Should it?) Anyone know how to fix this?
I get the same error no still haven't found a fix
@@HOTTAZHELLENT Did either of you find a fix? I have the same issue.
Nothing happens when I click on "process data"..... got this error in the console: "runtime\python.exe trainset_preprocess_pipeline_print.py F:\Voice AI\train 40000 8 F:\Voice AI\RVC-beta\RVC-beta0717/logs/rainzo False
Traceback (most recent call last):
File "F:\Voice AI\RVC-beta\RVC-beta0717\trainset_preprocess_pipeline_print.py", line 8, in
sr = int(sys.argv[2])
ValueError: invalid literal for int() with base 10: 'AI\\train'
could you solve it?
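For what it's worth, the traceback above is consistent with the space in "F:\Voice AI\train": the path gets split into two arguments, so the slot the preprocess script reads as the sample rate receives 'AI\train' instead of '40000'. A tiny stand-alone illustration (not RVC's own code):

    import shlex

    cmd = r'trainset_preprocess_pipeline_print.py F:\Voice AI\train 40000 8 logs/rainzo False'
    argv = shlex.split(cmd, posix=False)   # how the command line gets split
    print(argv)
    print(argv[2])   # 'AI\train' -- int() on this raises the ValueError above
    # Fix: rename the folder so the path has no spaces, e.g. F:\VoiceAI\train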
I used around 370 (less than 10 sec) clips of me singing and set epochs to 1000. Left it for the whole night and more and it's still "processing" (9+ hours). That's normal right? I guess I kinda overdid it with the amount of clips? I do have an rtx 3080 though. In the lower right corner it now says "processing | 46185/42.3" the left number's increasing quickly
Hi! How long did it take you to finish the training? Mine is currently in the same situation, it's taking so damn long.
@@hungvominh2490 hey can't remember exactly but it took the whole day or so. I may have overtrained it. Must try again with tensor
To get copy as path on windows 10, you have to hold shift for the menu to come up
For some reason I get a filenotfounderror [Errno 2] No such file or directory.
help! It says Errno2, no such file or directory! what do I do?
Anyone else have the same problem as mine at step 2a? The "end process" message never appears.
This happened with some of my audio datasets, but not with others. So I have no clue what's actually wrong :(
This usually has to do with naming conventions, the size or volume of the audio, or the location of the files! Try and put the dataset folder in your RVC folder itself, or try and make sure the clips are longer than 3 seconds or so and shorter than 10, then make sure they are all at least around -6 dB in volume!
@@p3tro wow thanks again. I have tried changing the name (but I'm still not sure what the criteria is), and I already tried changing the location (but that doesn't seem to be my problem, because a dataset that worked before still works; it's only the latest ones that have the problem). About the length, I'm not sure: does the inconsistency matter (some are shorter than 10s, some are 10s, and some are longer than 10s or even around 20s)? In later attempts I had already removed the ones that exceed 10s and it's still not working, so does that really help? But the size or volume of the audio, hmm, I never realized I was unconsciously avoiding thinking about that. Does the AI not fully support belting or loud voices? Big THANKS to you for giving me hopefully helpful ideas!!
@@qwerty0yt glad to help as much as I can!! These are just tips that helped me, I agree sometimes it still just randomly ignores some files or won’t end process 😭the ones above 10s are fine they will just get ignored! I know that when I used a limiter to change the volume of all my samples it accepted them and it fixed the preprocessing for me. But that isn’t always the case, just something that worked well for me ! It can be very finicky, like you said sometimes it will run just fine and other times it won’t! Even switching browsers by pasting the URL into another browser randomly helped me before 🙏 Hope you get it working!
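If you'd rather batch the clip-length and volume prep above than do it by hand, here's a loose sketch of one way - pydub is just my pick of helper library here, not something the video or RVC requires (it needs ffmpeg on PATH for mp3 input), and the folder names are placeholders:

    from pathlib import Path
    from pydub import AudioSegment

    SRC = Path("raw_vocals")   # placeholder: folder of long acapella files
    DST = Path("dataset")      # placeholder: folder you point RVC's training tab at
    DST.mkdir(exist_ok=True)

    CLIP_MS = 10_000           # keep clips at or under ~10 seconds
    TARGET_PEAK_DBFS = -6.0    # roughly the "-6 dB type of volume" mentioned above

    for f in sorted(SRC.glob("*.wav")):
        audio = AudioSegment.from_file(str(f))
        # nudge the peak toward -6 dBFS so quiet takes don't get skipped
        audio = audio.apply_gain(TARGET_PEAK_DBFS - audio.max_dBFS)
        for i, start in enumerate(range(0, len(audio), CLIP_MS)):
            clip = audio[start:start + CLIP_MS]
            if len(clip) >= 3_000:   # drop fragments under ~3 seconds
                clip.export(str(DST / f"{f.stem}_{i:03d}.wav"), format="wav")

A limiter in your DAW (like the one mentioned above) does the volume part more gracefully; this is just the quick-and-dirty version.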
It says my gpu is not supported and I can't use feature extraction
My GPU won't work because it's AMD, what do I do about AMD?
Hello. The .pth is formed during training but the voice sounds like the original??
Hey I had an error that I think I fixed very stupidly. It would not start processing my audio samples, and it was an error ChatGPT couldn't help me with. What fixed it was renaming both the source folder that the whole program was in and the folder that my samples were in. Both names previously had spaces in them ("AI Voice" etc.); I just left out the spaces and it worked without an issue
Dude thank you! This fixed the errors I was getting during processing too.
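If you want to rule that kind of thing out up front, a quick plain-Python check of the dataset path (placeholder path below) will flag spaces or odd characters in the folder names and confirm the .wav files are actually where you think they are:

    from pathlib import Path

    dataset = Path(r"C:\RVC-beta\mydataset")   # placeholder dataset folder

    for part in dataset.resolve().parts:
        if " " in part or not part.isascii():
            print("suspicious path component:", repr(part))

    wavs = list(dataset.glob("*.wav"))
    print(f"{len(wavs)} .wav files found in {dataset}")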
So my first training took 12+ hours. Changed it to 24 threads (I have a 5900x) and the next one was done in 5 hours lol. Man, my fans were going crazy for 12 hours that first time 💀💀💀
What gpu you have ?
what do you mean you changed the threads? how do i change them?
I cannot find the .pth file. It just doesn't give it to me
I fucked up this part by not mentioning it, but if you refresh your voice list you will see the .pth file, but the actual shareable version will be in the "weights" folder! When I made my models I wasn't sharing them so I didn't realize there were separate files!
@@p3tro I refreshed my list and the .pth file still doesn't spawn
@@locmonstr are you sure the training was successful? It should already appear in weights after your initial training - the refreshing voice list was so it would appear in the GUI as an option for available models
@@p3tro Yeah the training says it was completely successful
@@locmonstr did you use one click training or just regular training? And are the D_ and G_ files and index files in the log folder?
This guy chill asf damn. Thanks for the tutorial man!
I was just thinking it would be nice if p3tro recorded a guide on RVC, and then the video comes out. Magic.
I must have felt your thoughts from here! :)
@@p3tro Definitely! I just mastered svc from your guides, and I have already heard about the advantages of rvc and am ready to study your new content using it. Thanx for your work!
Hi p3tro! Great stuff works for me as explained. I'm blown away. Thank you for this. Will you do a tutorial about the 'ckpt Processing'? I'd be very interested to hear what this would sound like. Tx
there's a realtime GUI bat and I'm curious if you'll ever cover that?
After I hit the "Train model" button, the "Output information" box displays "After the training is completed, you can view the console training log or the train.log" so there's no error and I think I've done everything correctly but after 10 hours, STILL no .BAT or .INDEX files. How long does this thing take?
is it mandatory to have a gpu for training?
What should i do when it says "connection errored out"?
7:04 I have 2 .pth files, "G_2333333" and "D_2333333"
same. Does it go "No eligible .pth files found" if you try to use any of them as a model?
upd. at least in my case, the proper ".pth" is in the weights folder.
@@Tyulenin Yes, I also found the right file 👍
well where is it?@@LeGnocchi
Tried on 3 different PCs and every time I hit the process button I'm just getting these same type of messages each time
' File "C:\RVC-beta-v2-0618\my_utils.py", line 14, in load_audio
ffmpeg.input(file, threads=0)
File "C:\RVC-beta-v2-0618\runtime\lib\site-packages\ffmpeg\_run.py", line 325, in run
raise Error('ffmpeg', out, err)
ffmpeg._run.Error: ffmpeg error (see stderr output for detail)" Any idea bud?
If it's not working on all 3 PCs it's definitely a problem with the installation, unless all 3 PCs somehow don't have powerful enough GPUs. But the processing features part uses CPU anyway. Is it installed on your main hard drive? Sometimes even moving it to the very root of the drive instead of documents or other folders can be helpful.
The threads = 0 seems to imply it set your CPU threads to 0 which would mean it isn’t using your cpu at all. Where it says “threads” try adjusting that number to 8
@@p3tro thank you for replying , ill try them keep up the good work ))
@@SomedayTomorrow369 hope you get it working! Let me know if you're still getting the error, I can try and think of a few more reasons why it could be happening!
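One more thing that can help with "ffmpeg error (see stderr output for detail)": decode one of your training files the same general way the preprocess step does and print the stderr, so you can see the real complaint (bad path, unreadable file, ffmpeg missing). This mirrors the idea only, not RVC's exact code, and the file path is a placeholder:

    import ffmpeg   # pip install ffmpeg-python; ffmpeg itself must also be installed

    test_file = r"C:\dataset\clip_001.wav"   # placeholder: one file from your dataset

    try:
        out, err = (
            ffmpeg.input(test_file)
            .output("pipe:", format="f32le", acodec="pcm_f32le", ac=1, ar="40000")
            .run(capture_stdout=True, capture_stderr=True)
        )
        print(f"decoded {len(out)} bytes OK")
    except ffmpeg.Error as e:
        print(e.stderr.decode(errors="ignore"))   # the actual reason ffmpeg failed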
hi! When I am training the dataset, a blue screen pops up on my PC saying "Your device ran into a problem and needs to restart". I have a GeForce 3060 4GB + 16GB of RAM so idk what is happening :(
hi, I've been having a problem for the past 2 days now. I don't know why, but everything shows up except my .pth file in the logs
ValueError: invalid literal for int() with base 10: 'Video' Help Me
What is the solution to the problem ? FileNotFoundError: [WinError 3] The system cannot find the path specified: 'G:\\RVC-beta\\RVC-beta-v2-0618/logs/myvoice/3_feature768
Great video! Where can I find the index file? For me it is not saved in the log folder.
Ah you also need to click on train feature index...my bad 😆
@@johnbob70 nah that’s my bad I totally forgot to mention this 😭🤣
@@p3tro I'm having the opposite problem: It's creating an .index file for me but no .pth file. Any ideas?
@Tech Crew Guy I have seen this issue be fixed by using "one click training" I believe, but one issue could be having it installed on an improper drive - like say your external hard drive as opposed to the same drive as your operating system. Also - check your weights folder, this is where your shareable model will be. The model in logs is for use with your RVC only
@@p3tro Dang... for whatever reason my weights folder is empty even after everything finishes
For some reason I get a filenotfounderror [Errno 2] No such file or directory.
Any one any idea how to fix this?
Ran a model, trained it for 500 epochs with no errors. .pth files created but no Index files. Any reason why?
dude I am getting a file not found error in the output after pasting my dataset path and processing.. Help please!!!
whenever I hit "one click training " I see no error, it says process completed but I don't get .pth file, I only have log files. can you please help me regarding this issue?
man, thanks so much for this informative video!! just one question, why did you choose 40K as the sample rate over 48K?? I think 48K would deliver better quality sounding vocals?
48k sometimes has problems processing when I do feature extraction, at least in my experience!
@@p3tro what exactly is feature extraction? btw I would like to run an experiment and try with 48K just to try to get better sound quality...
If you DO have a .pth file but no index file, don't worry. A fork of RVC called 'Replay' makes it so that you don't need an index file, just the .pth file.
THank you! i was very worried and about to restart the entire process
Running into this error on step 2a - "ValueError: invalid literal for int() with base 10: 'set'." Can anyone help?
When I train my own models, I don't get any index files, just .pth files. Anyone know why?
It should auto generate, but often times you have to "train feature index" to get it to spit out, which is right next to "train model". If you already made the model just make sure the name and settings are the same, and it will still generate the correct index file
Thx I had the same issue and was very disappointed when I didn't see them. @@p3tro
When the last step is running (making epochs) my display turned off.. but the PC is running. Num lock and caps lock are working.. Is that normal or what?
your display went to sleep lmao
has that never happened to u before
I have no clue where the files could have been spat into. you uh.. did not show us the file location
10+ hours training, 124 files from 5 to 10 sec. Processing 55200/139.7 . Im close to the ending? haha . Thanks for the tutorial bro
When I train my model, I get two .pth files: one starting with D_ and the other starting with G_. It also generates two .index files, one starting with added_ and the other with trained_. Which ones do I use?
I had the same thing, but I found the right .pth file in the weights folder and then my RVC worked
Suppose I am done creating the model, can I make use of that model as a TTS voice pack?
Mine has an issue with locating the file at step 1, resulting in the file not found error. Any help?
After I click 'Process Data' the 'Output Information' block just flashes once and shows no data. When I check the log folder, it only contains an empty 'preprocess.log' file and no other .wav files. Any ideas?
Update: moved RVC directly to the C drive (had it in Program Files). Now it works.
For those having problems with step 2a having a blank preprocess note or processing data with blank output information... I think you have to make sure that the training folder doesn't have spaces or probably even symbols... Hope this will help as it did for me.
Yo can I dm you bro?
Hi, I have tried to train a model followed your instruction. But when it finished, I can't find the .index file anywhere... Can you kindly help me out..?
Click „train index” button
@@sebastianszwarc4162 It doesn't work for me
Hey p3tro! Shout out to you btw, great tutorials. I've been working on a few A.I. songs just to listen to personally. How do you actually get the best quality vocal takes from the svc? Is it a certain way the vocals should be recorded? Should I upload the vocals with/without FX? Because I've found a bit of compression gets rid of those silent bits the A.I. generates for the quieter parts of the vocals. How would you go about getting the best quality out of it? Loaded questions, I know.
Always upload the vocals as clean acapellas! The volume matching feature in RVC can also be turned down for quieter vocals to more exactly follow the vocals, for instance it starts at 1 by default - turning it around .35 or .45 for quiet vocals will make it follow them more specifically!
The style of vocal recording depends most on the style of the model, try to match what you want the model to sound like in your own delivery so the model has to do less work for you!
For example, the best lil Uzi AI songs use lil Uzi covers as the input since the artist (Uzi Clone) sounds so much like Uzi already!
@@p3tro I really need your help! If I have my own custom trained model but I want to add new data to it to improve it, how can I add this data (new audio files) without needing to create a new model from scratch? PLEASE answer me, because every time I want to add new audio I end up making a new model. Thanks in advance!
@@hanynagy8969 Did you get the answer?
How can I continue training my pre-trained model later? I want to train it for 20000 epochs for the most realistic quality, but I need to run the same pre-trained model day by day.. Is it possible on the EasyGUI Colab? I have Colab Pro
Maybe I'll get the answer later but,
Why does every tutorial I've seen talk about Google Colab struggling with GPU limits, when you can just do it locally like you did???
Hi, how are you?
Nice video.
Where can I find the download for the training program?
Hello! How can I set up RVC so that when I train it uses the CPU instead of the GPU? It doesn't support my GPU, that's why I'm asking.. thanks
"No eligible .pth files found". Am I dumb?
upd. at least in my case, the proper ".pth" is in the weights folder.
Thank you, great tutorial! hard to find information about the local version of RVC. Do you need to hit the "Train feature index" button too in order to get the proper files created in the folders?
It's supposed to be a feature of the train button, but the FAQ reports it's often necessary to hit "train feature index" regardless when the index file doesn't appear! For me, whether or not I need it is hit or miss
@@p3tro Okay thank you for all the knowledge you're making available! In case I have to click it, is it a long process too?
@@JimBaker-ks4io Not at all!
@@p3tro Thank you! Don't know if you can help with this. My dataset is 1 hour long. Once RVC starts training, the program crashes after 3 minutes. It even kicks me out of the windows desktop, and when I come back to the session, RVC tells me there was an error.
My settings are 200 epochs, GPU batch size set to 4.
I tested two other models today and the whole process worked fine... I never used more than 30 minutes of audio though, and didn't go higher than 50 epochs.
Perhaps the amount of data is too much to begin with? Is one hour overkill?
@@JimBaker-ks4io one hour can work, it just has to be prepared properly, but to be honest your best bet is 10 minutes or so of the HIGHEST quality, anything even slightly low quality should be removed, it will work great with 10 minutes of high quality acapella. For me I aim for 500 epochs at 10 minute data sets
Is it really necessary to cut it into 10s clips? I thought the preprocessing was taking care of that.
Watched many videos, but clipping the audio into 10-second files and a low batch size fixed my error on a GTX 1650 Ti (batch size 2)
getting this error on Training
ValueError: invalid literal for int() with base 10:
what is this?
figured out I had to remove spaces from the folder names and that fixed it.. now I have this error
RuntimeError: Failed to load audio: ffmpeg error (see stderr output for detail)
end preprocess
can someone help?
Hey great vid thx, should audio clips be in mp3 or wav? Or does it not matter? thx
problem, there is nothing when I click on process data at the first and second steps
is there anywhere I can find .wav files to see examples about how would be the ideal type of samples I should provide my AI so it learns a more wide register of my voice?
There's a python program you can download called instant dataset maker. You load your audio files into the audio folder of the download, then run the application and it will create 10 second clips of all the songs you put in. You just save the zip wherever you want, extract it, and point RVC at it in the training tab
Why do you need 10 sec? Why not 1-2 hr single file?
Hey, thanks for the video. I ran into an issue where in step 2a where it says "Enter the path of the training folder:" no matter what file path I give it it will not process data. It cannot find the file path in the terminal every time.
This happened to me when I left text in the example file path area! Make sure you leave the area blank where it has the example file path, if that makes sense
@@p3tro Thanks for the reply, I've been searching all day but I'm still not exactly sure what you mean. I also have been using chatgpt when I first started with this stuff, but it wants me to edit the code in the terminal which I am not allowed to edit anyway. I end up with a "FileNotFoundError: [Errno 2] No such file or directory:" I simply can't figure out how to upload my .wav and get it to actually work.
@@RoyalFlush007 there's a way to type your file path for your wav in; copying and pasting the address instead of actually uploading the file should fix it, if that's the only error
@@p3tro I have been copying as path and putting that result in the box under "path to training folder." Is there actually a place to upload the file as well?
is there an easy way of mixing vocal & instrumental back together? Doing it in audacity is a bit of a faff. Something more automated would be good.
I do not get all the required audio files after training model, and 0 people are telling me what to do. plz help
What files are you missing?
@@p3tro I do not have the .pth file and I only have 2 events.out.tfevents files instead of 4
@@matttdk4888 if you're not getting an index file - try "train feature index" - if you're not getting a .pth file in logs, check your weights folder. "Refresh voice list" to see if it appears in your voice list; if not, you did something wrong. If so, the file should be in weights. The whole thing's very finicky - it could be a failure during training due to a variety of reasons - the location of your RVC files, the location of your dataset, the name of the dataset having spaces can even mess it up sometimes, the dataset itself (though it's not likely that), your thread count being set incorrectly, batch size being set too high for your GPU - though most of these things should give back an error in the command line to help you troubleshoot.
If there are no errors, try "One click training" instead of doing individual training; if you were doing "One Click Training", then try doing individual steps. If neither are working, and the training itself isn't giving any errors, it could be a problem with the initial setup itself.
Good things to let people know to help people troubleshoot include: your operating system, your CPU and your GPU, the last action before your issue occurred, and any errors from the command line.
can I use RVC trained models in Tortoise TTS and vice versa?
Does the dataset audio actually have to be divided into 80-100 clips or can the audio be one long, continuous track?
Segmented clips are better.. they'll save up some memory..
Get ready to learn Chinese buddy
Super curious to find out how well this works with speaking clips instead of singing ones.
I know most use cases for that are probably not good ones, but I have a custom edited Azure voice I use for an AI that I want to train this to sing like.
I’m gonna try it but anyone with advice on this would be greatly appreciated.
Note: it sounded awful with 300 clips. I fed it different tones, pitches, and settings and it didn't work well. I'll retry with a lot more, but so far no dice
Few months later: training data was the issue. The voice is actually awesome now. However, I ditched Azure. Azure sucks.
When loading my .pth files i get an error. what could cause that?
Do the dataset clips need to be cut into 10s or can they be just one long clip?
2:35 pretty good considering the daft punk effect already on the vox, they're trained on voices usually
So I completed the training but didn't end up with any index files, well that's just my luck lmao
How do you detect and delete noise in FL Studio? I only see an option to denoise a noise sample. Thanks!
You use Edison, and highlight the noise you want to remove - then you click the toothbrush and click "detect noise" - then you can select all of it and click the toothbrush again and "denoise". I cover it in this video as well: th-cam.com/video/VaDhHIDOpPY/w-d-xo.html
after training the model, the .pth file is not showing
It says "Unfortunately, there is no compatible GPU available to support your training." No idea how to fix this issue :(
What gpu are you using? There should be a setting for gpu id you can experiment changing it! If you don’t have a nvidia gpu it is most likely not supported!
@p3tro oh, probably that's the reason ☹️ I use intel gpu. Do you mind if I ask you if there is any way to train rvc model using Intel? 🤔
Am I able to continue training an already trained voice?
Do you have to use clips of singing only or can you use clips of the person speaking also when making a model?
speaking works just as well
Is it possible to run this on Mac? I have an Apple Silicon Mac so I can't Boot Camp
Hi p3tro! I get an error when I try to train my own model. It says:
"torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 2.00 GiB total capacity; 1.63 GiB already allocated; 0 bytes free; 1.71 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF".
In the FAQ it said that for this problem I need to lower the batch size. I already set it to 1 in the training interface, and a config .json file inside my models folder, but it still shows me the error.
My computer has 8 GB ram,
GPU: Nvidia GeForce MX230,
CPU: Intel Core i5-8265U.
same here, did you manage to fix it?
@@dylandauphin4974 Nope, figured that my computer is not capable/powerful enough to do local training, so I switched to an online version of RVC.
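For anyone else hitting that OOM on a small laptop GPU: it's worth checking how much VRAM PyTorch actually sees before fighting the settings, since around 2 GB is very unlikely to be enough for local training no matter the batch size (which matches the conclusion above). A small check, plus the error message's own max_split_size_mb suggestion as a comment:

    import torch

    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        print(props.name, f"{props.total_memory / 1024**3:.1f} GiB VRAM")
    else:
        print("No CUDA device visible to PyTorch")

    # If you still want to try the error's own suggestion, set this BEFORE launching RVC:
    #   set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128   (Windows cmd)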
"RuntimeError: The expanded size of the tensor (12800) must match the existing size (0) at non-singleton dimension 1. Target sizes: [1, 12800]. Tensor sizes: [0]" - such mistake on last step...
Can I do the training on a laptop with AMD processor and gpu?
Did you find a way? 'coz I can't figure it out with an AMD GPU, it seems exclusive to NVIDIA