How to Fine-tune XTTS

  • Published on Oct 3, 2024

Comments • 84

  • @NickMak-m2c 18 days ago +3

    Does the colab notebook no longer work? I get this error in the first cell block:
    ERROR: pip's legacy dependency resolver does not consider dependency conflicts when selecting packages. This behaviour is the source of the following dependency conflicts.
    albucore 0.0.14 requires numpy>=1.24, but you'll have numpy 1.22.0 which is incompatible.

  • @corpse2222 6 months ago +7

    Not sure where or how to properly submit a ticket for the colab creators, but, the colab has been broken for weeks now. I check every couple days and it's only getting worse, stopping with errors sooner and sooner in the process.

    • @drewthomasson949 2 months ago +1

      I’m currently working on a fixed colab version, I’ll give you a heads up if I get it working
      Seems that we need to force downgrade all the packages that were changed in recent google colab updates

    • @drewthomasson949 2 months ago

      Fixed
      colab.research.google.com/drive/1sqQqzupo2pdjgggkrbM60sU6sBFYo3su?usp=sharing

    • @corpse2222 2 months ago

      @drewthomasson949 Sweet! Glad to hear it. And thanks in advance for all the hard work! :)

    • @itsanemu-e1d 1 month ago

      @drewthomasson949 Can you please share the colab? I can't run the second cell; it gives a bunch of errors:
      "ImportError: tokenizers>=0.19,

  • @theentirecircus6623 8 months ago +2

    Great video. Once the model is saved, can we run inference locally using the TTS module?

  • @filip1998220 9 months ago +2

    Is there an example of a narration script that produces the best cloning results? Perhaps one that includes all the phonemes?

  • @michal5869 10 months ago +4

    Are there any options for fine-tuning much longer, e.g. a few hours, for better results?

  • @aurelianobuendia24 7 months ago +2

    It would be amazing to know how to load the model in a local environment now that I've trained it.

    • @sayedyasser2 7 months ago +1

      Any idea on this? I've got the fine-tuned model and the JSON files. Where do I load it for future use?

    • @torusx8564 7 months ago

      @sayedyasser2 Download the model. You can transfer it to your Google Drive lol. If you couldn't, this video would make no sense.

  • @maker_pt 10 months ago +4

    It seems to work really nicely. But how can I run the model in coqui? E.g. using the python api or the tts server?

    • @YevgeniyChannel 9 months ago

      Me too.

    • @danemmer9686 8 months ago

      @YevgeniyChannel For the API, replace the files on your computer with the files you've downloaded.

  • @aigaming6310 7 months ago

    Great notebook & video!
    Unfortunately, it seems the notebook's Gradio UI does not support Thai yet.
    Could you please also guide me on how to do it in another language?

  • @GS195 9 months ago +1

    I want Prompt To Voice please.
    Do you know how hard it is to find a voice according to my specifications?

  • @AseelAl-khalaf 7 months ago

    I fine-tuned the model and saved it on my device, but every time I want to use the model I have to provide the speaker_wav, which comes from the data used for fine-tuning, and this process (analyzing the recording) takes a long time. So how can I use the model with my own speaker ID to avoid providing the speaker wav?
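
One way to avoid re-analyzing the reference recording on every run is to compute the conditioning latents once and cache them on disk. The sketch below is only an illustration under stated assumptions: `model` is an already loaded Coqui `Xtts` instance whose `get_conditioning_latents` returns a `(gpt_cond_latent, speaker_embedding)` pair of tensors, and the helper names `latent_cache_path` and `get_latents` and the `latent_cache` folder are hypothetical names chosen here, not part of the notebook.

```python
import hashlib
from pathlib import Path


def latent_cache_path(speaker_wav, cache_dir="latent_cache"):
    """Derive a cache file name from the audio bytes, so a changed
    reference recording never reuses stale latents."""
    digest = hashlib.sha256(Path(speaker_wav).read_bytes()).hexdigest()[:16]
    return Path(cache_dir) / f"{digest}.pt"


def get_latents(model, speaker_wav, cache_dir="latent_cache"):
    """Return (gpt_cond_latent, speaker_embedding), computing them only once.

    Assumption: `model` is a loaded Coqui Xtts model.
    """
    import torch  # imported lazily so latent_cache_path works without PyTorch

    device = next(model.parameters()).device
    cache_file = latent_cache_path(speaker_wav, cache_dir)

    if cache_file.is_file():
        # Fast path: reuse the latents saved by an earlier run.
        cached = torch.load(cache_file, map_location=device)
        return cached["gpt_cond_latent"], cached["speaker_embedding"]

    # Slow path: analyze the reference audio, then store the result.
    gpt_cond_latent, speaker_embedding = model.get_conditioning_latents(
        audio_path=[str(speaker_wav)]
    )
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    torch.save(
        {
            "gpt_cond_latent": gpt_cond_latent.cpu(),
            "speaker_embedding": speaker_embedding.cpu(),
        },
        cache_file,
    )
    return gpt_cond_latent, speaker_embedding
```

The returned pair can then be passed to `model.inference(...)` in place of recomputing it from the wav file each time.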

  • @pylotlight 6 months ago +1

    Got kicked out by google due to free tier limits..

  • @jakobejensen6765 8 months ago +2

    This might sound like a dumb question, but how would you load the fine-tuned model in other Python programs? I know we get config.json, vocab.json and model.pth files after the fine-tuning process, but would we use TTS.api?
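
For the recurring question of loading the fine-tuned files outside the notebook, a minimal sketch follows. It assumes the Coqui TTS package is installed locally, that `config.json`, `vocab.json` and `model.pth` sit in one folder, and that the `XttsConfig` and `Xtts` classes behave as they do in the Coqui XTTS fine-tuning demo. The helper names `missing_files`, `load_finetuned_xtts` and `synthesize`, and the 24000 Hz output rate, are choices made for this illustration rather than anything confirmed in the thread.

```python
from pathlib import Path

# The three files the fine-tuning process is said to produce.
REQUIRED_FILES = ("config.json", "vocab.json", "model.pth")


def missing_files(model_dir):
    """Return the required files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


def load_finetuned_xtts(model_dir, device="cuda"):
    """Load a fine-tuned XTTS checkpoint from a local folder."""
    missing = missing_files(model_dir)
    if missing:
        raise FileNotFoundError(f"{model_dir} is missing: {', '.join(missing)}")

    # Imported lazily so the file check above runs without Coqui TTS installed.
    from TTS.tts.configs.xtts_config import XttsConfig
    from TTS.tts.models.xtts import Xtts

    root = Path(model_dir)
    config = XttsConfig()
    config.load_json(str(root / "config.json"))
    model = Xtts.init_from_config(config)
    model.load_checkpoint(
        config,
        checkpoint_path=str(root / "model.pth"),
        vocab_path=str(root / "vocab.json"),
        use_deepspeed=False,
    )
    return model.to(device)


def synthesize(model, text, language, speaker_wav, out_path):
    """Generate speech with a reference recording and write a wav file."""
    import torch
    import torchaudio

    gpt_cond_latent, speaker_embedding = model.get_conditioning_latents(
        audio_path=[speaker_wav]
    )
    out = model.inference(text, language, gpt_cond_latent, speaker_embedding)
    # Assumption: XTTS output is sampled at 24 kHz.
    torchaudio.save(out_path, torch.tensor(out["wav"]).unsqueeze(0), 24000)
```

Typical use would be `model = load_finetuned_xtts("my_voice")` followed by `synthesize(model, "Hello", "en", "reference.wav", "out.wav")`, where the folder and file names are placeholders.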

  • @krisKrag 1 month ago

    company shutdown?

  • @yklandares 9 months ago

    Guys, I followed your lesson, but in the files of all the CMS models I see, there is an index file. How do I create it, or is it not needed???))))

    • @torusx8564 7 months ago +1

      ? wdym

  • @handcraft.corner 8 months ago +2

    Is this Fine Tuning for XTTS v1 or v2?

  • @NickMak-m2c 9 months ago

    Also, it doesn't seem to run locally after downloading the model.pth, vocab.json and config.json.
    Do you need to download the whisper model for it to work locally or is that just for training?
    Edit: No, the whisper model didn't change it. I was desperate and figured maybe it needed to check that you had the training requirements in order to run inference or something, but that didn't do it. Removing the quotation marks from the paths made it look like it was loading for a moment, but then after 4 seconds it just says "error." when loading the fine-tuned model.

    • @joeyhandles 9 months ago +1

      Probably just drop those files over the XTTS model you have installed locally.

  • @starbuck1002 8 months ago +1

    Shoutout Two Minute Papers i guess :D

  • @user-ng4fk5hd6m 10 months ago

    I gave up on it. I tried to train it on a properly cleaned 2:30 audio clip, and it was still training after 20 minutes on the default settings.

    • @erogol 10 months ago +2

      fine-tuning takes time. You need to wait for a bit.

    • @torusx8564 7 months ago

      @erogol It depends. Fine-tuning takes around 5 min with 10 min of audio. Just make sure you use a T4 GPU.

    • @pylotlight 6 months ago

      @torusx8564 Got kicked out by Google due to some error about free limits, sadly.

  • @captainlavenderVHS 10 months ago

    Very very cool!!!! Doesn't work when dataset language is set to ja on 1st tab though - doesn't seem to be able to populate the metadata_eval.csv

    • @Gobolinn 10 months ago

      Encountered the same issue. Looks like ja isn't supported yet.

    • @Otome_chan311 8 months ago

      @Gobolinn Disappointing. I've been looking for a good ja->en voice clone TTS. The best I've found so far is moegoe, which ends up sounding a bit weird with the pacing when doing inference in English (but the sound of the voice is spot on). Every other voice clone thing I've tried doesn't seem to match the voice at all. I was hoping this would work but it seems not?

  • @kingroy8800 4 months ago

    Not working, sir. Make a new tutorial.

  • @xiunianwang 9 months ago

    When I ran cell 1, I got this error message:
    Building wheel for docopt (setup.py) ... done
    ERROR: pip's legacy dependency resolver does not consider dependency conflicts when selecting packages. This behaviour is the source of the following dependency conflicts.
    lida 0.0.10 requires fastapi, which is not installed.
    lida 0.0.10 requires kaleido, which is not installed.
    lida 0.0.10 requires python-multipart, which is not installed.
    lida 0.0.10 requires uvicorn, which is not installed.
    librosa 0.10.1 requires numpy!=1.22.0,!=1.22.1,!=1.22.2,>=1.20.3, but you'll have numpy 1.22.0 which is incompatible.
    plotnine 0.12.4 requires numpy>=1.23.0, but you'll have numpy 1.22.0 which is incompatible.
    pywavelets 1.5.0 requires numpy=1.22.4, but you'll have numpy 1.22.0 which is incompatible.
    tensorflow 2.15.0 requires numpy=1.23.5, but you'll have numpy 1.22.0 which is incompatible.
    gruut 2.2.3 requires networkx=2.5.0, but you'll have networkx 3.2.1 which is incompatible.
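
A conflict report like the one above can be condensed into a per-dependency summary, which makes it easier to see which pinned package most of the complaints trace back to. The sketch below assumes only the two message shapes that appear in this thread ("... but you'll have ... which is incompatible." and "... which is not installed."). The `summarize` function and the `SAMPLE` text are illustrative names, and other pip versions may word these messages differently.

```python
import re

# "plotnine 0.12.4 requires numpy>=1.23.0, but you'll have numpy 1.22.0 which is incompatible."
CONFLICT = re.compile(
    r"^(?P<pkg>\S+) (?P<pkg_version>\S+) requires (?P<requirement>.+?), "
    r"but you'll have (?P<dep>\S+) (?P<installed>\S+) which is incompatible\.$"
)
# "lida 0.0.10 requires fastapi, which is not installed."
MISSING = re.compile(
    r"^(?P<pkg>\S+) (?P<pkg_version>\S+) requires (?P<requirement>.+?), "
    r"which is not installed\.$"
)

SAMPLE = """
lida 0.0.10 requires fastapi, which is not installed.
librosa 0.10.1 requires numpy!=1.22.0,!=1.22.1,!=1.22.2,>=1.20.3, but you'll have numpy 1.22.0 which is incompatible.
plotnine 0.12.4 requires numpy>=1.23.0, but you'll have numpy 1.22.0 which is incompatible.
albucore 0.0.14 requires numpy>=1.24, but you'll have numpy 1.22.0 which is incompatible.
"""


def summarize(log):
    """Group pip conflict lines by the dependency they complain about."""
    conflicts, missing = {}, {}
    for raw in log.splitlines():
        line = raw.strip()
        if m := CONFLICT.match(line):
            note = f"{m['pkg']} {m['pkg_version']} wants {m['requirement']}"
            conflicts.setdefault(m["dep"], []).append(note)
        elif m := MISSING.match(line):
            missing.setdefault(m["requirement"], []).append(m["pkg"])
    return {"conflicts": conflicts, "missing": missing}


if __name__ == "__main__":
    report = summarize(SAMPLE)
    for dep, notes in report["conflicts"].items():
        print(f"{dep}: {len(notes)} conflicting requirement(s)")
        for note in notes:
            print(f"  - {note}")
    for dep, wanted_by in report["missing"].items():
        print(f"{dep}: not installed, needed by {', '.join(wanted_by)}")
```

Pasting the full output of the first cell into `summarize` shows at a glance which installed version is triggering the most complaints.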

  • @hornachos 5 months ago

    you are a good girl, thanks coqui

  • @YevgeniyChannel 9 months ago

    I need help please

    • @torusx8564 7 months ago

      lol on what

    • @YevgeniyChannel 7 months ago

      @torusx8564 To make effects and AI voices.

  • @yklandares 9 months ago

    )))

  • @HyperUpscale 10 months ago +1

    I love Coqui's performance, results and ease of use, but is it possible to make it even easier? Like step 1: an input file or microphone input for training, button 2: train, and 3: type and speak.
    I am not sure why in 2024 we still need to copy and paste text ...

    • @james-hunter-carter 9 months ago +2

      The thing you are looking at is not meant for end-users, it's for developers.

    • @BlenderBeanie 9 months ago +2

      This is currently the peak of technology, the top of the tech available to the public. It's the very first iteration of the UI too, so in the future it might get easier the more people want to use it.
      AUTOMATIC1111's UI used to be barebones and hard to use too, but now it's become a lot more user-friendly.
      Just a few months ago all this was pure command line.

    • @HyperUpscale 9 months ago

      Maybe you just found out about it 😄
      People are already making money from the same peak technology.
      I paid months ago for this type of peak technology, and months in the age of AI means a long time ago :)

    • @BlenderBeanie 9 months ago +3

      @HyperUpscale Perhaps I should have clarified myself more. I meant the peak open-source versions that are accessible to everyone for free. If you take the older Coqui models, for example, just a few months ago it would have taken you many hours to train a proper model that works well at all. A year ago, even just a basic voice was considered a big step for open-source AI technology. I am well aware that AI voices have been in use for many years; however, technology of this caliber was not yet accessible to the everyday user for free, only through paid alternatives.

    • @HyperUpscale 9 months ago

      @BlenderBeanie 👍

  • @modabic 8 days ago

    If your notebook is broken, run '!pip install fastapi==0.111.0' after installations. It fixes it. Like the comment to show love.

    • @drewthomasson949 3 days ago

      lol, share a link to your working colab plz. Idk which colab you're referring to anyway, there are multiple.

    • @drewthomasson949 3 days ago

      Also get a pip freeze of the working env so none of us have to run into this anymore lol
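
The pip freeze suggestion can be sketched as a small cell that records the exact versions of a working environment so others can reinstall them. This is an illustration only: the function names and the `requirements.lock.txt` file name are made up here, and it assumes pip is available for the interpreter that runs the cell.

```python
import subprocess
import sys
from pathlib import Path


def pinned_only(freeze_output):
    """Keep only 'name==version' lines from pip freeze output.

    Editable installs and 'name @ url' entries are skipped because they
    usually do not reinstall cleanly on another machine.
    """
    pins = []
    for raw in freeze_output.splitlines():
        line = raw.strip()
        if "==" in line and not line.startswith(("-e", "#")):
            pins.append(line)
    return pins


def snapshot_environment(out_path="requirements.lock.txt"):
    """Write the current environment's pinned packages to out_path."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        capture_output=True,
        text=True,
        check=True,
    )
    pins = pinned_only(result.stdout)
    Path(out_path).write_text("\n".join(pins) + "\n")
    return pins
```

Running `snapshot_environment()` in a notebook that works, and then `pip install -r requirements.lock.txt` in a fresh one, is one way to reproduce the same package versions.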

  • @ameerazam3269 10 months ago

    amazing @coqui