Fine Tuning XTTS v2 for Hindi Speech with forked Coqui TTS

  • Published on Jan 9, 2025

Comments • 17

  • @vivekgangurde9685 6 months ago +1

    Really helpful video. Thanks for making such informative videos. Great work 👍 👏

  • @abhinavbisht9851 6 months ago +1

    Thanks for the video, but the audio still feels AI-generated, with incorrect pauses. Is there any way to make it as flawless as English?

  • @mobiledevelooper 2 months ago

    Could you please help me decide which TTS model(s) would suit faceless YouTube videos?

  • @ŁukaszMadajczyk 2 months ago

    Would it be possible to show how to train a model for a Slavic language?

  • @ŁukaszMadajczyk 2 months ago

    Hello NanoNomad,
    Do I need to train a HiFiGAN vocoder first and then train the Glow-TTS model with that vocoder, or is a vocoder not needed for Glow-TTS? I'm trying to train a model for a Slavic language.
    Any suggestions would be appreciated. BTW, I'm new to this topic. :)

    • @nanonomad 2 months ago

      Hi,
      Sorry I missed your earlier comments. I'm not actively working on anything for this channel anymore. I don't have enough experience with GlowTTS to give a good answer for that. I found VITS, Tortoise, YourTTS, and XTTS easier to work with and train, so I stuck with those.
      A lot of the scripts and methods used in the videos here are probably very out of date now. Coqui TTS, as a company/project, is no longer in business. There is a community fork of the Coqui source code that is still being updated, but I haven't followed it closely. The community fork does have XTTS fine-tuning support, but I don't think it has Slavic support out of the box.
      For XTTS, there is this project I found for training additional languages: github.com/anhnh2002/XTTSv2-Finetuning-for-New-Languages

  • @SaiLokesh-s5v 3 months ago

    How can we add a new language so that we can clone voices in that language using Coqui?

  • @tanishbajaj84 5 months ago

    Can you please share the text prompt you used to generate the audio in the video? Was it in Latin script or Devanagari?

    • @nanonomad 5 months ago

      I think the text prompts are stored in the config.json. I just copied random sentences from an online learning document. I don't speak the language, so I have no idea what the sentences actually say. It was Devanagari, though.
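
As a rough illustration of the reply above, the prompts could be pulled out of a training config.json like this. This is only a sketch: it assumes the prompts live under a `test_sentences` key, either as dicts with a `text` field or as plain strings/lists, which may not match every config. The helper names `extract_test_sentences` and `load_test_sentences` are hypothetical and not part of Coqui TTS.

```python
import json


def extract_test_sentences(config):
    """Return prompt texts from an (assumed) "test_sentences" entry of a config dict."""
    texts = []
    for item in config.get("test_sentences") or []:
        if isinstance(item, dict):
            # Assumed layout: {"text": ..., "speaker_wav": ..., "language": ...}
            text = item.get("text")
        elif isinstance(item, (list, tuple)):
            # Assumed layout: [text, optional extra fields...]
            text = item[0] if item else None
        else:
            text = item
        if text:
            texts.append(str(text))
    return texts


def load_test_sentences(config_path):
    """Read a config.json from disk and return its test prompts."""
    with open(config_path, encoding="utf-8") as f:
        return extract_test_sentences(json.load(f))
```

Calling `load_test_sentences("config.json")` on the config saved next to the checkpoint would then list whatever prompts were stored there, if the key exists; otherwise it returns an empty list.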

  • @adityajain2162 5 months ago

    Hey, the overall audio sounds great, but when there is a number in the middle of the Hindi text, the audio for the number comes out muffled.

    • @nanonomad 5 months ago

      There are no text cleaners in Coqui TTS for Hindi at all. You need to look at the Coqui code and understand how the input is being handled. Numbers need to be written out in words until someone writes a proper text handler.
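
Until a proper Hindi text handler exists, one stopgap is to rewrite digits as words before passing text to the model. The sketch below is not part of Coqui TTS; `spell_out_digits` is a hypothetical helper, and it reads numbers digit by digit (as one would read a phone number) rather than as full cardinals, because Hindi number words from 0 to 99 are irregular and would need a complete lookup table.

```python
import re

# Hindi words for the digits 0 through 9.
HINDI_DIGITS = ["शून्य", "एक", "दो", "तीन", "चार", "पाँच", "छह", "सात", "आठ", "नौ"]

# Normalize Devanagari digits to ASCII so one pattern covers both scripts.
DEVANAGARI_TO_ASCII = str.maketrans("०१२३४५६७८९", "0123456789")


def spell_out_digits(text):
    """Replace each run of digits with space-separated Hindi digit words."""
    text = text.translate(DEVANAGARI_TO_ASCII)

    def replace_run(match):
        return " ".join(HINDI_DIGITS[int(d)] for d in match.group())

    return re.sub(r"[0-9]+", replace_run, text)


if __name__ == "__main__":
    print(spell_out_digits("कमरा 25 में आइए"))
```

Digit-by-digit reading is a deliberate simplification: it avoids muffled output for raw numerals, but a real cleaner would also need cardinals, dates, currency, and acronyms.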

  • @tanishbajaj84 5 months ago

    Can I use this checkpoint with StyleTTS2 by pointing its checkpoint and config settings at these files? Also, what's the difference between config.json and config.yaml, and what would be the difference between, say, best_model_10759.pth and best_model_53795.pth?

    • @nanonomad 5 months ago

      StyleTTS2 is a different model architecture and is not compatible.

  • @virajdeshwal9996 3 months ago

    Great stuff!

  • @vivekgangurde9685 6 months ago +1

    Can we clone a voice using this?

    • @nanonomad 6 months ago +1

      Every voice is a clone, because you need to supply reference audio samples when doing inference. Fine-tuning just guides the model closer to the reference samples.
      There are no text cleaners for Hindi in Coqui TTS, so every number needs to be spelled out, acronyms avoided, and so on. Someone probably needs to look at the punctuation handling in the text cleaner code for Hindi to make sure the pauses are being handled correctly.
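
To make the point about reference audio concrete, here is a minimal sketch of XTTS inference through the high-level `TTS.api` interface. It assumes the stock pretrained model name rather than the fine-tuned checkpoint from the video, which would need to be loaded from its own checkpoint and config files. `xtts_request` and `synthesize` are hypothetical helper names, and the file paths are placeholders.

```python
def xtts_request(text, reference_wavs, out_path="output.wav", language="hi"):
    """Build keyword arguments for tts_to_file; reference audio is mandatory."""
    wavs = [str(p) for p in reference_wavs]
    if not wavs:
        raise ValueError("XTTS needs at least one reference audio clip to clone from")
    return {
        "text": text,
        "speaker_wav": wavs,   # the voice to imitate
        "language": language,  # "hi" for Hindi
        "file_path": out_path,
    }


def synthesize(**request):
    """Run inference; requires the TTS package and downloads the model on first use."""
    # Imported lazily so xtts_request can be used without TTS installed.
    from TTS.api import TTS

    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    return tts.tts_to_file(**request)
```

A call would look like `synthesize(**xtts_request("नमस्ते", ["reference.wav"]))`. There is no way to call it without a speaker reference, which is why every output voice is effectively a clone.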

  • @abhinavbisht9851 6 months ago +1

    The Hindi audio is indeed not good. The English is really good.