This free AI Text-to-Speech is insane! Add emotions & make podcasts

  • Published on 24 Nov 2024

Comments • 732

  • @theAIsearch
    @theAIsearch  several months ago +38

    Thanks to our sponsor Wondershare Filmora, a user-friendly video editor supercharged with AI features. bit.ly/4f60nmB

    • @crazyguy7585
      @crazyguy7585 several months ago +1

      I have an AMD Radeon 6800XT graphics card and I don't have CUDA. What can I do? Please help me.

    • @jalfagemer
      @jalfagemer several months ago

      It's amazing! Thanks a lot. Do you know if there will be a Spanish option to generate speech? Thanks

    • @jarkodev
      @jarkodev several months ago

      Yes, me too, I have an RX 6800.
      I only have an AMD GPU. How do I install it?
      I need this AI voice cloner.
      Please help.

    • @Drowe71
      @Drowe71 several months ago

      It says this was posted 6 days ago, but when I go to the site it's got a different setup, so links are gone or changed, etc. So do we assume they have added some of the features, like the requirements, into the main installation process?

    • @Unleashing-Innovation
      @Unleashing-Innovation several months ago

      Hey, there is a problem: after reinstalling miniconda3 and checking the script folder, I was not able to find conda.exe. I would appreciate it if you could provide a solution.

  • @steve-g3j6b
    @steve-g3j6b several months ago +75

    I love how you don't assume that I know what you know, and bothered to explain the basics, and made timestamps so the more knowledgeable can skip ahead. Excellent, man!!!
    So we can't train it properly on a larger audio file? (You can't pack enough vocal range into that for professional work.)

  • @Dryesthalo
    @Dryesthalo several months ago +133

    This is wild! It’s crazy how little input audio it requires. Also I just wanted to say thanks. If it weren’t for you I would have never discovered my passion for creating AI voice models!

    • @amitnishad0777
      @amitnishad0777 several months ago

      are you making money out of it? will be very helpful if you can give some insights.

    • @theAIsearch
      @theAIsearch  several months ago +1

      You're welcome! Glad you found your passion

    • @jmg9509
      @jmg9509 several months ago +1

      That's definitely a new passion no one prior to 5 years ago could say, i'll tell you that much.

    • @life2030
      @life2030 several months ago

      @@amitnishad0777 It is for non-commercial use only.

    • @Dryesthalo
      @Dryesthalo several months ago

      @@amitnishad0777 No, I guess I could do commissions but I haven't really thought much about it. I also want to improve before I do something like that, as I'm too amateurish at data cleaning atm.

  • @HeRmEtIkA666
    @HeRmEtIkA666 several months ago +24

    I've followed your channel since the early days. I'm super happy for your growth and also super happy when you do content like this... for non-tech people to be able to try and have fun with AI. A dedicated video for everyone to follow. Keep up the good stuff!

    • @theAIsearch
      @theAIsearch  several months ago

      Thank you so much!!

  • @cippalippa3105
    @cippalippa3105 several months ago +11

    I watch lots of tutorials on youtube. This one is among the best. Keep up the good work and thanks for sharing your know-how!

  • @AdvantestInc
    @AdvantestInc several months ago +44

    Voice synthesis with emotions? That’s a next-level breakthrough for personalizing user experiences. Feels like we're inching closer to seamless AI-human conversations.

    • @homuchoghoma6789
      @homuchoghoma6789 several months ago +3

      First we need to get closer to seamless communication between human and human )

    • @AlterRizz
      @AlterRizz 29 days ago +3

      @@homuchoghoma6789 we already talk however we want. All the barriers are in our own heads

    • @Klarence75
      @Klarence75 12 days ago

      curious if they moan

    • @homuchoghoma6789
      @homuchoghoma6789 9 days ago

      @@AlterRizz Do you often watch Japanese, Korean, Italian, British, French, German, and other channels with simultaneous voice-over in a language you understand?

    • @AlterRizz
      @AlterRizz 9 days ago +1

      @@homuchoghoma6789 I watch in many languages cuz I know many languages. Not knowing 3+ languages in the 21st century is a skill issue

  • @vinching926
    @vinching926 several months ago +23

    That mix of Chinese and English is simply perfect. Any Chinese speaker, whether Mandarin or even Cantonese, speaks just like that. The TTS shows no flaw in its voice, tone, or pronunciation; if I played that to my friends and family, they couldn't really spot the common AI characteristics in it.

    • @theAIsearch
      @theAIsearch  several months ago +1

      thanks for sharing!

  • @Rodentsnipe
    @Rodentsnipe several months ago +44

    If you generate anything longer than 10 minutes, you'll notice that the voice model gets worse and worse until it becomes absolute gibberish and then static noise at around an hour

    • @CodewithRiz
      @CodewithRiz several months ago

      Yes, I tested it.
      Do you know how ElevenLabs does this?

    • @TranVanong132
      @TranVanong132 13 days ago +7

      @@CodewithRiz I guess they split the given text into multiple parts, generate for each one, then merge them into one file.
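
The split-and-merge idea guessed at in this reply can be sketched in Python. This only illustrates the general technique and makes no claim about how ElevenLabs or F5-TTS work internally. The names `split_into_chunks`, `fake_synthesize`, and `synthesize_long_text` are hypothetical, and `fake_synthesize` is a stand-in that returns silent samples rather than calling a real model.

```python
import re


def split_into_chunks(text, max_chars=200):
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as one oversized chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)      # close the chunk that is full
            current = sentence          # start a new one with this sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def fake_synthesize(chunk):
    """Hypothetical stand-in for a TTS call: one silent sample per character."""
    return [0.0] * len(chunk)


def synthesize_long_text(text, synthesize=fake_synthesize, max_chars=200):
    """Synthesize each chunk separately, then concatenate the audio."""
    audio = []
    for chunk in split_into_chunks(text, max_chars):
        audio.extend(synthesize(chunk))
    return audio


script = "First sentence. Second sentence! Third one? Fourth."
chunks = split_into_chunks(script, max_chars=32)
audio = synthesize_long_text(script, max_chars=32)
print(chunks)
```

Because each chunk is generated independently, no single generation runs long enough to drift. A real pipeline would replace `fake_synthesize` with an actual model call and might cross-fade at the joins.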

    • @DLuzElAngelMusikal
      @DLuzElAngelMusikal 3 days ago

      @@CodewithRiz do you think that 11 labs is using this same LLM?

    • @twenibucks1119
      @twenibucks1119 1 day ago

      @@DLuzElAngelMusikal No, Eleven Labs supports 32 languages. F5-TTS supports, and is trained on, English and Chinese only.

  • @sunnyhaoshiyu9728
    @sunnyhaoshiyu9728 several months ago +48

    The thumbnail man 🤣 man of culture! like and sub!

    • @lyrioolyricc
      @lyrioolyricc several months ago

      He changed it. What was it? 😭

    • @rafikzidane477
      @rafikzidane477 several months ago +2

      yamete kudasai ...🌶🤣

  • @gyro-j
    @gyro-j several months ago +2

    I'll save it for later. Thank you so much for the detailed tutorial man! Your channel is excellent!

    • @theAIsearch
      @theAIsearch  several months ago +1

      you're welcome!

  • @kevinisawake
    @kevinisawake 24 days ago +3

    I have subscribed just because of this video - man - what a find. Great work.

    • @theAIsearch
      @theAIsearch  24 days ago

      Thank you!

  • @adrianmunevar654
    @adrianmunevar654 several months ago +4

    Man, your channel is the bomb 💣
    And right, that "Spanish" reading was a little bit hilarious and awful at the same time. Hope they make more languages available soon.
    3 of your videos in a row. New subscriber here!

  • @pauleasther
    @pauleasther several months ago +16

    After a break, I deleted all uploaded files and started again, this time successfully. My first error was when uploading programs: stick to the older nominated versions! Don't think that by uploading a newer version things will be better; they won't! The program is brilliant and will save me a lot of money. Thank you! Where I went wrong was creating the virtual environment. You said to add "conda activate f5", but you must put in "conda init" first, hit enter, and then add "conda activate f5". Once done, it went smoothly.

    • @theAIsearch
      @theAIsearch  several months ago +2

      thanks for sharing!

    • @Vojec9
      @Vojec9 15 days ago +1

      Thank you for that little note "init".

    • @IntiArtDesigns
      @IntiArtDesigns 14 days ago

      @@Vojec9 *laughs in Bri'ish*

    • @leodark_animations2084
      @leodark_animations2084 14 days ago +1

      I have the same issue, but after conda init and typing conda activate, it again says to run conda init first...

    • @pauleasther
      @pauleasther 14 days ago

      @leodark_animations2084 Sorry to hear that. Afraid I'm no expert and just stumbled my way through. I'd just shut the computer down and restart, see how you go?
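
For anyone stuck at the same step, here is a small Python check of whether `conda activate f5` actually took effect in the current terminal. It is a sketch that relies on the `CONDA_DEFAULT_ENV` environment variable, which conda sets when an environment is activated. The helper name `conda_env_status` is hypothetical, and `f5` is the environment name quoted in this thread. Note that `conda init` edits the shell startup files, so the terminal usually has to be closed and reopened before `conda activate` works.

```python
import os


def conda_env_status(expected_env="f5", environ=None):
    """Report whether the expected conda environment is active.

    Returns "active", "inactive" (no conda environment at all),
    or "other:<name>" when a different environment is active.
    """
    environ = os.environ if environ is None else environ
    active = environ.get("CONDA_DEFAULT_ENV")
    if active is None:
        return "inactive"
    if active == expected_env:
        return "active"
    return f"other:{active}"


# Run inside the terminal where you typed `conda activate f5`.
print(conda_env_status("f5"))
```

If this prints `inactive` right after `conda init`, opening a fresh terminal and running `conda activate f5` again is the first thing to try.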

  • @mohamedzewail8907
    @mohamedzewail8907 several months ago +36

    Great but needs to support more languages.

  • @proteusblack8913
    @proteusblack8913 several months ago +3

    Gotta love installing installers for installing installers in an installer that installs the installer needed for a virtual environment used for installing an installer for a tts program. 👍

  • @ninjagogetta
    @ninjagogetta several months ago +13

    this is great for npcs in video games

  • @VoicelessScream
    @VoicelessScream several months ago +1

    Truly appreciate the detailed installation procedure, made my life much easier. Thanks!

    • @theAIsearch
      @theAIsearch  several months ago

      you're welcome!

  • @lorenndesign
    @lorenndesign several months ago +281

    WW thumbnail

    • @theAIsearch
      @theAIsearch  several months ago +63

      😏

    • @pelufaz8435
      @pelufaz8435 several months ago +12

      Sauce

    • @Musical_thinker
      @Musical_thinker several months ago +2

      +1

    • @Oimamanaplz
      @Oimamanaplz several months ago +3

      What the thumbnail

    • @starbrandX
      @starbrandX several months ago

      @@theAIsearch I don't remember if you mentioned your hardware. Can it do inference fast enough for realtime tts of text streams?

  • @liarus
    @liarus several months ago +29

    That thumbnail... He knew what he was doing

    • @jaredf6205
      @jaredf6205 several months ago +3

      What about it? Can you guys hear waveforms by looking at a picture of them or something?

    • @IPutFishInAWashingMachine
      @IPutFishInAWashingMachine several months ago

      A computer could probably

    • @AimaruVee
      @AimaruVee several months ago +2

      @@jaredf6205 I think he had A/B testing going on with the thumbnail. One is a normal waveform thumbnail, and the other also has a waveform pic paired with a.. sus anime pic.

    • @jaredf6205
      @jaredf6205 several months ago +1

      @@AimaruVee Oh it's actually coming up for me now

    • @Watchdog-e9f
      @Watchdog-e9f several months ago +1

      Ai girlfriends are becoming a reality we are doomed 😭😭😭

  • @Sha_1k.s
    @Sha_1k.s 25 days ago

    I've been looking for this for a long time, so thank you👍🏼

  • @froilen13
    @froilen13 several months ago +87

    sounds good, but not good enough. I'll wait a bit longer for an upgrade

    • @nonsookoye3163
      @nonsookoye3163 several months ago +8

      Right! Not good enough. I can tell it's ai

    • @InnerEagle
      @InnerEagle several months ago +33

      You are lucky you can tell it's AI. Wait until you get a phone call and you can't tell if it's real or AI

    • @Eldorado66
      @Eldorado66 several months ago

      Surely the best among the free ones. If you want the absolute best and are willing to pay, try eleven labs.

    • @hilmiterzi6369
      @hilmiterzi6369 29 days ago

      @@InnerEagle Maybe that's the thing he tries to build?

    • @InnerEagle
      @InnerEagle 29 days ago

      @@hilmiterzi6369 Right, then he has to wait for it

  • @Felaonile
    @Felaonile several months ago +27

    finally my local voice AI companion will have emotions!

    • @jahpistol3486
      @jahpistol3486 several months ago +24

      Why do i have a bad feeling about this

    • @Felaonile
      @Felaonile several months ago +4

      @@jahpistol3486 bro 💀

    • @captteemo9133
      @captteemo9133 several months ago +1

      How do you use it on mobile phones?

    • @Felaonile
      @Felaonile several months ago

      @@captteemo9133 I built the bot from scratch. The basis of my bot is Ollama; for fast communication I used Llama3.2 with 1B parameters. Speech recognition runs on Whisper. I used to work with VOSK, which is not inferior, by the way; it's just that only Whisper lets you insert punctuation marks into the recognized text. Speech synthesis is based on the Coqui TTS VITS multi-voice model. Unfortunately, it will not work on a smartphone.

  • @411KJB
    @411KJB several months ago

    Sir, YOU ARE AMAZING. BELLED to "GIT" notified of everything you make. Simply WOW.

  • @codeslacker77
    @codeslacker77 26 days ago +2

    This is absurdly crazy~!!! Many thanks for the installation walkthrough ~!!!

    • @theAIsearch
      @theAIsearch  26 days ago

      you're welcome!

  • @cosassobrealgo2762
    @cosassobrealgo2762 18 days ago

    This is amazing and so well explained, thanks !

  • @4.0.4
    @4.0.4 several months ago

    I'm glad that this is being developed, even if it's still at a point where I wouldn't even enable it if it was as easy as a toggle, let alone dig into code to get it working.

  • @critiqsai
    @critiqsai several months ago

    Great tutorial. Added to our best of AI.

  • @bause6182
    @bause6182 several months ago +17

    I hope one day someone makes an open-source AI that makes songs like Suno or Udio

  • @brianlink391
    @brianlink391 several months ago +55

    This AI is really good...at sounding like a bad audiobook narrator! 😂 It nails those over-the-top emotions, but they don't sound very human. Maybe the problem is that it's trained on audiobooks, where the emotions are often exaggerated.
    What if we used this "fake emotion" data to our advantage? First, train an AI to recognize those audiobook patterns. Then, train a second AI to spot real emotions in everyday speech from YouTube, podcasts, etc. The second AI could learn to tell the difference between fake and genuine, and we'd get an AI that truly understands how we express emotions! What do you guys think?

    • @samuel_innerwinkler
      @samuel_innerwinkler several months ago +3

      Have you tried the Eleven Labs reader for audiobooks? Not all voices are great, but I found the voice of Burt Reynolds to work really well for audiobooks. It also works in different languages

    • @jmg9509
      @jmg9509 several months ago

      @@samuel_innerwinkler Lol Burt Reynolds was an actor.

    • @jmg9509
      @jmg9509 several months ago

      I think that's what a lot of these AI models use. It's called a discriminator, and its job is to do just that: determine whether a piece (image, audio, etc.) is genuine or AI-generated. That's the extent of my knowledge; I don't know much beyond that, or whether they use it for this voice model.

    • @samuel_innerwinkler
      @samuel_innerwinkler several months ago

      I know​@@jmg9509

  • @sasuofficial3448
    @sasuofficial3448 several months ago +13

    I always wonder why the requirements are never listed first ... xD (specs, VRAM/RAM requirements)
    The Chinese is insane. It always sounds more than the original voice lol

  • @Anamontes-o4w
    @Anamontes-o4w several months ago +1

    I don't know where to go without you. You don't know how important you are in my life. Saved for later as usual.

  • @adelite
    @adelite several months ago +3

    Crazy stuff! I'm glad i found this channel.

  • @russellfrancis6294
    @russellfrancis6294 several months ago

    I'm really excited about this. I'd love to have a go !

  • @ExpensivePizza
    @ExpensivePizza several months ago +10

    This is very impressive.

  • @weltonbarbosa206
    @weltonbarbosa206 29 days ago

    that was an awesome tutorial, very didactic, congratulations.

  • @Random_person_07
    @Random_person_07 several months ago +4

    Awesome, it sounds good, thanks for the guide to set it up. Nvm, it sounds alright, but it has a lot of hallucinations

  • @flip2dip
    @flip2dip 19 days ago +4

    Awesome, but please mention stuff like needing a CUDA supported GPU earlier in the video. Followed all steps up until I realized I couldn't use it :')

    • @neuron8950
      @neuron8950 9 days ago +1

      Same XD

    • @NextGenGamezz
      @NextGenGamezz 9 days ago

      same

    • @NextGenGamezz
      @NextGenGamezz 9 days ago

      @@neuron8950 same lol amd sucks

  • @tp_exe
    @tp_exe several months ago

    I couldn't stop laughing at the sudden switch from normal to sad and then to anger LMAO

  • @thesystemera
    @thesystemera several months ago +5

    Damn. I work extensively with Eleven Labs but this is actually showing some advances. Especially the emotional side of things.

    • @rickarroyo
      @rickarroyo several months ago +1

      There was a promise about updates with emotions, right? So far, nothing.
      With ElevenLabs we need to try some workarounds like:
      (And she says with great sadness) or something like (She says with great anger)
      Insert the text -
      The context helps, this uses more characters but in some tests it was worth it for me.

  • @sabofx
    @sabofx several months ago

    really awesome tutorial!

  • @bumkailashkumar
    @bumkailashkumar 2 days ago

    Thanks for the step-by-step explanation, great and with complete info

    • @theAIsearch
      @theAIsearch  1 day ago

      you're welcome!

  • @vi6ddarkking
    @vi6ddarkking several months ago +11

    The best part is we can use the existing XTTS set of tools to modify our own voices and create the emotional samples, for the existing voices.

    • @theAIsearch
      @theAIsearch  several months ago +1

      thanks for sharing!

    • @zakyvids6566
      @zakyvids6566 several months ago

      How, though? I do not know coding. I would be very interested if you could point to a YouTube channel on this very topic

  • @SkylineAICreator
    @SkylineAICreator several months ago +1

    I was also very surprised with how good this works... Thanks!

    • @ithurtsbecauseitstrue
      @ithurtsbecauseitstrue 25 days ago

      yes, theft and fraud are a button click away and justified with a shrug of your smug shoulders.

  • @VaibhavShewale
    @VaibhavShewale several months ago +2

    it took quite a while for people to find this

  • @Me91325
    @Me91325 several months ago

    This is the best AI text-to-speech program I've ever seen. Thanks, AI Search...😍😍😍😍😍

    • @theAIsearch
      @theAIsearch  several months ago

      You are welcome!

    • @angelbeatsenpai_manhwa
      @angelbeatsenpai_manhwa 28 days ago

      Is it free​@@theAIsearch

    • @theAIsearch
      @theAIsearch  28 days ago

      @@angelbeatsenpai_manhwa yes

  • @gdizzzl
    @gdizzzl 6 days ago

    Thanks, i got it working and im a smooth brain.

    • @theAIsearch
      @theAIsearch  5 days ago

      You're welcome!

  • @JRo250
    @JRo250 28 days ago

    This has got to be the best explanation and breaking down of an objectively nightmarishly complicated setup anywhere. Congrats!
    You left NO stone unturned. "Oh no Python? Let me take you to the page where to get it, run the setup with you, and show you the gotchas and workarounds before we go on to the next step". Absolutely brilliant. Most other "step-by-step" guides pull out a black box and point at how some magic happens there and good luck figuring it out lol.
    I'll also note that it must have taken you forever and a day to get ready for this, write the script/steps, collect all the links, files, test it, narrate the entire thing, edit it, and publish it. Your Wondershare Filmora sponsor got their money's worth, and then some.
    Now.... why in the world hasn't someone taken all this stuff, and made a nice Windows app that installs with a single click? 😁

  • @cyberprompt
    @cyberprompt several months ago

    I approve of the Hitchhiker's Guide reference.

  • @rickfuzzy
    @rickfuzzy several months ago +3

    Looks great, but the only thing i wanted to know was inference speed without processing the reference. What would the potential be for realtime if the reference voice was not being processed as part of the inference?

    • @theAIsearch
      @theAIsearch  several months ago +1

      inference is quite fast. there's a good chance someone might make a realtime variant of this

    • @phizc
      @phizc several months ago +2

      I haven't looked at it yet, but it shows a spectrogram of the clip¹, so it's possible/probable that it generates the entire clip in one go, I.e. it works on every part of the clip at the same time. If that's the case, it could probably create a 20 second clip in e.g. 15 seconds, but you would still have to wait 15 seconds before you can hear any of it. I may be wrong though.
      ¹ some text to audio systems generates an image of the spectrogram and then converts the spectrogram to an audio file. The spectrogram is a representation of the audio where time is on the x-axis, the frequency is on the y-axis, and the amplitude is the intensity/color of the pixel.
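
The spectrogram described in the footnote (time on one axis, frequency on the other, amplitude as intensity) can be computed with a short, standard-library-only Python sketch. It is a naive short-time Fourier transform written for readability, not what F5-TTS itself uses. The function name `spectrogram`, the frame size, and the 1000 Hz test tone are all choices made for this illustration.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of bin magnitudes per frame (frames = time axis)."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window softens the frame edges to reduce spectral leakage.
        windowed = [
            s * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        magnitudes = []
        for k in range(frame_size // 2 + 1):  # bins from 0 Hz up to Nyquist
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


sample_rate = 8000   # samples per second (illustrative)
tone_hz = 1000       # a pure test tone
samples = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = spectrogram(samples)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), "frames,", len(spec[0]), "bins, peak at", peak_hz, "Hz")
```

Each inner list is one column of the image the comment describes. A model that predicts the whole spectrogram at once has to finish every column before any audio can be played, which is the latency concern raised above.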

  • @alienspecies6872
    @alienspecies6872 several months ago +1

    Thumbnail-kun goin cray

  • @contentfreeGPT5-py6uv
    @contentfreeGPT5-py6uv several months ago +3

    XD waoo,list for my open free project 😅

  • @VintageForYou
    @VintageForYou several months ago

    This is very good at cloning voice wave files nice one.👍😁💯

  • @yanggary
    @yanggary several months ago

    OMG this is 🔥thank you!

    • @theAIsearch
      @theAIsearch  several months ago

      No problem!!

  • @Faceless_Mailjam
    @Faceless_Mailjam 28 days ago

    This is outstanding
    I will try this

  • @laultimaverdad1187
    @laultimaverdad1187 several months ago +4

    COOL bro, please I want Spanish TTS and cloning

  • @SherryXShi
    @SherryXShi several months ago

    thanks for sharing your skill with us.

  • @PCB389
    @PCB389 21 days ago

    this is amazing, thanks

  • @rachidlajmi2826
    @rachidlajmi2826 several months ago

    Thank you for the details

  • @jihe4677
    @jihe4677 several months ago

    This version of the tool is astonishing! It is exactly what I have been looking for. Thank you!

  • @Endangereds
    @Endangereds several months ago +1

    There was also some any-language-to-any-language AI voice tool. Does anyone remember? We could just feed it a voice in any language and it would learn from it, and after that step it could be used to generate that voice speaking in any language. I believe it was possible to make it sing too. I believe it even creates a TTS file, so that we can use that file with any text-to-speech engine.

  • @CrRonaldo-rq6mv
    @CrRonaldo-rq6mv several months ago +1

    Man , You're a legend 🙌
    Thank you for your efforts ❤

  • @tiitola
    @tiitola several months ago

    Great video. Very helpful. Thanks for sharing

    • @theAIsearch
      @theAIsearch  several months ago

      You are welcome!

  • @syedthefunnyguy7570
    @syedthefunnyguy7570 several months ago +5

    now reading visual novels feels cinematic, thanks for suggesting

  • @realthing2158
    @realthing2158 several months ago +3

    It's really cool but I need it to be able to blend multiple voices together to create a new original one. Just copying other people's voices is not really ethical when using voices for commercial purposes.

    • @armondtanz
      @armondtanz several months ago

      RVC can do this. I downloaded it via this channel. If you go to this guy's videos and sort by popular, it's one of the most watched.

  • @vi6ddarkking
    @vi6ddarkking several months ago +5

    So what do you think?
    What's the ETA for this to be added to Sillytavern?

    • @theAIsearch
      @theAIsearch  several months ago +5

      should be very soon. open source community builds fast!

  • @MolnarG007
    @MolnarG007 several months ago

    Crazy!
    I'm interested in the cross-language options, and generally in how it handles other non-English languages. EDIT: just reached the end, so it's Chinese and English support at the moment.
    All in all, thx for the upload, definitely checking this out!

  • @MarkoKarja_SlapAsSound
    @MarkoKarja_SlapAsSound several months ago +1

    Thank you! The installation part of this tutorial is about 15 minutes long...? Is there a way regular people can install this software? :)

  • @CaptainSnackbar
    @CaptainSnackbar several months ago +1

    This is insane bro, I used to train a model for hours to get something near this level

  • @Atul_25
    @Atul_25 27 days ago

    This is dope 👏

  • @fixelheimer3726
    @fixelheimer3726 several months ago +4

    This needs more languages

  • @mmmmmmmmicoooooooo
    @mmmmmmmmicoooooooo several months ago +8

    0:22 Why "bob" sounds like Vedal 💀

  • @WIDOMU
    @WIDOMU several months ago

    That's awesome!

  • @benashbaugh5982
    @benashbaugh5982 several months ago

    This is impressive. No wonder the voice actors have problems with this software

  • @IM2awsme
    @IM2awsme several months ago +3

    I remember a product called Lyrebird that vanished from existence 😅 It did voice cloning almost a decade ago.

    • @TrentonMatthews
      @TrentonMatthews several months ago

      It did, and it was fun!!!
      You can find absolutely funny examples over on The Lost Narrator's YouTube channel.
      Yeah, it's My Little Pony voice examples from fan actresses, but I say they are some of the best clips I have found.

    • @IM2awsme
      @IM2awsme several months ago

      @@TrentonMatthews I remember someone showing me a website with my little pony voice clones so many years back, I completely forgot that existed 😅

    • @IM2awsme
      @IM2awsme several months ago

      @@TrentonMatthews the first video I randomly clicked on was titled "apple jack tells the truth " 💀 was not expecting that

    • @zakyvids6566
      @zakyvids6566 several months ago

      Yep, and does anyone remember Adobe VoCo? It could do cloning as well as emotions, and it was very realistic for 2016. I bet big tech already has very advanced stuff in their labs

  • @FreeRadical3001
    @FreeRadical3001 21 days ago

    good video. question, where did you get the original emotive voice samples to use (angry, sad, excited, etc??)

  • @cheezeckez6843
    @cheezeckez6843 several months ago

    The Chinese ones sound like native speakers. This is a really powerful tool.

  • @Legion831
    @Legion831 several months ago

    Thank you so much as always your tutorials are very helpful and insightful. I hope to use this to translate and dub the new Dragon Ball series.

    • @theAIsearch
      @theAIsearch  several months ago

      good luck!

  • @am_9944
    @am_9944 17 days ago

    voice actors exiting the chat

  • @nexusyang4832
    @nexusyang4832 26 days ago

    39:39 - for the podcast mode, i wonder if they will add in the feature for you to provide emphasis, i.e. the moods from before.

  • @Anish-o6n
    @Anish-o6n several months ago

    Your knowledge is awesome. What was your profession before starting this fabulous channel? 🤔

  • @MilanKarakas
    @MilanKarakas 4 days ago

    Yes, sometimes it hallucinates after the text to generate is done. But then... one should adjust the speed and the cross-fade duration and synthesize again.

  • @dokifisher967
    @dokifisher967 24 days ago

    thank you for the tutorial, just wanted to let you know I managed to get it working on a 2060 laptop (6 GB VRAM) and it works fast as well. Also I wasn't sure if pytorch has the latest cuda, but it works with 118

  • @user74018
    @user74018 several months ago

    Thank you for the effort in explaining this topic, but the video is too long with a lot of unnecessary examples. the point was clear early on, so trimming the extras and making it more concise would really improve both the content and the viewing experience.
    Hope you'd see this feedback ;)

  • @Megumi_fushiguro1.
    @Megumi_fushiguro1. 19 days ago +1

    Best thumbnail!, sauce?

  • @mounirbousli4323
    @mounirbousli4323 23 days ago

    At 38 min it feels like the movie Inside Out :D

  • @Ai.PromptHero
    @Ai.PromptHero several months ago

    So good sir 😊❤

  • @31eGGy31
    @31eGGy31 23 days ago

    Software seems great! Do you know whether or not it can handle subtitle formats, with timestamps declaring exactly at what time something is spoken, or have you stumbled upon any other text-to-speech tool that can do that in your research so far? A reply would be much appreciated :)

  • @OmIwanReaction
    @OmIwanReaction 9 days ago

    Help me, I have CUDA version 6.1. Is it compatible? My laptop uses an Nvidia MX-150.

  • @fuzzyhenry2048
    @fuzzyhenry2048 several months ago +14

    Feel like a 7/10.

    • @generalfishcake
      @generalfishcake several months ago

      6/10 maybe. Still more robotic than CoquiTTS

  • @dom6512
    @dom6512 several months ago +1

    What GPU are you running? Your 30 seconds is around 5000 for me. I tried on huggingface with about the same results. Replicate was at about the speeds you were getting.

  • @everyoneroasted
    @everyoneroasted several months ago

    it doesn't sound like a human at all, but it really nailed these emotions and I can see it over taking Eleven labs if they keep developing it

  • @davidmangold6167
    @davidmangold6167 several months ago

    Love your video! So cozy to listen to your voice :). I was wondering if you tried with your own voice? If yes did it work? 😊

    • @theAIsearch
      @theAIsearch  several months ago

      thanks! i haven't actually tried it w my voice, but good idea!

    • @fi1689
      @fi1689 27 days ago

      Well, you can 100% make his voice read you books now lol

  • @rspy24
    @rspy24 18 days ago

    "Insane AI TTS with Emotions!"
    *proceeds to play the most monotone TTS voice I heard* 🤣🤣

  • @RangersClub2015
    @RangersClub2015 3 days ago

    ngl i clicked because of the thumbnail, not the tutorial, i'm cooked

  • @GerritSchulze
    @GerritSchulze 18 days ago

    Good instructions!
    Can f5-TTS possibly be run without CUDA, just on the CPU?

    • @GerritSchulze
      @GerritSchulze 16 days ago

      Yes, it can. It is only much, much slower. Skip the torch GPU choice and install all the generic torch libraries.
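
A minimal sketch of that CPU fallback in Python: it asks PyTorch whether CUDA is usable and otherwise selects the CPU. `torch.cuda.is_available()` is a real PyTorch call; the helper name `pick_device` is hypothetical, and how F5-TTS itself selects a device is not shown in this thread.

```python
def pick_device():
    """Return "cuda" when a CUDA-capable GPU is usable, else "cpu"."""
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Inference would run on: {device}")
```

On a machine without an NVIDIA card this prints `cpu`, which matches the "much, much slower" path described above.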

  • @Strenkoo
    @Strenkoo 10 days ago +1

    18:00 The dependency installation process has been changed. Instead of entering 'pip install -r requirements.txt', you'll want to enter 'pip install -e .'

  • @gwinbeer
    @gwinbeer several months ago +1

    This is great! My all time favorite voices are Morgan Freeman, Peter Thomas (from Forensic Files), and Samuel L Jackson (see: Go the f**k to sleep)

  • @kngemeral
    @kngemeral several months ago +1

    great video

  • @goocheese8517
    @goocheese8517 28 days ago

    This is great, but how many languages can we use with this, or is it limited to just some languages?

  • @LatoreLyfe
    @LatoreLyfe several months ago

    Great video thank you for the education

    • @theAIsearch
      @theAIsearch  several months ago

      you're welcome!