Thanks for sharing. However, the way two speakers interact with each other in Google's tools like NotebookLM is on another level of realness. I think they use SoundStorm, which they announced as an audio generation method, rather than old TTS methods. That genuine interaction makes it sound very realistic.
Yes, for sure they use newer technology there. Have you seen the latest release that synthesizes two voices in one go? cloud.google.com/text-to-speech/docs/create-dialogue-with-multispeakers
Apparently it's the same tech used with NotebookLM.
Informative! Podcast generators will put podcasters out of business!
Did it work for anyone in the end? I've spent over 8 hours on this and for some reason it still doesn't run. It's like I'm just fixing one error after another, but it never runs.
Hi Kavish,
from the errors you shared I see you didn't install the pip dependencies, like pyaudio and streamlit. "No module named" means you need to install that module.
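If it helps, here is a minimal sketch for checking which modules are missing before you start the app. It only uses the Python standard library. The `missing_modules` helper is a made-up name for illustration, and the list contains just the two modules mentioned above, so the project may need more than these.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # pyaudio and streamlit are the two modules named in this thread;
    # extend the list with whatever else the error messages mention.
    missing = missing_modules(["pyaudio", "streamlit"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

Anything it reports as missing can then be installed with pip, for example `pip install pyaudio streamlit`.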
Hi Sasha, any plans to do a Google Colab?
Journey voices have better prosody but they do not support SSML. Also they can omit words or insert random words erratically.
Yes, no support for SSML. Initially I had planned to let the Gemini model also generate proper SSML for the speech. Unfortunately there is no support for that on the speech synthesis side. But the good thing is we are now finally getting those voices in more languages.
Thanks, my brother, for your time and effort. I tried to get it working for almost an hour, but I am a beginner coder and not sure how to get the articles folder, as it is not synced when I clone the git repo.
Hope you can give me a tip on how to get one of the podcast transcripts and put it in the generator :)
Many thanks!
Dor
P.S. I subscribed and hit the bell to support you :)
Hi, thanks for the follow. You don't need the articles folder. In generate.py, reference your own .txt file. That's all you need. You can create a txt file yourself and add the content you want a podcast created for.
It's also not a podcast transcript that's in the articles folder. I just copied one of my articles into a txt file. That's all you need; any kind of text about a specific topic works.
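A rough sketch of that step, under some assumptions: how generate.py actually loads its input is not shown in this thread, so `load_article` and the file name `my_article.txt` are hypothetical names used only for illustration. The idea is simply to put your text into a .txt file and read it back as one string.

```python
import tempfile
from pathlib import Path


def load_article(path):
    """Read a plain-text article and return it without surrounding whitespace."""
    text = Path(path).read_text(encoding="utf-8").strip()
    if not text:
        raise ValueError(f"{path} is empty; add some text about your topic first")
    return text


if __name__ == "__main__":
    # Demo in a temporary folder so nothing in your project is overwritten.
    with tempfile.TemporaryDirectory() as tmp:
        article_path = Path(tmp) / "my_article.txt"  # hypothetical file name
        article_path.write_text("Any text about a specific topic.\n", encoding="utf-8")
        print(load_article(article_path))
```

In the real project you would point generate.py at the path of your own file instead of the temporary one used here.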
@ml-engineer Many thanks for your reply, I really appreciate it.
@DIY4Profit Did it work for you in the end?
Your AI voice is better than your real voice.
If AI-me sounds that good, maybe it should start paying my bills too.
Did you enjoy watching? Have you tried it?
@ What I meant was, your cloned AI voice sounds much more self-confident, and it sounded just perfect. A very relaxing, good voice to listen to. 😊👍
@ml-engineer I've tried it but I'm stuck on implementing Gemini Flash 002. It says the estimated price is $2700 per SDK? I don't want to pay that. Is it pay-as-you-go, or is that the price per month?
@iconicglashan7903 What do you mean per SDK? Gemini is billed by the number of tokens/characters you send and generate. $2700 is impossible for just one API call. Where did you get this number from?
@ml-engineer In Google Cloud storage, under the Vertex AI API, then within the Model Garden, I chose Gemini. Then I can't just deploy it; I have to place an order, which then gives me those numbers. I must be doing it wrong.
spiderman?
@AjayCoding Haha, yes. Halloween 🎃 with the kids.