Hey Bart, I appreciate the time and effort you put into this and the real-time API videos. I believe that the OpenAI voices are not really built for the customer service use case. Perhaps you could look into retellAI or BlandAi as a sort of continuation of your series on real-time voice assistants? I believe that their voices would be much more suitable for customer service use cases. Anyway, thanks for all the effort. Great stuff!
Thank you legend 🙏 Good suggestion on retellAI - I've been hearing more and more about it, and it could be a really interesting next step. Thanks for the support my man!
Thanks for covering the new voices Bart. I like the Sage voice too. I'm testing it for use in my business. I have a different Twilio number for each ad, so each number gets a different prompt relevant to that ad. I need to test asking the assistant to recognize the caller's language and reply in that language. Lots of Spanish speakers in my area.
My pleasure legend :) One thing I didn't show in my tutorials, but might be useful for you: go to this link: platform.openai.com/docs/api-reference/realtime-client-events/session/update. You'll see the configuration (the code on the right-hand side), which is similar to what we set in the previous Replit code. One thing I don't set in my code is the "instructions" field, which by default is set to something like "Your knowledge cutoff is 2023-10. You are a helpful assistant." It would be good to test passing through custom instructions here too (separate from the main prompt we use in the code). Hope this helps and good luck! 💪
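A rough sketch of what passing custom instructions could look like, assuming the `session.update` event shape shown on the linked reference page (a `type` plus a `session` object with `instructions` and `voice`). The phone numbers, prompt text, and the `build_session_update` helper are made up for illustration, and the API may have changed since, so check the field names against the current docs before relying on them.

```python
import json

# Hypothetical mapping: one Twilio number per ad, each with its own prompt.
AD_PROMPTS = {
    "+15550000001": "You answer calls about the spring roofing promotion.",
    "+15550000002": "You answer calls about the free HVAC inspection offer.",
}
DEFAULT_PROMPT = "You are a helpful assistant for our business."

# Assumed wording; whether the model reliably follows it needs testing.
LANGUAGE_RULE = (
    "Detect the language the caller is speaking and reply in that same language."
)


def build_session_update(dialed_number: str, voice: str = "sage") -> str:
    """Return a session.update event (as JSON text) for the dialed number."""
    prompt = AD_PROMPTS.get(dialed_number, DEFAULT_PROMPT)
    event = {
        "type": "session.update",
        "session": {
            "voice": voice,
            # Overrides the default "You are a helpful assistant" style text.
            "instructions": f"{prompt} {LANGUAGE_RULE}",
        },
    }
    return json.dumps(event)


if __name__ == "__main__":
    print(build_session_update("+15550000001"))
```

The returned string would be sent over the already-open Realtime WebSocket right after connecting, in the same place the previous code sends its session configuration.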
For business apps, I like the approach where the voice is synthetic enough for the user to instantly know they're talking to a computer. Otherwise, for real-human-sounding voices (like on the NotebookLM level), there ought to be a disclosure. Or make it a live call, like a contact center call, where it's AI-led but human-supervised. Most businesses want that feature anyway, like the ability to pick up a call mid-voicemail. But yeah, as for the currently available Realtime API voices, they seem a bit too theatrical.
Excellent breakdown. I actually don't have much experience with other AI voices and was impressed by these v2 voices compared with the original three v1 voices. I'll check out NotebookLM to start building a better baseline. And I absolutely love the description of "theatrical" - hit the nail on the head! Thanks for the nuance and recommendation 💪
The only benefit of realtime is that it streams audio inputs directly instead of text tokens, so it uses its own voice synthesis. Until they can match ElevenLabs' voices, there's not much utility. The whole point of this tech is for it to sound as close to a natural human as possible. And we gotta wait for prices to drop too. But I'm def gonna be selling this to businesses once it meets these requirements.
Agree with your points. Hopefully within the next 3-6 months the AI voice landscape picks up (especially for OpenAI, as I love the ecosystem). Keep it up man!
😎
Greetings from Texas
Greetings!
They totally suck unless you’re making a cartoon! 😂
Yeah some are cartoon-y but that's a cool use case! What platform has good voices? I haven't explored much just yet 💪
@BartSlodyczka ElevenLabs using a custom-made voice is probably best. But it's not realtime with OpenAI behind it.
Let me help you... they all suck for the use case you were trying them on. They all sound like they are reading a book or like a voice-over for a cartoon: overly theatrical. This can't be used in any real-world context.
Fair enough, can you link voices that you think do a great job so that myself and other viewers can check them out?
@BartSlodyczka Is there no way to use our own voices?