📺 WATCH PART 2 - AI Cold Caller With Google Calendar: th-cam.com/video/J3d92Ak-P7o/w-d-xo.html
👉 GET THE CODE FOR FREE: bartslodyczka.gumroad.com/l/zsjdn
🛠 Hire me to build out an EPIC AI Voice Assistant for you: bart@supportlaunchpad.com
🧠 If you are interested in joining my incubator please fill out this form: forms.gle/KJxiqhB3aWxbgGoh8
📋 Take This Quick Survey: forms.gle/otAr1xUamgyYZE5y7
Don’t download the free code; it doesn’t work. Save yourself the frustration.
Works flawlessly. The peeps mentioning latency: it's most likely your connection. I have consistently achieved sub-1-second, almost realtime performance with this. Nicely done dude. Function calling would be neat, especially CRUD ops with a DB.
Noiccee!!!! 💪
You’re a legend mate, great work. I’m learning a lot from your videos. Thanks mate.
Thank you very much 🤝 keep going man 🚀🚀
Solid build man, amazing job!
Thank you legend :)
Very nice explanation, love watching your videos 👍
Thank you 💪
Yes 11 Labs, definitely!
Also, would love to see how you would implement a script rather than an FAQ.
Script is a solid idea, will do more thinking about this :)
Thanks for this video! I've been meaning to set this up with the real-time Twilio API, but just haven't gotten to it yet. Been using Vapi but it's so expensive. I would like to see how to transfer a call to a real person, or actually book an appointment in a Google Calendar. Definitely ElevenLabs integration too!
Great suggestions, google calendar keeps coming up so I will also look into this :)
Amazing video
I have one question: why are we using Replit? Can we deploy it on our own servers, like EC2, and what would we need to change to do so? Thank you.
Thank you so much, amazing video. I look forward to your other videos. I'm looking to create super reliable appointment-booking AI assistants, and I would definitely appreciate a video on that subject. Thank you.
Great suggestion my man!
Awesome! Thank you for sharing this. I have big plans for you.
Shit yeah!! 💪
Awesome! Thanks for sharing. I will definitely give it a try
Woot woot! Enjoy :)
Very cool stuff! Function calling would be nice to see. 👍
Thank you, and done, will pencil this in 💪
FIRE CONTENT AS USUAL
Thank you Viski 💪
This is amazing work! How does this compare in intelligence to the OpenAI realtime API?
The realtime API is MUCH better and if you can afford it, I would use that. The main reason is that the backend of the realtime API is a built-in thread, so you're having a conversation with an “agent”, whereas in this setup we're sending calls to the completions endpoint along with the entire conversation history. So it's still very good, but inherently it is not an “agent” (so to speak). For basic calls/tasks this current setup works great :)
@ Appreciate that! Also there's the conversion delay. I wish the realtime API was cheaper and had other voices.
Can you interrupt the current voice response? Or can you finish your thought if you didn't manage to say it in full and the agent already started its voice response? Like saying “continue”, which would interrupt the response while keeping the previous input prompt and allowing you to properly finish it.
I implemented command words like this using Microsoft Azure Speech Services with continuous voice recognition.
+1 for adding function calling
You can do interruptions, and toward the end of my video in the final demo I interrupt and continue speaking about the same topic, and the response was in line with what I was saying. The mechanism that sends API calls to the GPT actually holds all conversation items (user messages and agent responses) and sends the entire history with each API call, so each response is always contextually correct. I don't know how efficient this process is, but it works for now. And I haven't thought about commands just yet, but good idea! And noted on function calling 🙏
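For anyone curious what that history mechanism looks like in code, here is a minimal sketch (Python with the OpenAI SDK; the function and variable names are mine, not necessarily what the actual repo uses):

```python
# Minimal sketch of the "resend the whole history every turn" approach.
# Assumes the openai Python SDK (>= 1.0) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# One growing list per phone call: system prompt plus every turn so far.
conversation = [
    {"role": "system", "content": "You are a friendly phone assistant."}
]

def get_reply(user_transcript: str) -> str:
    """Append the caller's transcript, send the ENTIRE history, store the reply."""
    conversation.append({"role": "user", "content": user_transcript})
    response = client.chat.completions.create(
        model="gpt-4o-mini",        # any chat model works here
        messages=conversation,      # full history, so every reply stays in context
    )
    reply = response.choices[0].message.content
    conversation.append({"role": "assistant", "content": reply})
    return reply
```

The trade-off is that the prompt grows with every turn, so long calls cost a bit more and eventually hit the context limit; trimming or summarising older turns is the usual fix.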
Great, it works! Can you expand on implementing function calling, and ElevenLabs or Cartesia as an alternative for TTS?
Awesome! And done, will pencil it in 💪
Yes, really wanna see function calling, like booking appointments and transferring calls. Btw, isn't it easier to do with LiveKit?
Good suggestions, will pencil them in 💪 I've never used LiveKit before, will check it out :)
@@BartSlodyczka Bro, you can handle a lot with LiveKit more easily. Make sure you check it out. You'll thank me later, that's how good it is.
When I interrupt, the agent stops talking. Is there some kind of bug? I think it has to do with the speakerphone. When I put my phone call on speaker, the agent does not reply with audio after the third or fourth interaction. But when I take the phone off speaker it works fine.
Hmm, that is strange. When I demo'd the interaction on YouTube I had it on speaker and I had multiple conversation turns (so I spoke many times and the AI replied many times). Not really sure what it could be 🙏
Hello Bart,
If you were to use Deepgram's TTS streaming service instead of plain REST API calls, wouldn't the response time be faster?
Hey legend, yes you're 100% correct, it would be even faster than standard REST API calls. I think using ElevenLabs streaming would be faster still. So really, there is so much opportunity in this code to have a really fast, really cheap AI Caller 💪
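As an illustration of the idea, here is a rough sketch of streaming the TTS audio back to the call as it is generated instead of buffering the whole file first. The endpoint, model name, and parameters are assumptions based on Deepgram's Aura REST API, so check them against the current docs; `send_audio_chunk` stands in for whatever writes audio to the Twilio media stream:

```python
# Hedged sketch: stream Deepgram TTS audio chunk-by-chunk instead of waiting
# for the complete file. Endpoint/model/params are assumptions -- verify them.
import os
import requests

DEEPGRAM_TTS_URL = "https://api.deepgram.com/v1/speak?model=aura-asteria-en"

def stream_tts(text: str, send_audio_chunk) -> None:
    """Forward TTS audio to `send_audio_chunk` as soon as bytes arrive."""
    headers = {
        "Authorization": f"Token {os.environ['DEEPGRAM_API_KEY']}",
        "Content-Type": "application/json",
    }
    with requests.post(DEEPGRAM_TTS_URL, headers=headers,
                       json={"text": text}, stream=True) as resp:
        resp.raise_for_status()
        for chunk in resp.iter_content(chunk_size=4096):
            if chunk:
                send_audio_chunk(chunk)  # start playback before TTS finishes
```

The same chunk-forwarding pattern applies to ElevenLabs' streaming endpoints, which is where the extra speed mentioned in the reply would come from.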
Why not use OpenAI's realtime API? Just because of the voices, right? Please pardon my ignorance.
I’ve got other videos showing how to do that too 💪 but the realtime API is currently like 30 cents per minute to run, and since it’s still in beta it has some stability issues. But the realtime API is very fast and I’m sure all the kinks will be ironed out soon :) great question to ask, legend
@@BartSlodyczka
Cool tech demo, but let's think twice about automating every customer interaction just because we can. Sure, AI phone systems are cheaper than human staff, but real human connection in customer service is priceless. Personal relationships, genuine empathy, and human judgment are what build lasting customer loyalty. Maybe instead of replacing humans, we should use AI to help them do their jobs better? Sometimes the 'old way' with real people is still the best way, even if it costs more than 1¢ per minute. 🤔 Great tutorial though - the technical implementation is impressive!
Thank you and excellent point. For pretty much my entire journey with AI I’ve had this assumption/belief that initially businesses will adopt AI to save costs and have faster experiences, but then when everyone uses AI, the question will become “what is actually a good support experience?” And for that I think businesses will revert back to human support. It might not be 100% human, but maybe 50/50 with AI and humans. Either way, I still use a 100% human customer support team for my ecommerce brand, but I do give my agents AI tools and augment other parts of our support experience with AI (e.g. an AI chatbot, AI search on our help desk). I agree the tech is cool but we should use it wisely 💪 love the comment, I always want to see this kind of discussion 🤝
I couldn’t find your video where you lay out how to use AI to help real humans do their jobs. Any help?
What about the MANY times customer service doesn't give a damn about their job and treats customers as if they were asking for a favor? What about the long waiting times? What about the lack of good manners?
It’s priceless when you’re employing “customer service”, not lazy employees.
It's the Pareto 80/20 rule. 80% of CS requests are easily manageable and answerable through the various channels (bots, agents, knowledge base, etc.). You then augment this with the human experience for the 20% of more involved support and service requests.
Hello Bart! Do you think it's possible to create something like this for the Polish market? But without using Twilio, because their rates are crazy.
Siema! I'm not sure which Twilio alternatives work in Poland, but you should be able to forward calls from the provider to the Replit code :) And I'm pretty sure you can also change the language to Polish, so then you'd have a mega AI Caller 💪
Can this AI agent also speak different languages, or is it restricted to English only?
Haven't tested but should be able to speak in different languages!
@@BartSlodyczka Tried it; it doesn't come out as good as ChatGPT, but it definitely works. I just added a line "you can understand and reply in Punjabi" to the prompt haha. The bottleneck in this pipeline is Deepgram's transcription.
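For reference, switching languages usually means touching two places in a Deepgram + GPT pipeline: the STT options and the system prompt. The sketch below (using Polish as the example, per the earlier thread) is a hypothetical configuration using the Deepgram Python SDK's `LiveOptions` as I understand it; verify the option names and whether your chosen model actually supports the target language before relying on it:

```python
# Hypothetical sketch of the two language settings in a Deepgram + GPT pipeline.
# Option names follow the Deepgram Python SDK's LiveOptions; verify against the
# docs, and check the model's supported-language list for your target language.
from deepgram import LiveOptions

# 1) Tell the STT model what language to expect (e.g. "pl" for Polish).
#    A poorly supported language gives poor transcripts, which is the
#    "bottleneck" described in the comment above.
stt_options = LiveOptions(
    model="nova-2",
    language="pl",
    encoding="mulaw",     # Twilio media streams send 8 kHz mu-law audio
    sample_rate=8000,
)

# 2) Tell the LLM to reply in the same language via the system prompt.
system_prompt = "You are a phone assistant. Understand and reply only in Polish."
```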
Hmm, what about using a GSM modem for calling? AT commands and you're at home. Or use a VoIP gateway. Second thought: I was thinking about building an app for the same purpose, but my main goals are to be independent (self-hosted) and to make it as 'realistic' as possible with low latency. Using external APIs is too easy; building the whole thing from scratch is a good challenge to get to know all the LLM/AI stuff.
I have heard of people using a local LLM to run the backend, and it is possible, fast, and cheap if you do it this way. I haven't looked into this yet but there may be other videos about this online already. As for calling with a GSM modem or VoIP, great ideas!
Need to figure out how to make that reasoner model that formulates the text think on graph now hmm
Very interesting 🤔
Can we do this by connecting it to a custom GPT?
Yes you can, but this will be slightly more unstable, as the Assistants API is in beta (and there are like 5 or 6 API calls per request).
@@BartSlodyczka That makes sense. Thanks a lot for taking the time to respond!
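To make the "5 or 6 API calls per request" concrete, here is a rough sketch of the round-trips a single caller turn needs with the beta Assistants API (using the openai Python SDK; `ASSISTANT_ID` is a placeholder, and the beta interface may change):

```python
# Sketch of why the Assistants API adds latency: several round-trips per turn
# instead of one completions call. ASSISTANT_ID is a placeholder value.
import time
from openai import OpenAI

client = OpenAI()
ASSISTANT_ID = "asst_..."                       # placeholder, set to your assistant

thread = client.beta.threads.create()           # call 1 (once per phone call)

def assistant_reply(user_transcript: str) -> str:
    client.beta.threads.messages.create(        # call 2: add the caller's turn
        thread_id=thread.id, role="user", content=user_transcript
    )
    run = client.beta.threads.runs.create(      # call 3: start a run
        thread_id=thread.id, assistant_id=ASSISTANT_ID
    )
    while run.status not in ("completed", "failed", "cancelled", "expired"):
        time.sleep(0.5)
        run = client.beta.threads.runs.retrieve(  # calls 4..n: poll until done
            thread_id=thread.id, run_id=run.id
        )
    messages = client.beta.threads.messages.list(thread_id=thread.id)  # final call
    return messages.data[0].content[0].text.value  # newest message comes first
```

Each of those round-trips adds network latency on top of the model's own response time, which is why the plain completions endpoint is snappier for a live phone call.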
Thanks, what about Deepgram Voice Agent API Real Time?
Haven't thought about this before! Nice suggestion 💪
I would like to see more of the infrastructure side, like how to set up a small call-centre structure.
Very interesting suggestion! I will do more thinking about this 💪
Hi Bart... First of all, thank you... Secondly, are you going to extend this video, like adding functions/tools? That's the main purpose of building these callers.
Hey legend! Yeah I will make a part 2 video with function calling 💪
@@BartSlodyczka Thanks, legend Chief...
An outbound agent, please? In a way that we can schedule multiple calls one after another, to different customers.
Great suggestion, will pencil it in!
Function calling for booking appointments.
Penciling it in 💪
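Since function calling for booking keeps coming up in the thread, here is a hedged sketch of how it could look with the chat completions `tools` parameter. `book_appointment` and its fields are invented for illustration; the real tool would call Google Calendar, a CRM, or a call-transfer routine:

```python
# Hedged sketch of function calling for appointment booking.
# `book_appointment` and its parameters are hypothetical examples.
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "book_appointment",
        "description": "Book an appointment for the caller.",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {"type": "string", "description": "Caller's name"},
                "start_time": {"type": "string", "description": "ISO 8601 start time"},
            },
            "required": ["name", "start_time"],
        },
    },
}]

def reply_or_book(conversation: list[dict]) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini", messages=conversation, tools=tools
    )
    message = response.choices[0].message
    if message.tool_calls:                                # model chose the tool
        args = json.loads(message.tool_calls[0].function.arguments)
        # ...call your real calendar/CRM here with `args`...
        return f"Booked {args['name']} for {args['start_time']}."
    return message.content                                # normal spoken reply
```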
Sorry, but how is this 1 cent per minute? I'd really love to know how you came to that conclusion.
I calculated the number of transcription minutes (STT) along with the characters spoken (TTS) via Deepgram, then I compared this to the total cost spent on Deepgram. This came to ~0.89 cents per minute (so under 1 cent). From there I looked at the OpenAI API usage for the same period, which was negligible. So then I decided to just say it was 1 cent total. Hope this makes sense 💪
thanks:)
Always 🤝
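A back-of-the-envelope version of that calculation, with illustrative rates (assumed to be in the ballpark of Deepgram's pay-as-you-go pricing; plug in current prices and your own talk ratio before quoting a number):

```python
# Rough per-minute cost estimate for the STT + TTS pipeline.
# Both rates below are assumptions for illustration, not quoted prices.
STT_PER_MIN = 0.0043          # $ per minute of transcription (assumed)
TTS_PER_1K_CHARS = 0.0150     # $ per 1,000 TTS characters (assumed)

# Assume the full minute of caller audio is streamed to STT, and the agent
# speaks roughly half the minute (~375 characters of rendered text).
stt_cost = 1.0 * STT_PER_MIN
tts_cost = (375 / 1000) * TTS_PER_1K_CHARS

total = stt_cost + tts_cost   # LLM token cost is negligible on top
print(f"~${total:.4f} per minute")   # ≈ $0.0099, i.e. about 1 cent
```

The exact figure moves with how much the agent talks, which is why the measured number above (~0.89 cents) differs a little from this sketch.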
🔥🔥🔥🔥🔥🔥🔥
Letsss goooo 💪💪
ElevenLabs plz
This has 2-second latency; it didn't work for me.
It can be even faster with the streaming API for Deepgram TTS, and even faster with ElevenLabs streaming TTS.
The problem is with Sanju, not the app.
@zubairkhankharooti3621 You try it. Let me know if you are able to get 1-second latency. Text-to-speech and speech-to-text WITH interruption support did not work from India. But I do want it to work. I will retry and post my findings. If it works, then awesomeness 👌
@mmdls602 mentioned he tried it and it worked for him. Let me find the fault in my deployment.
But this definitely has HORRIBLE turn-taking, emotion detection, and latency...
Or am I wrong? That's what the secret sauce of Retell and Vapi is :)
Yeah, the value prop here is the 1-cent-per-minute cost, and I agree that other purpose-built tools like Retell and Vapi are better at the backend operations of AI calling systems 💪