Thank you for putting this out so quickly! I've been wanting to try something with the new release.
Thank you 🙏
Just FYI, they have been consistently using a lowercase "o" in the model name so it's clear it's the letter. Otherwise it looks like GPT forty or 4.0, which aren't right.
Thank you for the clarification! I'll make sure to consistently use the lowercase "o" in the next videos.
Absolutely CRAZY!! This is some really interesting/exciting stuff.
It really is!
Is it possible to save the entire conversation to a local DB? Then, between conversations, before starting a new one, the code pulls the previous conversations and feeds them into the prompt first, so the "girlfriend" has all the previous context (memory) to pull from.
Yup, there are some tutorials out there for it
There is an easy way to do this by using autogen teachable agents. Please have a look at th-cam.com/video/szYeaUlsaNY/w-d-xo.htmlsi=DhT82kIjkLsxHSiL
Amazing! Thanks for sharing!
Is there any way to get that magical voice expression as shown in the OpenAI presentation?
That's a very good question. On the official page of the text-to-speech guides (platform.openai.com/docs/guides/text-to-speech), it still states the old text: 'There is no direct mechanism to control the emotional output of the audio generated'.
@@ai-for-devs maybe not emotionally related outputs, but you could implement a system with natural interrupts
Great! Can I make it answer only from my own specific data?
Answer based on SQL table data or something?
I was thinking the same thing: persist all the conversations into a local DB table, then each time the "gf" is instantiated, feed it the previous conversations via the prompt so it has "memory/context". Granted, this could amount to a ton of data over time. Alternatively, give the chatbot a method to search the local DB using keywords on the table data, so it would be easier to parse through. Similar to the AI car repair bot he made.
There will be a video next week about using vector stores with GPT-4o at ai-for-devs.com
@@ai-for-devs sweet!
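For anyone who wants to try the idea from this thread before that video, here is a minimal sketch of conversation memory with Python's built-in sqlite3. The table and column names are illustrative, not from the video; a real app would use a file path instead of ":memory:" so the data survives between sessions.

```python
import sqlite3

# Persist each exchange, then replay the stored turns into the prompt of
# the next session so the bot starts with "memory" of past conversations.
conn = sqlite3.connect(":memory:")  # use e.g. "memory.db" to persist on disk
conn.execute("CREATE TABLE IF NOT EXISTS conversation (role TEXT, content TEXT)")

def remember(role, content):
    """Store one turn of the conversation."""
    conn.execute("INSERT INTO conversation VALUES (?, ?)", (role, content))
    conn.commit()

def recall():
    """Return every stored turn as chat-completion style messages."""
    rows = conn.execute("SELECT role, content FROM conversation").fetchall()
    return [{"role": r, "content": c} for r, c in rows]

remember("user", "My favorite movie is Her.")
remember("assistant", "Great choice!")
messages = recall()  # prepend these to the next session's message list
```

As the thread notes, this grows without bound over time, which is exactly where a vector store (retrieving only the relevant turns by similarity instead of replaying everything) becomes the better fit.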
Good video! One question: can you talk about how to make an app for Android/iOS when they release the API with vision and audio modes? I think that using low-code tools like FlutterFlow would make it easy. I think an app for mobile devices would be much more effective than one for the web.
Great idea. Let me see.
It will take a few weeks before they release the full new voice system and API, as well as the video function and API. When they do, you should update this project. It will be a lot more immersive, and it should be able to literally watch a movie with you, comment on what it sees, and have a very detailed emotional voice with very fast response times. :)
Thank you for clarifying that, @arianaponytail! I can't wait either. It's exciting to know that the vision capabilities can already be used with the new model and API.
Great tutorial as always. I can't find the code on your GitHub. Can you provide it, please?
Sure, please send an email to sebastian@ai-for-devs.com with your GitHub alias or join www.skool.com/ai-for-devs/about for free access to all courses and source code (only today).
Hey, can we change the voice by using any open-source model?
Absolutely, you can easily replace Whisper and the audio API call with any open-source model of your choice, or use a service like Replica, which provides easy access to such models.
FYI this isn’t using the new conversation mode that they rolled out with 4o, this is the old version.
We're using the GPT-4o model for the chat completion endpoint due to its faster inference speed, rather than the GPT-4 or GPT-4 Turbo models. As of now, I don't believe there is a way to use GPT-4o via API directly with sound files. If there is a method, please let me know.
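For reference, the pipeline described above can be sketched roughly like this, assuming the official openai Python client is passed in as `client`; the model and voice names are examples, not prescriptions:

```python
# Three-step pipeline: Whisper for speech-to-text, GPT-4o (text-only chat
# completions) for the reply, and the TTS endpoint for audio output.
# `client` is assumed to be an openai.OpenAI() instance.

def transcribe(client, audio_path):
    """Speech-to-text via the Whisper endpoint."""
    with open(audio_path, "rb") as f:
        return client.audio.transcriptions.create(model="whisper-1", file=f).text

def reply(client, history, user_text):
    """Append the user turn, ask GPT-4o for a reply, and record it."""
    history.append({"role": "user", "content": user_text})
    completion = client.chat.completions.create(model="gpt-4o", messages=history)
    answer = completion.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer

def speak(client, text, out_path):
    """Text-to-speech via the audio endpoint, written to a file."""
    response = client.audio.speech.create(model="tts-1", voice="nova", input=text)
    response.stream_to_file(out_path)
```

The audio never reaches GPT-4o directly here; it is converted to text first, which is why the commenters above point out this isn't the new native voice mode.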
Thank you, very good!
Phenomenal
Is there a way to create such an AI gf or persona with an offline model? With fast response times etc., that updates itself with the latest internet knowledge when needed?
Creating an offline AI girlfriend or persona with fast response times and high-quality inference is challenging due to the significant hardware requirements.
The computational power needed for real-time processing and maintaining up-to-date knowledge is substantial, making it difficult to achieve the same performance as cloud-based solutions without high-end hardware.
However, you can keep everything local that should be private by using a local vector store for sensitive data. This hybrid approach allows you to balance performance and privacy by running core models offline and updating online as needed.
@@ai-for-devs thx. I don't have internet access all the time, so an offline alternative would be awesome. Also, the one thing bothering me when using GPT-4 is the long response times, as it has to check the web. For an AI companion, it doesn't necessarily need the most recent news to be socially useful.
Maybe I'm wrong, but there seem to be more advantages to an offline AI persona.
Faster responses, since it doesn't have to check the web (also no queues); versatile, since it can be used anywhere (not reliant on constant internet access); cost-effective, since there are no monthly fees; secure, since it runs locally; (kinda) up to date, with recent internet updates integrable at will, say once a month (topics adjustable to the AI persona); etc.
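To make the "local vector store for sensitive data" idea concrete, here is a deliberately tiny sketch: bag-of-words vectors with cosine similarity, entirely in the standard library, so private data never leaves the machine. A real setup would swap in proper embeddings from a local embedding model; everything here is a toy for illustration.

```python
from collections import Counter
from math import sqrt

def embed(text):
    """Toy embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class LocalStore:
    """In-memory store: sensitive text stays on this machine."""
    def __init__(self):
        self.docs = []

    def add(self, text):
        self.docs.append((embed(text), text))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[0]), reverse=True)
        return [text for _, text in ranked[:k]]

store = LocalStore()
store.add("My bank PIN reminder is the year we met.")
store.add("We watched a movie last Friday.")
best = store.search("what movie did we watch")[0]
```

In the hybrid setup described above, only the retrieved snippets you choose to share would ever be sent to a cloud model; the store itself stays offline.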
👍 but Xai's #Grok is missing…
How could we say a phrase to end the conversation script?
You could add a condition like:
...
print(transcription.text)
if "goodbye" in transcription.text.strip().lower():
print("Goodbye detected, exiting loop.")
break
@@ai-for-devs Would I place this at the beginning of the while True: loop?
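Yes, the check belongs right after the transcription step inside the loop, before any reply is generated. A rough sketch, with `get_transcript` standing in for the actual record-and-transcribe call (that name is made up for illustration):

```python
def should_exit(text, stop_word="goodbye"):
    """True once the stop word appears in the transcribed text."""
    return stop_word in text.strip().lower()

def run(get_transcript):
    # `get_transcript` stands in for the recording + Whisper step.
    while True:
        text = get_transcript()
        print(text)
        if should_exit(text):  # check immediately after transcription,
            print("Goodbye detected, exiting loop.")
            break              # before generating a reply
        # ... generate and play the assistant's reply here ...
```

Checking before the reply step means the bot exits cleanly instead of answering your goodbye first.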
Can we discuss the recent football match with the AI?
Please do it and send me a screenshot 😂
@@ai-for-devs I mean, does the AI watch football? So we can discuss football news with it.
If you use Whisper, you don't use the multimodal aspect of GPT-4o. It's not the real deal.
Please see the discussion in arianaponytail's earlier comment.
+1, the title is misleading. I've been clickbaited.
Very good! However, I get this error after the bot has created output.mp3 on my first question; when I then ask a follow-up question, this error occurs. So it cannot change output.mp3 once it has been created.
Traceback (most recent call last):
File "F:\BOTS\her-gpt4o\app.py", line 32, in
response.stream_to_file('output.mp3')
File "C:\Users\atzek\.conda\envs\her\lib\site-packages\typing_extensions.py", line 2636, in wrapper
return arg(*args, **kwargs)
File "C:\Users\atzek\.conda\envs\her\lib\site-packages\openai\_legacy_response.py", line 423, in stream_to_file
with open(file, mode="wb") as f:
PermissionError: [Errno 13] Permission denied: 'output.mp3'
To avoid this issue, you can create temporary files or change the filenames based on timestamps or an increment. This way, each response gets a unique file name, preventing conflicts.
@@ai-for-devs i'm not a coder :( how?
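The smallest change is to stop reusing the fixed name 'output.mp3' and generate a fresh name each time. A sketch of the timestamp-plus-increment approach; the helper name `unique_output_path` is made up for illustration:

```python
import itertools
import time
from pathlib import Path

# A fresh file name per reply, so the media player can keep the previous
# .mp3 open without blocking the next write (the PermissionError above).
_counter = itertools.count()

def unique_output_path(directory=".", prefix="output"):
    stamp = time.strftime("%Y%m%d-%H%M%S")
    return Path(directory) / f"{prefix}-{stamp}-{next(_counter)}.mp3"

# then, instead of response.stream_to_file('output.mp3'), use:
# response.stream_to_file(str(unique_output_path()))
```

The counter guarantees uniqueness even when two replies land within the same second; old files can be cleaned up on startup if disk space matters.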
Can I find a free GPT-4o key? I want to test my code just one time. $20 is expensive for this 😅
I'm guessing you have to be a paying customer?
Correct
$2.8 Billion in 2023 🤯🤯
And this will triple this year
Any AI boyfriend?
AI boyfriends? Absolutely! Just like AI girlfriends, you can create an AI boyfriend with the same tech. He’ll be attentive, always remember anniversaries, and never complain about watching romantic comedies.
My friend, can I wire you $$$$ for a project consultation? Please help me create a BRAZILIAN GIRLFRIEND, I'm Brazilian!!!