Creating J.A.R.V.I.S.
- Published on 27 Sep 2024
- A sneak peek of a voice-to-voice chat assistant.
🦾 Discord: / discord
☕ Buy me a Coffee: ko-fi.com/prom...
🔴 Patreon: / promptengineering
💼Consulting: calendly.com/e...
📧 Business Contact: engineerprompt@gmail.com
Become Member: tinyurl.com/y5h...
💻 Pre-configured localGPT VM: bit.ly/localGPT (use Code: PromptEngineering for 50% off).
Sign up for Advanced RAG:
tally.so/r/3y9bb0
All Interesting Videos:
Everything LangChain: • LangChain
Everything LLM: • Large Language Models
Everything Midjourney: • MidJourney Tutorials
AI Image Generation: • AI Image Generation Tu...
Wooohooo!! Yeah, can't wait for it! ⭐️
It's fast. Which TTS and STT did you use?
All OpenAI.
Impressive, thanks!
What TTS are you using, and is it running locally?
Whisper, but via the API. Nothing is running locally in this video. A local version will be coming soon.
@engineerprompt loved it 👍
@engineerprompt but Whisper is ASR, not TTS??
Gross.
Someone already made a fully local version that works with little latency and supports voice training. There are already projects on GitHub for continuous speech that use a keyword to trigger recording, and a version with a PTT (push-to-talk) implementation instead of a keyword.
I don't get it, how's that different from GPT-4o?
You are right, very similar in functionality. In fact, this version is using GPT-4o for text generation. But the voice functionality is not available in GPT-4o yet.
Very interesting project! Do you use any VAD to detect the end of the request?
At the moment no.
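For readers wondering what end-of-request detection could look like, here is a minimal sketch of an energy-based approach. It is not what the video uses (the author says there is no VAD at the moment), and the function names, threshold, and frame counts are all assumptions chosen for illustration. A production system would more likely use a trained VAD model.

```python
import math
from typing import Iterable, Optional, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one audio frame (samples in [-1.0, 1.0])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def end_of_speech_index(
    frames: Iterable[Sequence[float]],
    threshold: float = 0.02,       # assumed energy cutoff; tune per microphone
    min_silent_frames: int = 10,   # assumed length of the trailing pause
) -> Optional[int]:
    """Return the index of the frame where the request is considered finished.

    The request ends once speech has been heard and is followed by
    `min_silent_frames` consecutive frames below `threshold`.
    Returns None if that never happens.
    """
    heard_speech = False
    silent_run = 0
    for i, frame in enumerate(frames):
        if rms(frame) >= threshold:
            heard_speech = True
            silent_run = 0          # any speech resets the pause counter
        elif heard_speech:
            silent_run += 1
            if silent_run >= min_silent_frames:
                return i
    return None


# Usage: 5 silent frames, 20 loud frames, then 15 silent frames.
silence = [0.0] * 160
speech = [0.5] * 160
frames = [silence] * 5 + [speech] * 20 + [silence] * 15
print(end_of_speech_index(frames))
```

Leading silence is ignored on purpose, so the detector does not fire before the user has started talking.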
Nice!
Wahooo..really looking forward to your new project!
thank you!
Please make a beginner-friendly, step-by-step tutorial on how to integrate this with localGPT 🙏🙏
Yes please! Is it going open source?
I don't know why, but there has been a folder named Jarvis-v6 on my desktop for 5 months, and surprisingly it does the same job 😮
Would love to see what's in the folder :D I am at v0 now.
@engineerprompt It's gonna get interesting. I thought I was the only one who had cracked speech while streaming to reduce the latency.
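One common reading of "speech while streaming" is to start speaking as soon as the first full sentence of the generated reply is available, instead of waiting for the whole reply. The sketch below shows only the text side of that idea: it regroups streamed text chunks into sentences that could each be handed to a TTS engine. The function name and the simple punctuation rule are assumptions, and this is not taken from the commenter's or the author's code.

```python
import re
from typing import Iterable, Iterator

# A sentence ends at '.', '!' or '?' followed by whitespace (a simplification:
# abbreviations such as "e.g. " would be split too).
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        buffer = parts[-1]  # keep the unfinished tail for the next chunk
    if buffer.strip():
        yield buffer.strip()  # flush whatever is left when the stream ends


# Usage: chunks arrive split at arbitrary points, as a streaming LLM would send them.
for sentence in sentences_from_stream(["Hel", "lo there. How", " are you? I am", " fine."]):
    print(sentence)
```

Each yielded sentence can be synthesized while the model is still generating the next one, which is where the latency saving would come from.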
That doesn't sound like Jarvis, I want the real Jarvis voice!!!
Good point, I think ElevenLabs has that. Will try to integrate that :)
@engineerprompt How about adding a little UI as well? And maybe a button to take continuous screenshots at a regular interval. That way, you will be releasing OpenAI's demo app before OpenAI.
Is there a way to speed it up?
Yes, Groq has Whisper support now. I'm going with that, but the issue is the rate limit!
Try using rhasspy3 as a base. It streams audio directly to the ASR model.
How is it different from GPT-4o voice?
That is not available yet :)
What apis are being used?
Currently everything is OpenAI. I just got access to Whisper from Groq, will update it, and hope it will be much faster!
@engineerprompt Great! Looking forward to the tutorial or git repo. Literally yesterday I was searching about Jarvis haha
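The replies in this thread describe three stages: speech recognition (Whisper), text generation (GPT-4o), and speech synthesis. Here is a minimal sketch of how such a pipeline could be wired so that one stage can be swapped for another provider without touching the rest. The class and field names are hypothetical, and the stages are plain callables with stand-in implementations, so no real API is called.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class VoicePipeline:
    """Three pluggable stages of a voice-to-voice assistant (names are illustrative)."""
    transcribe: Callable[[bytes], str]   # speech-to-text (ASR / STT)
    generate: Callable[[str], str]       # text generation (LLM)
    synthesize: Callable[[str], bytes]   # text-to-speech (TTS)

    def respond(self, audio_in: bytes) -> bytes:
        """Run one turn: audio in, audio out."""
        text_in = self.transcribe(audio_in)
        text_out = self.generate(text_in)
        return self.synthesize(text_out)


# Stand-in stages so the sketch runs without any network access.
pipeline = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    generate=lambda prompt: f"You said: {prompt}",
    synthesize=lambda text: text.encode("utf-8"),
)
print(pipeline.respond(b"hello"))

# Swapping only the ASR stage, e.g. for a different Whisper host.
faster = replace(pipeline, transcribe=lambda audio: audio.decode("utf-8").strip())
print(faster.respond(b"  hello  "))
```

Keeping the stages separate also makes the ASR versus TTS distinction raised earlier in the comments explicit: Whisper would sit in `transcribe`, never in `synthesize`.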
Should edit the title to add "using OpenAI".
Not local. Not the Jarvis voice. Misleading title. Disappointed.
Why do you think it is not local? The only downside is that he does not use voice streaming to make it faster (I did that).
Nice, but it would be great without that annoying 2-3 sec delay.
I agree, I just got access to Groq Whisper. Will be interesting to see how that works.
@engineerprompt George Hotz called Groq a scam on stream...
Also, I request a video comparing this with GPT-4o.
Right on Bro, RIGHT ON. ......... but we need the voice of Cortana for this, for when we are sitting around in our Mark V Armor and coding...:)
:)
Great, looking forward to it.
thanks
EXCITED!
:)
I LIKE IT GREAT JOB
thank you :)