Make an Offline GPT Voice Assistant in Python

  • Premiered Jul 5, 2024
  • We make our own offline (local) virtual voice assistant using Python that lets you control your computer and ask it anything!
    This is yet another great example of how open source software can be incredible for anyone. Without having to rely on any API or sending our data to any servers we can make a pretty solid offline virtual voice assistant for free!
    Windows File Path: C:\Users\{username}\.cache\whisper
    Mac File Path: /Users/{username}/.cache/whisper
    Commands:
    curl -o encoder.json openaipublic.blob.core.window...
    curl -o vocab.pbe openaipublic.blob.core.window...
    GPT4All: gpt4all.io/index.html
    Code: github.com/Jalsemgeest/Python...
    Join my Discord at / discord
    Thanks for watching! ❤️
    Timestamps:
    0:00 Intro
    0:39 Speech Recognition
    3:30 Offline OpenAI Whisper
    12:00 Text to Speech
    14:20 Local LLM
    23:04 Outro
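The cache directory listed above can be computed portably in Python. A minimal sketch, standard library only, assuming Whisper's default model cache of `~/.cache/whisper` (Whisper can also honor `XDG_CACHE_HOME` if it is set):

```python
from pathlib import Path

def whisper_cache_dir() -> Path:
    # Whisper's default model cache: ~/.cache/whisper, which matches
    # both the Windows and Mac paths listed in the description.
    return Path.home() / ".cache" / "whisper"

# The curl commands in the description download their files here.
# Note: the correct vocab filename is vocab.bpe, not vocab.pbe.
encoder_path = whisper_cache_dir() / "encoder.json"
vocab_path = whisper_cache_dir() / "vocab.bpe"
```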

Comments • 41

  • @iyas5398 · 27 days ago

    If you have a problem with the vocab file download: it's vocab.bpe, not vocab.pbe. Just change this in the curl command and it should work fine.

    • @jakeeh · 27 days ago

      Thanks for the comment!

  • @ishmeetsingh5553 · 3 days ago

    Awesome video! Quick question: since you already used the speech_recognition library, why didn't you use its recognize_whisper method instead of the whisper library?

  • @joshuashepherd7189 · 4 months ago +1

    Heyo! Awesome Video! Thanks so much for doing this man. So insightful

    • @jakeeh · 4 months ago

      Appreciate it! Really happy you enjoyed it :)

  • @EduGuti9000 · 4 months ago +1

    Awesome video! I am mainly a GNU/Linux user and have recently been using MS Windows as well, so maybe this is a silly question: are you running that in WSL2? If so, is it easy to use the microphone and speakers with Python in WSL2?

    • @jakeeh · 4 months ago +1

      Thank you!
      I'm running this on Windows. You might need to tinker around a bit more on GNU/Linux to get the microphone input working, but it shouldn't be too bad. I've seen a number of cases where Linux users were using microphone input.
      Happy coding :)

  • @mohanpremathilake915 · 2 months ago +1

    Thank you for the great content

    • @jakeeh · 2 months ago +1

      Thank you! ❤️/ Jake

  • @jacklee4691 · 2 months ago +1

    Thanks for the awesome video! Just curious: if I want to make the offline Python text-to-speech more realistic with a model (like one from Hugging Face), is that possible?

    • @jakeeh · 2 months ago

      Yeah, it should be possible! There are some great Ollama models available now too :)

  • @MyStuffWH · 4 months ago +2

    Just out of interest. Do you have a GPU in your machine (laptop/desktop)? That would give some context to the performance you are getting.

    • @jakeeh · 4 months ago +2

      Great question! I have an AMD Radeon RX 6800, so certainly not top of the line. Also, in my experience a lot of GPU-accelerated things have only worked with NVIDIA, with AMD being a 'TODO' on the developers' side :)

  • @adish233 · 4 months ago +1

    As part of my engineering project, I want to make a similar voice assistant specifically for agriculture, one that answers farmers' queries and also gives crop suggestions based on local conditions. Can you please guide me through the project?

    • @jakeeh · 4 months ago +4

      Wow! That sounds like a great project! I'm not sure I could guide you through the project, but you may want to try to find a machine learning model that is more specialized on plants and agriculture. You could even look into making one yourself if you have enough training data! :)

    • @joannezhu101 · 2 months ago

      @@jakeeh I am so curious to know how to train a domain-knowledge-only model; that would be brilliant. There must be a way of doing it. I am also learning AI for fun outside of my day job.

  • @vidadeperros9763 · 2 months ago +1

    Hi Jake. Where do you import pyautogui from?

    • @jakeeh · 2 months ago +1

      Hey, you need to install it using pip.
      Run "python -m pip install pyautogui", then you can just import pyautogui in your file.
      Make sure you use the same Python when running your file as when you install with pip.
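pyautogui is how the assistant in the video controls the computer; one simple way to wire recognized speech to such actions is a phrase-to-handler dispatch table. A hypothetical sketch (the phrases and handlers are illustrative, not the video's actual code):

```python
def dispatch_command(text: str, handlers: dict) -> bool:
    """Run the first handler whose trigger phrase appears in the
    recognized text. Returns True if a handler fired."""
    lowered = text.lower()
    for phrase, handler in handlers.items():
        if phrase in lowered:
            handler()
            return True
    return False

# Illustrative handlers; in the real assistant these might wrap
# pyautogui calls, e.g. pyautogui.hotkey("ctrl", "s").
log = []
handlers = {
    "open browser": lambda: log.append("browser"),
    "save file": lambda: log.append("save"),
}

dispatch_command("Please save file now", handlers)  # fires "save file"
```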

  • @snapfacts41 · 27 days ago

    I think Gemma could be a better option than this, because I don't think it would have the token restrictions that GPT4All had, and it's pretty easy to install using Ollama. Even with an integrated GPU from 5 years ago, I was able to get a comfortable experience with the LLM.

    • @jakeeh · 22 days ago +1

      Yeah, at the time Ollama wasn't easily available for Windows. There are definitely some better options available now.

  • @joshuashepherd7189 · 4 months ago +1

    4:36 I think it's Video RAM: basically the RAM available on whichever GPU you're using for inference.

    • @jakeeh · 4 months ago +1

      Yeah I think you're right. Thanks! :)

  • @wethraccoon9480 · 2 months ago

    Please do more advanced versions of this. I am a web dev and would love to start integrating my own voice assistant; I'm just a bit of a newbie to AI.

    • @jakeeh · 2 months ago

      Thanks for the comment! Yeah, I'd be happy to do some more stuff on this. I think the new version would use Ollama, although I'd like to go over how to train your own model too.

  • @inout3394 · 3 months ago

    Thx

    • @jakeeh · 3 months ago

      Thanks for your comment!

  • @lucygelz · a month ago

    Is this possible on Linux, and if so, can you make a tutorial or link a text guide to something similar?

    • @jakeeh · a month ago +1

      Yes, absolutely, you should be able to do this on Linux as well. You could take a look at medium.com/@vndee.huynh/build-your-own-voice-assistant-and-run-it-locally-whisper-ollama-bark-c80e6f815cba, which uses Ollama, probably a great option nowadays too :)

    • @lucygelz · a month ago

      @@jakeeh thank you

  • @yashikant5819 · 3 months ago

    Can you combine it with a frontend?

    • @jakeeh · 3 months ago

      You could absolutely make a front end for this so you could interact with it through a GUI and/or voice.

  • @prabhatadvait6171 · 2 months ago

    I'm using Linux; can you tell me how to do it on Linux?

    • @jakeeh · 2 months ago

      Which part are you having trouble with in Linux? :)

  • @MyStuffWH · 4 months ago +1

    It is (very) clear you do not have a technical AI background, but you inspired me to try and make my own local assistant. Thanks!

    • @jakeeh · 4 months ago +1

      Oh, absolutely! I certainly have a technical background, but I'm far from having much experience beyond scratching the surface of AI. Happy you felt inspired to give things a shot! You've got to start somewhere! :)

  • @fnice1971 · 3 months ago

    I mostly used LM Studio, as it loads multiple models; with 3x 24 GB GPUs (70 GB VRAM) you can run around 10 models at the same time. It's more polished than GPT4All, but both work and are free.

    • @jakeeh · 3 months ago +2

      Oh that sounds great! I certainly don't have those specs, but that does sound great nonetheless.
      Thanks for your comment! :)

    • @joannezhu101 · 2 months ago

      @@jakeeh I wonder if it's worth comparing Ollama, LM Studio, and the way you've shown in the video, though I don't quite get how Ollama or LM Studio works (I thought GGUF was the only way to work with local offline models; I didn't know what's inside Ollama). Do they really help speed things up?

    • @jakeeh
      @jakeeh  2 หลายเดือนก่อน

      I think it really depends on the hardware of your machine. If they can utilize your GPU, they can likely greatly improve performance, although I'm not an expert on them :)