I Built a Personal Speech Recognition System for my AI Assistant

The AI Hacker

259,484 views

Published on 21 Jul 2024
This video shows you how to build your own real-time speech recognition system with Python and PyTorch. It walks you through the deep learning techniques that are effective when modeling speech problems, as well as code to build your own.
⭐ Play and Experiment With the Latest AI Technologies at grandline.ai ⭐
This video is the second episode of the series "How to build your own A.I. voice assistant with Pytorch"
• Build an AI Voice Assi...
Github:
github.com/LearnedVector/A-Ha...
Pre-Trained ASR Model:
drive.google.com/file/d/1jcNO...
Science & Technology

Comments • 302

@totoma3297 3 years ago ⁺²⁴⁹
this is michael reeves from the universe where he decided to do something useful with his life
@isawcornflakes6201 3 years ago ⁺⁹
LMFAOOOO DIDNT HAVE TO DO HIM LIKE THAT 😭☝️
@aliveandwellinisrael2507 3 years ago ⁺⁴
6:57 yep
@UmbraAtrox_ 2 years ago ⁺³
Wow, that's mean bro.
@justinross2664 2 years ago
th-cam.com/video/iyl53zyz5zk/w-d-xo.html
@DayoBrandon 2 years ago ⁺¹
Imagine the greatest Michael colab. The two of them plus Michael Stevens (vsauce)
@zacknawrocki 4 years ago ⁺⁴⁵
I've been looking forward to this part of the series the most! I've been trying to create/run a voice assistant locally, and could not figure out how to apply speech recognition without relying on Google's Python module (which i was trying to avoid for privacy reasons, defeating the purpose of making one) and the HMM basics in my Intro to AI course weren't enough to implement it. This is fantastic.
@joeyrivenbark5056 2 years ago ⁺¹
Hey man, I really like how you have written definitions in addition to your speaking, helps a lot.
@akulgoel9259 1 year ago ⁺³
This is so good, I remember seeing this video a year ago and wishing he'd continued the series.
@victor7ultimate 3 years ago
After watching this video, I literally took off my hat as a mark of respect to this.
Cant thank you enough.
Thanks a million
@OtRatsaphong 2 years ago ⁺²
Wow, just discovered your channel. Great work. I'm just starting my journey into Deep learning and speech recognition. Will be following your progress.
@Alex.In_Wonderland 1 year ago
omg, thank you! every other video I look up on this subject is just an ad for a text-speech readers! thanks for going into such detail about your thought process, buut after looking at the rig you have vs the one I've got ... well. . . if it took you a handful of days, it'd take me a week or two LOL great video! thanks a lot!
@fteoOpty64 3 years ago ⁺⁴
Loved the high speed speech part!. Well done. Excellent production Mike!. TQ
@neilosborne8682 2 years ago
This is excellent! (subscribed!)
I had to quickly brush up my skills for a project I'm working on (will be open sourcing it soon!) - and this video was short, sweet and to the point! Thanks
@rahulkumarm1446 3 years ago
Brooo.I really dont know whether u coded this or just took reference from something....idrc u are AMMMMMAAAAZZZZIIINGGGGG.Hats off 2 u.U have a great talent man.......u could be the next ceo of any big fours too....
@CraftClone1 3 years ago ⁺²
This is awesome! I wish there was more content from you
One Ai hacker to another, keep on going!
@smeagol92055 3 years ago ⁺⁸
I'm building my own wearable AI assistant and this series is **exactly** what I was looking for! Great stuff!
@strange5700 3 years ago
Can you make tutorial
@PonchoManOG 1 year ago
dude no way same
@morraza3307 1 year ago
@@PonchoManOG does this tutorial still work?
@PonchoManOG 1 year ago
@@morraza3307 yes
@kalyanstock8058 1 year ago
Wow...who knew you can make AI teaching so much fun....You should make more videos
@theroyal1914 3 years ago ⁺²⁹
we need programmers like you. For advance learning.
@briankim49 3 years ago ⁺²
Loved the video. You really showed me the tools I could use to build my own speech recognition model!
@davidkim2389 3 years ago ⁺⁴
When next?? Best Series ever!! Please post next!!
@jairojosy5985 3 years ago
Keep going on and finish the project fast. I'm looking ahead for the project to be finished
@fteoOpty64 3 years ago
Love your War Machine!. I build my first Pentium Pro Dual Proc decades ago. It had a special powersupply and I had to rig my Generic case to fit the Tyan motherboard!. It ran Linux then.
@benceelmokovacs1422 2 years ago
Wo hooo!
This thing for FREE?! And help for us how to make it ours?!
This data worth a HUGE amount of money, but you shared it! I'm so much surprised, in the good term!
Thanks, thanks, thanks for it!!
I really want to make an own Virtual Assistant, so big thanks for this video, for the data and for the help!
Be blessed!
@chenjus 3 years ago ⁺¹
Really dope video. Can't wait to see your next one.
@JoshuaHerath 3 years ago
This video is so high quality wish you uploaded more
@jtlunsford780 1 year ago ⁺¹
Totally awesome. Understood about .5% (that's point 5%). Just got my headset set up in Win 10 and am loving it. You're awesome and I bow to your knowledge and expertise....thanks for the cool vid. It was not wasted on my limited knowledge, but it peaked my interest...thanks again...JT
@kevinrtres 2 years ago
Thanks for the information. Just goes to show that the idea that we evolved is just sheer madness.
@jumbejolly3129 2 years ago
Man your a genius man. I wish I could do this. I have some many ideas but dont know where to start.
@sreerajsathish3635 3 years ago
Omg the video i was looking for thank for making one..... Full support❤
@sirlightshadowslayer473 7 months ago
This was insane, gonna try to do similar now, thank you for the informations
@mtaneesh1411 3 years ago ⁺⁶
This was a really good video dude. Can you tell me how to make the soundwave display that you had while testing the model
@gauravshipurkar1570 2 years ago
Bro you are freaking awesome!!! i love your content, helps a lot.
@user-jj8qh5lm7u 2 years ago
i think this is a very good video for me ,It can not only let me learn some knowledge, but also make me feel relaxed.thank you
@chrisw1462 3 years ago ⁺¹
A Cue Stick - used for playing billiards. Acoustic (a-COO-stick) - dealing with sound or audio energy.
@alexkonopatski429 3 years ago
this series is so cool! keep it up bro
@swarajshinde3950 4 years ago ⁺³
Loved it Man , Great Video !
@michealhall7776 3 years ago ⁺¹
I'm enjoying discovering all these smaller ai channels
@thiscrow 3 years ago ⁺³¹
at the beginning of the video: Oh I see !
6:57 : Oh I ... oh ...
@JasonTRogers 2 years ago ⁺¹
Hey Michele, your videos on AI is fantastic! I haven’t seen any videos lately and I am course what you are doing these days?
@adeniyiadeboye3300 4 years ago ⁺¹
Thanks for this..I am going to thoroughly go through the speech recognition your code on Github
@CreateYourWorld1 3 years ago ⁺¹
Planning on creating my own Jarvis, this video has given me an insight.
@s1krrpilot 2 years ago
Same, I'm going to call mind Alfred and integrate it into my helmet
@jasminecheung1998 1 year ago
This is a helpful video. I have a question regarding to the audio augmentation. In my project, the test speaker is not in the train data, so my model performers pretty bad on test set,only 50% accuracy. I try to use the pitch shift to agument my train data but doesn't works well. How should I use audio augmentation for this dataset?
@DasToastbrotToast 3 years ago
Which hardware accelerator are you going to use, if any? As far as I know the net would be quite slow on the Pi itself hence a hardware accelerator like the Intel NCS or Google Coral would be useful, wouldn't it?
@alexandergrayson9856 3 years ago ⁺¹
Hey pal, your work's great I love it 🙌🙌
@vladiklass1890 3 years ago ⁺¹
Cool video!!! This will help me a lot with my first NLP project. I wanted to get radio voice data and transcribe it. Any tips on that?
Btw you should come up with a more memorable outro! :D
@vicehaiti914 4 years ago ⁺³⁷
Keep going bro.full support
@itumelengmothapo2456 3 years ago
thank you man... this was fun to watch
@notoltrexclearly2690 3 years ago
so it is possyble to make the virtual assistant write on another command prompt instead of talking, to use it with a custom text to speech AI? would love to see that
@tomhamser7216 3 years ago
Could you show the code in detail or how I can use it with another model? Could I use a deepspeech model for testingit, too?
@dineshlamarumba4557 3 years ago
which ASR framework did you used? What is your thought on fairseq wav2vec for this purpose?
@emrehankaraoglu4122 2 years ago ⁺⁴
This is such a amazing video. Congrats! I am wondering about model deployment part. Are you going to share the coding part of web interface? The sound wave and the text that occurs below the sound wave are awesome.
@adibakhan2865 10 months ago
Hey did you found the code for deployment
@muhammadrezahaghiri 3 years ago ⁺⁷
Can you make a TTS using deep learning? :) I really want to see that.
@SivaShankarsss 3 years ago
Eagerly waiting
@yashrajhawle4 3 years ago
Thank you for sharing your knowledge !
@user-mx2rl7gc5k 3 years ago
Hi i work on a little project with (VOSK asr) and i have an issue , on the first its work good , but on timeline it will be slower on the translating words , im not talking about accuracy but on the time of responding
i thinking about loop of while (Python) mybe overload or somthing , can you help me please ? , thanks
@guidoscalise 2 years ago
What books/material would you recommend to someone wanting to learn to design models like the one you’re detailing around 7:36?
@shannonsteward4034 1 year ago
hi great work I just found your channel great job
@hemanth8195 3 years ago
This is really nice work dude
@DavidAlvesWeb 3 years ago
what a great video man, really inspiring! keep up the good work!
PS: you deserve a better t-shirt bro 😅
@itsjustsam04 3 years ago
Wow I love it! I do have two questions tho. 1 how did you run it in ur Chrome browser. 2 how did u get the cool visual effects for while u were speaking?
@kimkubik7547 2 years ago ⁺¹
You Michael Rock!!!! Way to teach!!!
@ambeshwari1317 1 year ago
bro which python version are you using in this project because in python 3.10.4 it is showing that cannot import textprocess from utils
@tripathi26 4 years ago
This is awesome! thanks man.
@zikpin 3 years ago
This is what i was looking for, thanks
@seannam1218 4 years ago ⁺¹⁰
This is incredibly educational. Thx for sharing ur knowledge for free!
@himanshuchaturvedi7402 3 years ago
can we built it for multiple languages like if we speak in french or any other language it will also convert it in text?
@MrWD-ge3eb 3 years ago
I'm getting an error while installing torch. How to resolve it??
@nathancook8452 2 years ago
Excellent video, you helped me out tremendously
@elektroprogramming 2 years ago
what's the different with speech_recognition library that we can use without training the data?
@PritishMishra 3 years ago ⁺¹¹
Why aren't you uploading more videos? I have already seen this video just came here to say... plzz upload it's been 7 months now!
@mohammadrezakhalilishoja2701 3 years ago ⁺¹
how did you up-sampled data to create 50 hrs from 1 hr?
@adityagarg6874 4 months ago
bro please explain ho did you make that wave with line of your voice
@kellbooby265 3 years ago ⁺³
Can u make a AI voice assistant for Linux or Windows.
Which we can train
@vijaysoni9182 3 years ago
Hey, Can I use this to get details from my business software? Can this crawl any software and output the data from it?
@redtako. 8 months ago
THIS ONE WAS REALLY FUNNY gj love keep up the uploads :)
@angelgabrielortiz-rodrigue2937 3 years ago
Wao, great video man. Really awesome stuff
@pranavthakur6744 2 years ago
Can you make a detailed video how did you manage to make it. I want to learn it.
@UttamDas-ub5ow 3 years ago ⁺¹
This man is really a hero 👍💓
@tomhamser7216 3 years ago
Did you have another tutorial as source or how did you do that?
@salimbo4577 3 years ago
can we use transformes type architecture for audio type data ?
@diegomartin6332 3 years ago
Please post more videos about this!
@ZpErMy 1 year ago
Hello, I was fascinated with your Speech Recognition System. I wonder, could your system recognize sung musical notes?
that is, instead of words, musical notation.
@swait239 1 year ago
Can I use deep learning to create a voice activated keyboard? If so, can you please provide some insight?
@peacekeepermoe 3 years ago
Great content dude. I haven't seen anything new for the last 7 months though. Hope you're well :)
@soonapaana24 3 years ago
You are totally awesome bro...👏👏👏
@maryamnazari1281 11 months ago
great job! i want to train a speaker identification project..any ideas where to start?
@rodios-md5du 1 year ago
You are gold💛
@justinfuruness7954 3 years ago
Do you have any recommendations for how to learn AI? How long did this take to train?
@Tera2Space 1 year ago
Hello, when will there be a guide to creating your own speech synthesis?
(TTS)
@NathanaelNewton 1 year ago
This looks like exactly what I need!
Thanks for posting, I'm gunna follow along and watch tonight.
One question.. Why are you using the auto generated subs on this video 😂😁
@vincebelansky425 9 months ago
Thank you for this video and the insight of how to design a voice recognition system independently from the ground up by an newly to AI. Most videos tell you to connect the internet and to a big server by google or someone else. The only question that I have is why use python and not C or C++, especially since you are running a raspberry pi with limited memory and slower CPU and the natural time restraints of real-time speech recognition?
@ahsanbulbul8512 3 years ago
*Curious;*
Isn't it possible (coding side) to train on YouTube videos? They also got Subtitles (for verification), random noise and music (for more efficiency) and Unlimited videos to train from.
And does YouTube legally stop me from doing so?
@AlanJames1987 1 year ago
Good video but are you using Linix at 9:31 and Windows at 9:34? I haven't used Windows in a few years so I didn't know you could do this.
@microgamawave 2 years ago
You can make a video about gait recognition biometrics in python
recognized you from your walk model
@doddianil6946 5 months ago
can you explain how to make authorized voice assistant it means voice assistant respond only owner of the device command by identifying his voice how to make that plz explain
@tomgarcia5664 1 year ago
Would this work with a language that is not pre-trained?
@muratfazli2422 1 year ago
will it work if i take from Common Voice another language pack?
@scarlett_j 1 year ago
Sorry to inform you, but you pretty much rock, at the same time solved this so I don't have to.
@tilahunanagaw6175 1 year ago
what interesting presentation it is!!!
@w3w3w3 1 year ago
your videos are great bro! 🤝
@mantiga31 3 years ago
i got this error when training the data
File "C:\Users\USER\anaconda3\envs\USER\lib\site-packages\pytorch_lightning\trainer\connectors\optimizer_connector.py", line 47, in update_learning_rates
monitor_key = lr_scheduler['monitor']
KeyError: 'monitor'
pytorch_lightning.utilities.exceptions.MisconfigurationException: ReduceLROnPlateau requires returning a dict from configure_optimizers with the keyword monitor=. For example:return {'optimizer': optimizer, 'lr_scheduler': scheduler, 'monitor': 'your_loss'}
Any suggestion on solving this?
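The exception text in the comment above already names the fix: when ReduceLROnPlateau is used, configure_optimizers has to return a dict that includes a monitor key. Below is a minimal sketch of that return shape. The helper name optimizer_config and the metric name val_loss are illustrative assumptions, not taken from the video's repository, and the metric has to match one the module actually logs. Plain strings stand in for the real optimizer and scheduler so the snippet runs without torch installed.

```python
from typing import Any, Dict


def optimizer_config(optimizer: Any, scheduler: Any, monitor: str = "val_loss") -> Dict[str, Any]:
    """Build the dict shape that the error message asks configure_optimizers to return.

    `monitor` names the logged metric that ReduceLROnPlateau should watch;
    "val_loss" is only an assumed default for illustration.
    """
    if not monitor:
        raise ValueError("monitor must name a logged metric, e.g. 'val_loss'")
    return {"optimizer": optimizer, "lr_scheduler": scheduler, "monitor": monitor}


# Hypothetical use inside a LightningModule (left as a comment so this
# file has no torch dependency):
#
#     def configure_optimizers(self):
#         optimizer = torch.optim.AdamW(self.parameters(), lr=1e-3)
#         scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min")
#         return optimizer_config(optimizer, scheduler, monitor="val_loss")

if __name__ == "__main__":
    # Stand-in values keep the demo dependency-free.
    print(optimizer_config("opt", "sched"))
```

Later PyTorch Lightning releases document a nested form in which monitor sits inside the lr_scheduler dict, so the exact shape that works may depend on the installed version.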
@MrDonald911 3 years ago ⁺⁴
Hey ! I just discovered your channel, nice content ! Your model seems overfitting, I think you should evaluate it on a test data (and not the validation). I would be curious to know how it would perform if you do hyperparameter tuning.
@hg4lyfe 3 years ago
Bro this is perfect wow thanks
@stereopsych6381 1 year ago
Please upload more!
