Sign Language Detection using ACTION RECOGNITION with Python | LSTM Deep Learning Model

Nicholas Renotte

Views: 453,150

Published on 23 Jan 2025

Comments • 1.1K

@BINARAI-q4r 11 months ago ⁺²⁴
00:01 This video demonstrates sign language detection using action recognition with Python.
01:40 The video discusses the process of sign language detection using action recognition and LSTM deep learning model.
05:16 MediaPipe Holistic allows us to get key points from face, body, and hands
07:17 Setting up webcam access and rendering frames using OpenCV
11:06 The code captures frames from a webcam and displays them on the screen.
12:46 Setting up MediaPipe Holistic and creating variables for MediaPipe Holistic and MediaPipe Drawing Utilities
16:46 The video explains the process of color conversion in sign language detection.
18:32 The process involves detecting sign language using MediaPipe and a deep learning model.
21:59 The video discusses the different types of landmarks in sign language detection using action recognition.
23:23 The video explains how to detect and visualize different types of landmarks using MediaPipe.
27:05 The video discusses how landmarks in facial and body pose can be connected to each other.
28:37 Implementing sign language detection using LSTM deep learning model in Python
32:18 Landmarks are drawn and rendered in real time using image pass and cv2
33:55 You can customize the formatting of the dots and connections in Sign Language Detection using a landmark drawing spec and a connection drawing spec.
37:32 Updating pose and hand landmarks with different colors and parameters
39:31 Different models in action: left hand, right hand, face, and pose.
43:09 The code demonstrates how to extract landmark values using pose estimation.
45:04 The video explains how to reshape and convert landmarks into a single array.
48:27 Building a neural network and extracting key points using action recognition with Python
50:10 Setting up error handling and placeholder arrays for pose and face landmarks.
53:52 The video explains how to extract key points for sign language detection using LSTM deep learning model in Python.
55:31 Concatenating pose, face, left hand, and right hand keypoints for sign language detection.
59:11 Using LSTM Deep Learning Model to detect sign language actions
1:00:57 Creating folders to store data for different actions and sequences.
1:04:16 Creates a folder structure for sign language detection using action recognition with Python.
1:05:48 Collecting data using MediaPipe loop and capturing snapshots at each point in time.
1:10:14 The code is outputting text to the screen and taking a break at frame 0.
1:11:44 The first block of code prints starting collection in the middle of the screen and pauses.
1:15:10 The code collects key points by looping through actions, sequences, and frames.
1:16:39 Implementing sign language detection using action recognition with an LSTM deep learning model.
1:20:25 Sign language detection using action recognition with Python
1:23:20 Using MediaPipe to collect key points for sign language detection
1:26:55 Creating a dictionary to map labels to numeric ids
1:29:08 Sequences represent feature data and labels represent y data
1:32:26 Data preprocessing and training and testing partitioning are important steps in sign language detection using LSTM deep learning model.
1:34:12 Training LSTM neural network using TensorFlow and Keras.
1:38:14 The model uses LSTM layers for sign language detection.
1:39:55 The next three layers are dense layers using fully connected neurons.
1:43:16 The video discusses the process of formulating a neural network for sign language detection using action recognition and LSTM deep learning model.
1:44:58 Training the model with 2000 epochs
1:48:11 The training accuracy is high at 93.75% after 173 epochs.
1:49:37 The model has three LSTM layers and dense layers, with a small number of parameters to train.
1:53:13 Reloading a deleted model and evaluating its performance using scikit-learn.
1:55:03 Converting y test and y hat values to matrices and then evaluating the model performance using a confusion matrix and accuracy score.
1:58:18 Implementing prediction logic by concatenating data onto sequence and making detections when 30 frames of data are available.
2:00:30 Implement logic to grab the last 30 sets of key points for generating predictions.
2:04:18 Implementing visualization logic and checking result threshold and sentence length
2:06:45 The code checks if the current action matches the last sentence in the string.
2:10:05 Sign language detection using LSTM deep learning model
2:12:39 The video discusses sign language detection using action recognition with Python using an LSTM deep learning model.
2:17:13 The video discusses sign language detection using action recognition with Python
2:19:14 Sign Language Detection using Action Recognition with Python
2:22:29 To ensure accurate action detection, the last frame needs to be included in the sequence.
2:24:04 The code implementation adds stability by checking if the last 10 frames have the same prediction.
Crafted by Merlin AI.
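Editor's sketch of the prediction logic summarized at 1:58:18, 2:00:30 and 2:24:04 above: keep only the last 30 frames of keypoints, predict once 30 are available, and accept a label only when the last 10 predictions agree. This is a minimal illustration, not the video's actual code. `StablePredictor` and the `classify` callable are hypothetical names; in the real project `classify` would wrap the trained model.

```python
from collections import deque

WINDOW = 30      # frames per prediction, as in the 2:00:30 step
STABLE_FOR = 10  # identical predictions required, as in the 2:24:04 step


class StablePredictor:
    """Buffers keypoint frames and reports a label only once it is stable."""

    def __init__(self, classify, window=WINDOW, stable_for=STABLE_FOR):
        self.classify = classify                # hypothetical: frames -> label
        self.frames = deque(maxlen=window)      # keeps only the newest frames
        self.recent = deque(maxlen=stable_for)  # newest predictions

    def push(self, keypoints):
        """Add one frame; return a label if stable, otherwise None."""
        self.frames.append(keypoints)
        if len(self.frames) < self.frames.maxlen:
            return None                         # not enough frames yet
        self.recent.append(self.classify(list(self.frames)))
        full = len(self.recent) == self.recent.maxlen
        if full and len(set(self.recent)) == 1:
            return self.recent[-1]
        return None
```

With these defaults, the first label can only appear on the 39th frame: 30 frames to fill the window, then 9 more to collect 10 matching predictions.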
@girishkemba3865 3 years ago ⁺²⁵
I remember requesting this type of video some time ago, and seeing that it's finally here brings me joy. Can't wait to do this and show it to my sign language friends.
@NicholasRenotte 3 years ago ⁺³
I know right, it's taken a while but finally it's here! Thanks for sharing!
@pradeepsaravanan7712 2 months ago
Did you show it to your friends?
@savi-2084 11 months ago ⁺³
I cannot thank you enough for all the videos you create. I was a noob in tech when I started watching your videos a year ago, and I am so proud of you and myself for coming this far. This project works for me ❤
@kanchanpatil9642 2 years ago ⁺⁸⁸
As someone following this in 2023, here are some changes. I'll be editing them in as they pop up while I go through the tutorial.
25:42 FACE_CONNECTIONS seems to have been renamed/replaced by FACEMESH_TESSELATION. Since we want just the outlines of the face, FACEMESH_CONTOURS is what we need in this project.
@taredje4664 2 years ago ⁺³
Thanks, you saved me.
@VanderlanAlves7 1 year ago
Wow! Thank you very much!!!
@interstellarstar3742 1 year ago
Hey, I can't collect data. How do I save it?
@stinger9231 8 months ago ⁺¹
Thank you so much, got stuck there for a minute.
@simranmehta2778 2 months ago
Thank you so much!
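Editor's sketch for the rename described in this thread: a small helper that looks up the face connection set under the newer name first and falls back to the older one, so the same notebook can run against either layout. `face_connections` is a hypothetical helper, and the two `SimpleNamespace` objects are stand-ins, not the real mediapipe package.

```python
from types import SimpleNamespace


def face_connections(holistic_module):
    """Return the face connection set under whichever name the module exposes."""
    for name in ("FACEMESH_CONTOURS", "FACE_CONNECTIONS"):
        connections = getattr(holistic_module, name, None)
        if connections is not None:
            return connections
    raise AttributeError("no known face connection constant found")


# Stand-ins for an older and a newer module layout (hypothetical objects,
# not the real mediapipe package).
old_style = SimpleNamespace(FACE_CONNECTIONS=frozenset({(0, 1)}))
new_style = SimpleNamespace(FACEMESH_CONTOURS=frozenset({(0, 1), (1, 2)}))

print(len(face_connections(old_style)), len(face_connections(new_style)))
```

In the tutorial notebook, the helper would presumably be called with the holistic solution module, and the result passed to the drawing call in place of the hard-coded constant.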
@Stacio6 3 years ago ⁺³
Hi Nicholas, thanks so much!!!! I am creating a model to help deaf people here in my country. Greetings from Guatemala!!!
@NicholasRenotte 3 years ago
Awesome stuff!!
@aminberjaouitahmaz4121 3 years ago ⁺¹⁵
Thank you for these clear, practical, straight to the point tutorials! Looking forward to your future videos!
@NicholasRenotte 3 years ago ⁺²
Cheers @Amin, so pumped you're enjoying them!
@aqsaqamar1634 2 years ago ⁺¹
@Nicholas Renotte can you tell me why this error is coming up: module 'mediapipe.python.solutions.holistic' has no attribute 'FACE_CONNECTIONS'
@charank2894 6 months ago
@@aqsaqamar1634 replace FACE_CONNECTIONS with FACEMESH_CONTOURS
@yohanessatria2220 3 years ago ⁺²²
Man, you are so underrated and deserve a lot more! Thanks a lot for these awesome learning materials! I have learned a lot from you. Keep inspiring, man :)
@NicholasRenotte 3 years ago ⁺¹
Thanks so much @Yohanes! So glad you're enjoying them 🙏
@rainymatch 8 months ago ⁺¹
It's so cool to see how happy Nicholas is when everything works in the end. That's the spirit! Amazing video, thanks a lot for your work man!
@김미소2-z2w 1 year ago ⁺³
Thank you so much! You are my best teacher in my college life!!!!
@SABEDIT2914 10 months ago
Did you make this project?
@engeerdanisme 2 years ago ⁺²
Thank you @Nicholas Renotte, I just passed my capstone project defense utilizing this deep learning model.
@tech_voyager 2 months ago ⁺³
Dude, you are amazing. I just almost completed my graduation project!!
@malice112 2 years ago ⁺¹
Nicholas is the best machine learning YouTuber, his tutorials are interesting and fun.
@study_with_thor 3 years ago ⁺¹⁰
That's amazing! I watched this video more than a month ago but it seemed difficult for me as a beginner. Then I tried my best to finish Machine Learning / Deep Learning / Python / TensorFlow and some Data Science courses within a month. Now watching this video again is like watching a movie! It's easy to follow! Love it
@NicholasRenotte 3 years ago ⁺¹
YESSS! That's amazing that you stuck with it, great work man!!
@ruqaiyaali1645 3 years ago ⁺¹
You finished ML/DL/Python and Data Science courses within a month!! How is this possible, man? I am having a hard time with these courses 🥲
@study_with_thor 3 years ago ⁺¹
@@ruqaiyaali1645 I think you must be familiar with Python code. Make sure to practice more than what you learn.
@nguyenvietthai5868 1 year ago
@@study_with_thor are you Vietnamese? I see your name. Can you share some of your experience, please? If so, please respond to me.
Thanks a lot.
@study_with_thor 1 year ago ⁺¹
@@nguyenvietthai5868 Hi there, please let me know your concerns, I hope that I can help you too.
@Cheese_Academia 1 year ago ⁺²
Thanks for the amazing tutorials! Absolutely life-saving. Just a reminder that the z value from MediaPipe is relative to the wrist landmark, not the distance from the camera! I found out pretty late!
@gaddesaishailesh2772 3 years ago ⁺⁷
I was really waiting for this video!
@NicholasRenotte 3 years ago ⁺¹
IKR, it's taken a little while hey @Gadde Sai Shailesh!
@Rohan_is_discovering 8 months ago ⁺¹
Someone just completed his internship with the help of your code and also got a certificate from an IT company.
@danieladama8105 3 years ago ⁺²⁸
Can't lie.. I have learnt a lot from Nicholas
@NicholasRenotte 3 years ago ⁺⁵
My man! Thanks for checking in!
@PIKACHU-zn8fx 10 months ago
Agreed, still learning from him.
@giteshpal405 3 months ago
Did you add more actions to the dataset?
@Nikos_prinio 2 years ago ⁺¹
Hi! I'm impressed by the amazing clarity of your explanations. For one second I thought you must be a trained teacher robot....
@simranmehta2778 2 months ago ⁺³
Just completed this project right now. Feeling extremely motivated to do more projects in the future. I made a couple of changes: I used LSTM but with tanh as the activation function, and I used a Dropout layer and L2 regularization to prevent overfitting. At the end I added an audio feature, so if the model predicts 'hello', a hello audio clip plays.
@danishkhalid2027 2 months ago ⁺¹
Hey Simran
Did you add more words? And is your project detecting at sentence level and converting to text and voice?
I'd like to see your project; I currently want to work on it.
Would be great 😀
@PraisyJoy 18 days ago
Hey Simran, can you share your git repo?
@arpanroy2892 3 years ago ⁺¹
Every video of yours, slightly edited, goes directly into my CV 🤣🤣🤣🤣, thanks for taking care of my future ❤❤❤
@NicholasRenotte 3 years ago ⁺²
Hahaha, build that experience man and go get 'em!
@leafiadias96 3 years ago ⁺³
Thanks for this amazing tutorial sir, we are working on a project that needed this section, and your videos and explanations are extremely helpful to me and my team! Thanks a lot
@fawwazhameed1104 2 years ago
Heyy Leafia, could you tell me about your project?
@ishaanverma1969 1 year ago
This content is so underrated! Thank you so much!
@torstenknodt6866 2 years ago ⁺⁷
Thanks, great videos. Would be great if you could elaborate on the differences between the MediaPipe implementation used here and the others you mentioned. I mean a real comparison of the underlying models/networks and their training.
@stevecoxiscool 3 years ago ⁺²
Great explanation on how to use LSTM with pose coordinates.
@ibrahimalizada381 3 years ago ⁺³⁵
Hi, Nicholas! These are great video series to watch and learn! Thank you very much!
Can you please prepare a video applying CV to real-time sign language detection based on a ready-made dataset available on the Internet?
It may be much more interesting if we can see ViT in action recognition as well.
@VarunAditTheGreat 2 years ago
Hey, I am trying to build a project with a bigger dataset for ASL. Did you find any dataset?
@asutoshpatro2865 1 year ago ⁺¹
@@VarunAditTheGreat I have found one, it's the WLASL dataset.
Did you make it? Please share the code link.
@ruthogadina757 1 year ago
I'm learning about this, would you like to work on a project together?
@harrylee97625 3 years ago ⁺²
Nicholas certainly deserves more views.
@NicholasRenotte 3 years ago
Awww, thanks @Harry. Much appreciated man!
@dinukii3332 2 years ago ⁺⁸
Hi Nicholas! Thank you for your tutorial once again. Quick question: how can I change the code to read a folder that contains a dataset of videos instead of capturing them live? I'd really appreciate an answer :)
@NicholasRenotte 2 years ago ⁺¹
You could loop through each one of the videos by using os.listdir or the TensorFlow Dataset class, then run it through the MediaPipe Holistic pipeline!
@dinukii3332 2 years ago
@@NicholasRenotte Thank u :)
@HannahCynthia-mu4ct 2 years ago ⁺¹
Heyy. Do you know the exact code to loop through a video dataset?
@riadhaoufi9452 2 years ago
@@HannahCynthia-mu4ct I'm looking for it too, I hope he gets to answer. Thank you so much for the video brother @Nicholas Renotte
@riadhaoufi9452 2 years ago ⁺¹
@@NicholasRenotte I'm so lost brother :(
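Editor's sketch of the os.listdir suggestion in this thread, since several replies ask for it. It only covers finding the files and assumes a hypothetical layout with one sub-folder per action, each holding that action's clips. The function name, folder layout and extension list are assumptions, not something shown in the video.

```python
import os

VIDEO_EXTENSIONS = (".mp4", ".avi", ".mov")  # assumed formats


def list_videos(data_root):
    """Return {action_name: [video paths]} for each sub-folder of data_root."""
    videos = {}
    for action in sorted(os.listdir(data_root)):
        action_dir = os.path.join(data_root, action)
        if not os.path.isdir(action_dir):
            continue  # skip stray files at the top level
        videos[action] = [
            os.path.join(action_dir, name)
            for name in sorted(os.listdir(action_dir))
            if name.lower().endswith(VIDEO_EXTENSIONS)
        ]
    return videos
```

Each returned path could then be opened with `cv2.VideoCapture(path)` instead of the webcam index, reading frames until `read()` reports failure and passing each frame through the same keypoint extraction used for live capture.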
@rabiraj1387 3 years ago ⁺²
Long-awaited video, Nicholas. Hope to complete it and implement it on my side.
@NicholasRenotte 3 years ago
I know, can't believe it's finally out! Let me know how you go @Rabi Raj!
@MuhammadKamran-ow5vp 1 year ago ⁺⁴
I have a question. Is it possible to feed videos of arbitrary length (in frames) instead of a fixed-length video for each action? In real time, we perform sign language pretty fast and each action has an arbitrary length.
@latestdramas6351 7 months ago
Excuse me
@ibrahimhameem1334 3 years ago ⁺¹
Super stuff Nicholas! Super grateful for your tutorials 🙌🏻. Keep up the great work!
@NicholasRenotte 3 years ago
Thanks so much @Ibrahim, soooo much more to come!
@tigre1217 2 years ago ⁺⁷
Hi Nick! Nice tutorial on this sign language recognition program. I ran into a problem where the categorical accuracy stays the same when I try to add more signs to the model than the 3 you used in the video. Is there any way to solve this issue? Thanks!
@rryann088 2 years ago ⁺¹
Hi, are you still working on it?
@labhjoshi3182 2 years ago
@@rryann088 same question
@yousseffarhan8901 3 years ago ⁺¹
You can't imagine how much you helped me. Thank you very much 🙏🏼
@NicholasRenotte 3 years ago
🙏
@pritishmair9577 3 years ago ⁺⁵
Is there a dataset available for this which has more signs than these 3? If so, it would be really great if someone could share it.
@vaibhav607 3 years ago
Please, can you reply on the status of this?
@ahmedkalair9862 10 months ago
@pritishmair9577 did you find it?
@giteshpal405 3 months ago
Did you find any dataset yet?
@predoca46 8 months ago ⁺¹
31:06 I'm making a project for my school that looks like your project and works like yours. But I don't have sufficient knowledge to make it alone, so I'm watching your video to learn and complete it. Thanks for the video and sorry for my English haha. Hello from Brazil 🇧🇷 😂
@latestdramas6351 7 months ago
Hey
@vasuarora_ 4 months ago
@@latestdramas6351 school???
@Gabbosauro 3 years ago ⁺³
Hi Nicholas, I've been working on my thesis project about the quality of body movements and I encountered a problem with Keras.
I see that you feed the first layer a constant sequence of 30 frames (1 second of video/list of MediaPipe landmark objects).
In my case I have a variable number of frames (i.e. videos containing movements that last 2 seconds (60 frames), 2.5 seconds (75 frames), 3 seconds (90 frames), etc., hence with different numbers of frames). How can I solve this?
I looked around and people say that I can apply so-called "padding and masking", which takes the largest number of frames (longest video), adds a special value to the others (padding), and then somehow ignores/filters the special value later (masking). But this can't be applied to my case because I would like to have the freedom of a variable number of frames during prediction.
I hope you understand what I want to ask, otherwise let me know and I will try to clarify it as much as I can. Thank you!
@NicholasRenotte 3 years ago ⁺¹
AFAIK it's the only way to do it, unless you look at something like a sequence to sequence model (I think, don't quote me on that though lol). Padding would be the easiest approach. Set a fixed max length and fill out the frames without detections with a NumPy array of zeros!
@Gabbosauro 3 years ago
@@NicholasRenotte Thank you for the reply!
Will the padding influence the classification much? I mean, if video1 with movementA lasts 3 seconds and video2 with movementA lasts 1 second + 2 seconds of zeros, would that cause problems during prediction or do you think it will work well?
@NicholasRenotte 3 years ago
@@Gabbosauro I would prototype and see the impact first. Kinda hard to say without seeing benchmark results.
@Gabbosauro 3 years ago
@@NicholasRenotte Alright, I'll test it out. Thanks!
@Gabbosauro 3 years ago
@@NicholasRenotte What I did, and it seems to at least start training, is set input_shape=(None, number_of_features), so time_steps is None instead of 30, and during model.fit() I pass it a batch_generator (based on this reference: datascience.stackexchange.com/questions/48796/how-to-feed-lstm-with-different-input-array-sizes).
But sadly the accuracy chart doesn't look good, sometimes it is around 40-50%, sometimes it drops to 20%.
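Editor's sketch of the padding approach suggested in this thread: bring every sequence to a fixed number of frames by appending all-zero frames, and keep the true lengths in case they are needed later. It uses plain Python lists so it runs without extra packages. `pad_sequences` here is a hypothetical local helper, and truncating clips longer than `max_len` is an added assumption.

```python
def pad_sequences(sequences, max_len, num_features, pad_value=0.0):
    """Pad (or truncate) each sequence of frames to exactly max_len frames."""
    padded, lengths = [], []
    for frames in sequences:
        kept = [list(frame) for frame in frames[:max_len]]
        lengths.append(len(kept))  # number of real (non-padding) frames
        filler = [[pad_value] * num_features
                  for _ in range(max_len - len(kept))]
        padded.append(kept + filler)
    return padded, lengths


clips = [
    [[0.1, 0.2]],                          # a 1-frame clip
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],  # a 3-frame clip
]
padded, lengths = pad_sequences(clips, max_len=2, num_features=2)
print(padded, lengths)
```

On the model side, a Keras `Masking(mask_value=0.0)` layer placed before the first LSTM is the usual way to have the all-zero timesteps skipped. Whether that helps with the accuracy drop reported above is not something this thread settles.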
@depallyyadaiahgoud750 1 year ago
That's a way cooler one and your explanation was a ton easier 😉 Thanks Nick
@theethatanuraksoontorn1369 3 years ago ⁺⁴
Hey Nicholas, I am working on a similar project. When I test the model using your metric, it does not reflect the same accuracy as the real-time test. I trained the model to 80-90% accuracy but the real-time test barely captures any sign language. Do you have any thoughts?
@giteshpal405 2 months ago
Did you find any solution?
@akshatraj5952 1 year ago
The videos that you make are wonderful. Thank you for the practical and clear points in the tutorials.
@usamaejaz5264 1 year ago
The MP_Data folder is missing, so where do we get it from?
@girisathvikavpragatiengine309 1 year ago ⁺⁵
Hey Nicholas, TensorFlow version 2.4.1 is showing an error. It says "Could not find a version that satisfies the requirement tensorflow==2.4.1". Please help me out.
@alissiazaidi2631 11 months ago
Hey, did you find the solution? Actually, I have the same error...
@pareshgupta3288 9 months ago ⁺¹
@@alissiazaidi2631 just change the version
If it's Windows, use:
pip install tensorflow==2.10.0
If Linux:
pip install tensorflow==2.16.0
@pavansai2838 4 months ago
Heyy, did you find the solution for it?
@siva7702 4 months ago
Downgrade your Python version; TensorFlow 2.4.1 supports only Python 3.6-3.8.
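Editor's sketch for the version advice in this thread: a quick check of whether the running interpreter falls in the Python 3.6-3.8 range quoted above for TensorFlow 2.4.1. `tf_241_supported` is a hypothetical helper name, and the range is taken from the comment rather than checked against pip.

```python
import sys


def tf_241_supported(version_info=None):
    """True if the interpreter is in the 3.6-3.8 range mentioned above."""
    major, minor = (version_info or sys.version_info)[:2]
    return major == 3 and 6 <= minor <= 8


print("Python", sys.version.split()[0], "->", tf_241_supported())
```

If this prints False, either create an environment with an older Python or install a TensorFlow release that supports the current interpreter, as the earlier reply suggests.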
@rusticagenerica 2 years ago
Exceptional tutorial. Thank you from the bottom of my heart.
@rowlandgoddy-worlu3382 2 years ago ⁺⁵
This is an amazing video! I have learned a lot following your tutorials. One question: what if you are trying to capture actions that are not of equal duration? E.g. if a sign like "Good Morning" lasts for 5 seconds and another sign like "Welcome" lasts for 9 seconds. How can this be treated?
@032lovishkumar8 9 months ago
Hey, I am getting the error IndexError: list index out of range while running the code at 2:00:10. How can I resolve it?
@phoque6 3 years ago ⁺¹
Thank you for a detailed and wonderful MediaPipe tutorial :)
@NicholasRenotte 3 years ago ⁺¹
So glad you liked it!
@y.yuvraj 2 years ago ⁺⁵
Hi Nicholas, this is really an amazing tutorial, I really appreciate it. But I am getting an error when fitting the model: a 'ValueError' which says "Failed to convert a NumPy array to Tensor". I tried many things but it is not going away, so please give me a hand with this.
@another.nikhil 2 years ago
Check the datatype of the inputs to your model. The Keras API only accepts NumPy arrays.
@fatiha2413 3 years ago ⁺¹
Hi, Nicholas! I learned a lot from this video! Thank you very much!
@amessit10 3 years ago
Hi Fatima, can we implement this project for 26 letters? I am getting the error "list index out of range" when trying to do more than 3 actions.
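Editor's sketch for the "list index out of range" reports with more than 3 actions. One possible cause, which nothing in these comments confirms, is a per-class list (for example, the bar colors used when drawing class probabilities) that still has only three entries, so indexing it with a fourth class fails. The sketch reuses the palette instead of indexing past its end. `BASE_COLORS` and `colors_for` are illustrative names and values.

```python
# Illustrative three-entry palette (BGR tuples); an assumption, not the
# confirmed contents of the tutorial notebook.
BASE_COLORS = [(245, 117, 16), (117, 245, 16), (16, 117, 245)]


def colors_for(actions, base_colors=BASE_COLORS):
    """Return one color per action, reusing the palette when it runs out."""
    return [base_colors[i % len(base_colors)] for i in range(len(actions))]


actions = ["a", "b", "c", "d", "e"]
colors = colors_for(actions)
print(len(colors))  # one color per action
```

If the error persists, the traceback's last line shows which list is being indexed. Any list sized for three classes, such as the labels, colors, or the model's output layer, needs to grow with the number of actions.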
@theethatanuraksoontorn1369 3 years ago ⁺⁶
Hi Nicholas, I've been working on a similar project. I believe this tutorial was kept simple on purpose, so I would like to add my thoughts.
When adding more actions, the real-time predictions get mixed up a lot due to frame overlap and wrong slicing of the frames.
I would suggest showing some visualization of the start and end of the prediction window, so the user can follow from the start frame to the end frame.
This way it is similar to how the data was collected, which gives higher prediction accuracy.
@giteshpal405 3 months ago
Have you added more actions to your project?
@T-She-Go 3 years ago ⁺²
Thank you so much Nicholas 😌 This will help me with my project 🙌🏾
@piresflp 3 years ago ⁺⁷
Hi Nicholas, thanks for the awesome tutorial! I've got 3 questions about the project, hope you don't mind helping me:
1. When training my model, I got 90%+ accuracy very quickly (150 epochs, more or less), but all of a sudden it dropped to 30% and more or less stayed there for the rest of the run. How can I fix it?
2. If I want to add more signs after first training my model, do I have to retrain it? Or can I train just those specific signs separately? How do I do it?
3. Once the model is working just fine, is it possible to attach the real-time script to an Android app?
@howcircle5530 3 years ago
I also want to know about your 3rd question. 🤓
@NicholasRenotte 3 years ago ⁺¹
1. So accuracy never went back up? Try adding more data for each class depending on what's performing well/not well.
2. You can apply transfer learning: drop the final layer, add a new layer which has the same number of classes as your new signs, then retrain.
3. Yes, I haven't shown it here as it's probably a whole other video though!
@tigre1217 2 years ago ⁺²
@@NicholasRenotte Hi Nick, can you elaborate more on the 2nd point? I was quite confused since it is my first deep learning project. Thanks!
@adriamasitoribio 1 year ago
@@tigre1217 hey! Did you figure it out?
@giteshpal405 3 months ago
Have you done it?
@soumendas2336 3 years ago ⁺¹
Thank you Nicholas, I have learned a lot of things from this video that I had been looking for over the past few months.
@NicholasRenotte 3 years ago
Yesss! So glad!
@mervesisci4983 3 years ago ⁺⁵
Hi Nicholas, thank you for this amazing tutorial. If we use padding in this case (videos containing movements with different numbers of frames), how can we make predictions in real time? In the tutorial you set a fixed length (30 frames) (sequence=sequence[:30] if len(sequence)==30), but in my case there are different frame counts for each activity in real-time prediction.
@abhisekpanigrahi1033 2 years ago
Hello Nicholas, I also have this question. Can you please answer: what if the dimensions are different each time we run?
@nnamakah 2 years ago ⁺¹
Hi Nicholas, thanks for this project, it is incredible. How would you handle video files with varying numbers of frames? How can I possibly approach the situation?
@matteosacco00 8 months ago
Same question, anyone with suggestions?
@masterank2005 10 months ago ⁺²
Can I use the method shown in this video, with a little alteration, to do static gesture recognition? Like single-frame hand gesture recognition?
I know there is a video posted about that, but I didn't really like the labelling method shown in that video of manually labelling each image.
@yashas_hm 2 years ago ⁺³
Hi Nicholas, such an amazing video. It helped me a lot in building a project. I am working on a different project in which I trained the model with around 20 signs from ASL, but I am getting a categorical accuracy of only 0.05 on average in each epoch. Can you tell me where I went wrong or how to improve it?
@martinposso2098 2 years ago ⁺¹
Hey, how did you manage to fix that problem?
@muktabhushan7068 2 years ago ⁺²
Hey Nick, at 4:12 in your video you got an error. How did you resolve it? I am getting the same error.
@amitdutta3875 3 years ago ⁺³
You are great.
@NicholasRenotte 3 years ago
Thanks so much @Amit!
@mehmety5012 2 years ago
Great tutorial Nicholas. Thank you so much!
@angelortiz3564 3 years ago ⁺⁴
This is so awesome! You can theoretically do the same for the static letters in the ASL alphabet, right? Just make a dataset that contains each hand sign. The model would be trained on the keypoints of each hand sign.
Although I am not sure the keypoints would be accurate for some hand sign letters. What do you think?
@anshumanchoudhary4732 1 year ago
That model would be far easier to achieve
@eswar7781 11 months ago
@@anshumanchoudhary4732 which model?
@gustavojuantorena 3 years ago ⁺¹
Wow! This is great Nick! 👏👏👏
@NicholasRenotte 3 years ago
Thanks a bunch Gustavo!!
@mahmudanajnin9367 3 years ago ⁺³
Hey Nick! This project is amazing! Thank you for these awesome tutorials. You did sign language detection with TensorFlow Object Detection, which detects a sign from a single frame, but here we're using multiple frames to detect it. So I was wondering, how is this one better than TensorFlow Object Detection?
@NicholasRenotte 3 years ago ⁺¹
Just depends on the use case, the OD model does it on a single frame, this does it for multiple frames (this one is better for signs with multiple phases)
@barithiachudhan3034 3 years ago ⁺¹
Hi Nicholas, it was such a wonderful implementation, and thanks for sharing it with us.
@NicholasRenotte 3 years ago
So glad you enjoyed it!!
@estebanpozo8702 3 years ago ⁺⁵
Hi Nicholas, thanks again for this great tutorial! I am writing because I would like to learn more about how you chose your architecture. As you mention, almost all the state-of-the-art papers use a combination of CNN and LSTM. So, I have two questions:
1. Would it be possible to get a more detailed explanation of how you built this model?
2. So far, I have reviewed "LSTM: A Search Space Odyssey" by Greff (+ other papers) and the "Neural Network Design" handbook by Hagan. Could you recommend any documents regarding LSTM architectures?
@NicholasRenotte 3 years ago ⁺²
This is how I normally build stuff:
1. Find a research paper that has implemented a similar model
2. Try building the code for that model
3. Fine tune and iterate (a lot) to get solid performance
I wish I had a standard process but it is hyper iterative.
@estebanpozo8702 3 years ago
@@NicholasRenotte thanks! :)
@ambikarauta3966 4 days ago
After running the code my camera isn't opening up... what should I do? 11:06
@akshith.vbharadwaj2269 2 years ago ⁺³
Greetings
Hey man, this is an awesome tutorial and I completely love the way you have explained the process step by step.
I tried it on my own and ran into some problems. It would be a great help if you could help me out with them.
I followed the same method that you prescribed in the video, and these are the problems which came up.
Even after getting an overall categorical accuracy of 95% and above on the training dataset, when I do the gesture recognition it does not recognise one gesture.
And sometimes it shows the same gesture even though I am showing a different gesture.
Sometimes it even detects 2 gestures even though I am not making any gestures.
I always retrain on the same data to get a higher accuracy before going to the gesture recognition part.
I have also added a layer to the LSTM model but the results are the same.
Would greatly appreciate the help.
@NicholasRenotte 2 years ago ⁺¹
Start with the data, I would add more data of the underperforming classes then retrain. Remember bad data in will lead to bad outputs and vice versa, try adding 20-30 more samples for each underperforming class and give it a go!
@khaledchikh90 1 year ago ⁺²
What if the length varies? E.g. each video contains a different number of frames (not all equal to 30; sometimes a video contains 10 frames, another video would have 50 frames).
@knd3846 2 years ago ⁺¹⁸
Hi.. first of all, thanks for the free code for this brilliant work. Second, I am a beginner in Python, yet I have come quite far in running your code. At step 11 I am facing an error that keeps appearing, and I am exhausted right now because I have spent my whole day trying to find a solution for it. It keeps showing TypeError: only size-1 arrays can be converted to Python scalars after running the plt.imshow line..... please, I need help...
@xboxgaming4307 2 years ago ⁺¹
Facing the same issue.. even though I followed all of the same steps... seriously, I need help too
@safamunir1510 2 years ago
I'm having the same issue in the code... please help us remove this error
@harryfeng4199 2 years ago ⁺¹
Did you manage to figure it out?
@knd3846 2 years ago
@@harryfeng4199 nope.. I have tried so many different things but it's all in vain.. I am at my last step though..
@sowmyacheguri21 2 years ago
Hey! Did you figure it out?
@jwknight 3 years ago ⁺¹
I really hope you continue this project.
@NicholasRenotte 3 years ago
I don't think I'm ever going to give this one up until I truly nail it. I feel like we're maybe 50ish percent of the way there. Still a TON of work to do.
@jwknight 3 years ago ⁺¹
@@NicholasRenotte I know it requires a lot of data and work. Also, a project like this that helps people is always a great thing to be working on. I'm glad to see you sticking to it. I really wish SignAll would just release their product instead of making it about money. I have heard their database has 300,000+ labelled sign language hand symbol videos. I guess businesses and schools can request the software. But I just know they won't let just anyone touch it otherwise. That really depresses me. I have a cousin that I can never understand when he comes over, yet he understands me due to his hearing aid implant. It just sucks... and I think the world needs a solution that's not locked away.
@jwknight 3 years ago ⁺¹
@@NicholasRenotte Try requesting data from How2Sign's GitHub, 16,000 vocabulary words (srvk/how2-dataset). Just be sure to read their licensing terms before requesting it though, if you do. Sorry I don't know many good resources, I just want to see the project flourish.
@WJ-zq3xo 3 years ago ⁺⁴
Great tutorial as usual, Nick! Learning a lot from you :D
Did anyone try to use a set of videos instead of recording their own videos? If yes, what did you change in the code base?
Kudos
@shrirampareek 3 years ago ⁺¹
Hey! I used a set of videos (26) and was able to get 92% on the test dataset. However, when I tried doing the same gestures using a webcam, I got the same 4 classes all the time.
@amessit10 3 years ago ⁺¹
@@shrirampareek can we implement this project for 26 letters? I am getting the error "list index out of range" when trying to do more than 3 actions.
@neerajpatil7850 2 years ago
@@amessit10 Same for me! Have you figured out why the error occurs?
@amessit10 2 years ago
@@neerajpatil7850 No man, I closed this project because I only need hand gestures, not full body keypoints.
@amessit10 2 years ago
The hands occlude each other, so recognition fails.
@shrutipatchigolla9017 3 ปีที่แล้ว ⁺¹
Even though I have followed the code exactly I am not getting the pop up once I run the code till 12:26
I run the cell and my webcam light switches on for a second but there is no pop up.
I am using Ubuntu if that matters.
@NicholasRenotte 3 ปีที่แล้ว
Try running the cell again it might take a few false goes!
@shrutipatchigolla9017 3 ปีที่แล้ว
@@NicholasRenotte I have been trying for the past 2 days now,I'm not sure what's wrong
@bhavyasachdeva10 2 ปีที่แล้ว ⁺¹
Hey! I am doing this in my Vscode and please help me how to train my model? It is not creating a training popup at this point in my system 1:06:00 - 5. Collect Keypoint Sequences
@T-She-Go 3 ปีที่แล้ว ⁺¹⁰
Update: I managed to get an accuracy of 98% by changing the activation functions of the LSTM and Dense layers. 😌 Hope that this helps y'all who might be stuck on this
Hi Nic 😌 me again 😅
So I'm trying to use a new data set of gestures and I can't seem to get an accuracy >20%. I have tried to change the learning rate, the optimiser, etc, but none of these work 🙈 Is there something that I am missing?
Thank you in advance 🌸
@NicholasRenotte 3 ปีที่แล้ว ⁺¹
How many gestures and how many classes? For really similar classes I'd suggest adding way more data in order to produce a more accurate model. Also, what activations did you change, curious?
@T-She-Go 3 ปีที่แล้ว
@@NicholasRenotte I used 5 gestures, 2 were based on hand movements and 3 were based on head movements. I think I should've added more data because the prediction model could not tell the difference between all the head gestures x_x
Also, I changed the ReLU activations to sigmoid
@mahmudanajnin9367 3 ปีที่แล้ว
thank you so much..using sigmoid function really worked for me!
@T-She-Go 3 ปีที่แล้ว ⁺¹
@@mahmudanajnin9367 Yaaay :) I'm glad
@mahmudanajnin9367 3 ปีที่แล้ว
@@T-She-Go can you tell me how to find out how many labels the confusion matrix is for? I have 5 classes in my project and yhat = [1, 0, 1, 1, 2, 0, 1, 0, 4, 3]. My confusion matrix gives 5 sets of arrays. I'm really confused. Is it related to the yhat value?
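On the question above: if the matrix comes from a one-vs-rest routine such as scikit-learn's `multilabel_confusion_matrix` (an assumption, since the thread does not say which function was used), the output holds one 2x2 matrix per class laid out as `[[TN, FP], [FN, TP]]`, so 5 classes give 5 arrays. A pure-Python sketch of the same bookkeeping; `ytrue` below is a hypothetical ground truth invented for illustration, only `yhat` comes from the comment:

```python
def one_vs_rest_confusion(y_true, y_pred, num_classes):
    """Build one [[TN, FP], [FN, TP]] matrix per class."""
    matrices = []
    for c in range(num_classes):
        tn = fp = fn = tp = 0
        for t, p in zip(y_true, y_pred):
            if t == c and p == c:
                tp += 1  # class c present and predicted
            elif t != c and p == c:
                fp += 1  # class c predicted but not present
            elif t == c and p != c:
                fn += 1  # class c present but missed
            else:
                tn += 1  # class c neither present nor predicted
        matrices.append([[tn, fp], [fn, tp]])
    return matrices


yhat = [1, 0, 1, 1, 2, 0, 1, 0, 4, 3]   # predictions quoted in the comment
ytrue = [1, 0, 1, 2, 2, 0, 1, 0, 4, 3]  # hypothetical labels, for illustration only

matrices = one_vs_rest_confusion(ytrue, yhat, num_classes=5)
for class_index, matrix in enumerate(matrices):
    print(class_index, matrix)
```

The number of matrices follows the number of classes, not the length of `yhat`.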
@redabenlekehal7271 3 ปีที่แล้ว ⁺¹
Brilliant as expected
@idkidk1774 3 ปีที่แล้ว ⁺¹⁰
finally it worked
@idkidk1774 3 ปีที่แล้ว ⁺¹
Sir how to increase accuracy
@mrmoody915 2 ปีที่แล้ว
@@idkidk1774 create a for loop that trains the model on each pass and then checks its accuracy. If it is higher than the previous highest accuracy, the model is saved and the new highest accuracy is recorded
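The loop described above can be sketched roughly as follows. `train_and_evaluate` is a hypothetical stand-in that returns a placeholder model and a random accuracy; in a real notebook it would fit the network and score it on held-out data:

```python
import random


def train_and_evaluate(seed):
    """Hypothetical stand-in: 'train' a model and return (model, accuracy)."""
    rng = random.Random(seed)
    accuracy = rng.uniform(0.5, 1.0)  # placeholder for a measured accuracy
    model = {"seed": seed}            # placeholder for a trained model object
    return model, accuracy


def keep_best(num_runs):
    """Train num_runs times and keep the model with the highest accuracy."""
    best_model, best_accuracy = None, float("-inf")
    for run in range(num_runs):
        model, accuracy = train_and_evaluate(seed=run)
        if accuracy > best_accuracy:
            # New best: this is the point where the model would be saved to disk.
            best_model, best_accuracy = model, accuracy
    return best_model, best_accuracy


best_model, best_accuracy = keep_best(num_runs=5)
print(best_model, round(best_accuracy, 3))
```

If the accuracy used for selection is measured on the test set, the final test score becomes optimistic, so a separate validation split is the safer thing to compare on.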
@mrmoody915 2 ปีที่แล้ว ⁺¹
@@idkidk1774 also just increase data sets
@aqsaqamar1634 2 ปีที่แล้ว
@@mrmoody915 can you please solve my error
@mrmoody915 2 ปีที่แล้ว
@@aqsaqamar1634 which is
@udoysaha3086 9 หลายเดือนก่อน
Helped a lot.. Everything explained really well.. Thank you so much!
@jeanpierrebravomendoza6470 10 หลายเดือนก่อน ⁺¹¹
I'm deaf help
@satyaranjansahoo8431 9 หลายเดือนก่อน ⁺³
Use caption
@namanmishra9072 2 ปีที่แล้ว ⁺¹
I am facing a problem at 1:32:11 ( X = np.array(sequences) ).
It's raising a memory error.
Is there a way to decrease memory consumption, or an alternative?
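Before switching machines, a rough size estimate can show whether the array itself is the problem. The sketch below assumes 30 frames per sequence and 1,662 values per frame (33 pose landmarks x 4 values, 468 face landmarks x 3, and 2 x 21 hand landmarks x 3); these figures and the sequence count are assumptions, not numbers stated in this thread:

```python
def array_bytes(num_sequences, frames_per_sequence, values_per_frame, bytes_per_value):
    """Bytes needed for a dense array of that shape and element size."""
    return num_sequences * frames_per_sequence * values_per_frame * bytes_per_value


# Assumed layout of one frame of keypoints.
VALUES_PER_FRAME = 33 * 4 + 468 * 3 + 2 * 21 * 3

num_sequences = 3 * 30  # hypothetical: 3 actions x 30 recorded sequences each

size_float64 = array_bytes(num_sequences, 30, VALUES_PER_FRAME, 8)  # NumPy's default float
size_float32 = array_bytes(num_sequences, 30, VALUES_PER_FRAME, 4)

print(f"float64: {size_float64 / 1e6:.1f} MB, float32: {size_float32 / 1e6:.1f} MB")
```

If the estimate is far below the available RAM, the error probably comes from somewhere else, for example many more sequences than expected. Otherwise `np.array(sequences, dtype=np.float32)` halves the footprint relative to the float64 default.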
@NicholasRenotte 2 ปีที่แล้ว
Try running on a diff machine?
@MuhammadKamran-ow5vp ปีที่แล้ว
It was really a great tutorial on real time sign language detection.
@lesterhsu 2 หลายเดือนก่อน
Love this video. I can't believe that i just completed it.
@andy_rocky 2 หลายเดือนก่อน
hey can you please help me with this
@lesterhsu 2 หลายเดือนก่อน
@@andy_rocky Sure. What's the problem you're having? The dependencies I use (installed via conda create): python 3.12.5, tensorflow 2.18.0, mediapipe 0.10.14. Going by the code in the video and the popular comments, you can get exactly the same effect as in the video (the downloaded code may differ slightly from the one presented in the video).
@anirudhxmishra หลายเดือนก่อน ⁺¹
@@lesterhsu I'm having a problem with my dependencies, and I'm unable to run this on Jupyter Notebook.
What's your GitHub?
@shantanukumar6074 ปีที่แล้ว ⁺²
Can you please help !!
While running the very first line i.e.
Importing the modules.... I'm getting error which says cannot find a version that satisfies the requirement tensorflow==2.4.1.
What should I do now? Plzz help
@Sutirtha 3 ปีที่แล้ว ⁺²
Thank you so much for the video,
The x, y, z values change based on the position of the person and the camera. How can we transform the key points so that, even if we move, the relative body coordinates stay the same with respect to the camera?
@NicholasRenotte 3 ปีที่แล้ว
I'm not sure I follow, the keypoints will always be different as they're tracking the person (if the person moves, so do the keypoints). Got a use case for me?
@hamednasr3078 2 ปีที่แล้ว ⁺²
I wish you recorded all your videos with zoom and font size of 22:30, it is really great 🙂
@NicholasRenotte 2 ปีที่แล้ว ⁺¹
Yeah I've gotta work out how to do it, I just can't code with that amount of zoom though @Hamed. Will see what I can do!
@MrNewtonJable 3 ปีที่แล้ว ⁺¹
@ 12:31 has anyone else had trouble with not opening the Window displaying the camera?
i have no errors but it just does not open.
@ronakdubey581 ปีที่แล้ว ⁺¹
Thanks a lot for this man, the code seems to be working fine with little changes. I have even added a speech function which speaks out the predictions, works pretty well
@VanderlanAlves7 ปีที่แล้ว
how did you do that? I want to do it too but I am a beginner
@unnathi8796 10 หลายเดือนก่อน
@@VanderlanAlves7 did you do it? how to do it?
@TheDreamsandTears 7 หลายเดือนก่อน
How did you do it?
@TheDreamsandTears 7 หลายเดือนก่อน
Can you share?
@LucasEloi 3 ปีที่แล้ว ⁺²
Nice work, thank you for the wonderful video!
@NicholasRenotte 3 ปีที่แล้ว
Cheers @Lucas!
@study_with_thor 3 ปีที่แล้ว ⁺²
On Windows, I can't close the window once the webcam opens. Is there any way to fix this?
@NicholasRenotte 3 ปีที่แล้ว
Hit Q on the keyboard?
@study_with_thor 3 ปีที่แล้ว
@@NicholasRenotte Thanks my bro, that's amazing. By the way, is it possible if I use Pycharm or Visual studio code to do this instead of jupyter?
@NicholasRenotte 3 ปีที่แล้ว
@@study_with_thor sure can, just need to replace the references to command line installs (e.g. cells beginning with !) and do them at the prompt instead
@tde13 2 ปีที่แล้ว ⁺¹
hello, I have got a few questions:
1. is it required to rerun the whole thing every time I open jupyter notebook, right from step 1?...what do I do for not doing the same
2. how do I increase the accuracy of the model?
3. for including more actions do we need to start all over again from step?
@ashurroganathan8632 3 ปีที่แล้ว ⁺¹
Always Great Videos :). I have learned many Things from you. Thx
@NicholasRenotte 3 ปีที่แล้ว
Thanks so much @Ashur! So glad you're enjoying it!
@sahanahiremath8945 2 ปีที่แล้ว ⁺¹
This helped me sooo muchhhh! Thanks.
@dantealonso7174 3 ปีที่แล้ว ⁺²
Thanks a lot for this content, I've been learning a lot, you are a god :)
@NicholasRenotte 3 ปีที่แล้ว ⁺¹
Keep on learning my guy! Love that you're smashing them!
@meetvardoriya2550 3 ปีที่แล้ว ⁺²
Another biggeeeeeee on the heap!,amazing sir❤️🙏
@NicholasRenotte 3 ปีที่แล้ว
YESSS! The big videos are quickly becoming my fav to make, lmk what you think @Meet Vardoriya!
@matts2581 ปีที่แล้ว
Excellent instruction! TY very much for sharing! :)
@ZohaJaved795 8 หลายเดือนก่อน
I hope I get reply soon
I'm running this model now, at 1:48:10. If the categorical accuracy is not improving and the loss is higher than it's supposed to be, what should I do to improve my model?
@bumenmangu4019 8 หลายเดือนก่อน
can u tell me which specification dependencies you are using
@alinaandreeva8554 10 หลายเดือนก่อน
Thank you for the video! I have a question though: I have trained the model and I am trying to run detection on prerecorded videos. Each video contains multiple actions (performed sequentially). However, the model only gives one detection (one action) per video. Is it possible to solve this issue?
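One way to approach the question above is to slide a fixed-size window over the video's frames and classify each window separately, then merge consecutive identical predictions. This is a sketch under stated assumptions: `predict` stands in for the trained model, and the toy frames below are just label strings so the example runs on its own:

```python
from collections import Counter


def sliding_window_predictions(frames, window_size, stride, predict):
    """Classify each window of frames; return (start_index, label) pairs."""
    results = []
    for start in range(0, len(frames) - window_size + 1, stride):
        window = frames[start:start + window_size]
        results.append((start, predict(window)))
    return results


def collapse_repeats(labels):
    """Merge runs of identical consecutive labels into a single entry."""
    collapsed = []
    for label in labels:
        if not collapsed or collapsed[-1] != label:
            collapsed.append(label)
    return collapsed


def toy_predict(window):
    """Hypothetical classifier: returns the most common label in the window."""
    return Counter(window).most_common(1)[0][0]


# Toy "video": 40 frames of one action followed by 40 frames of another.
frames = ["hello"] * 40 + ["thanks"] * 40
predictions = sliding_window_predictions(frames, window_size=30, stride=10, predict=toy_predict)
detected = collapse_repeats([label for _, label in predictions])
print(detected)
```

With a real model, `predict` would take the window's keypoint arrays and return the highest-probability action, ideally with a confidence threshold so that transition windows are ignored.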
@sazidshaik4577 3 ปีที่แล้ว ⁺¹
Thanks For Considering My Comments And Did with LSTM Love You and Really Good
@NicholasRenotte 3 ปีที่แล้ว
Anytime, it was a long time coming but it's here!!
@catslave8199 4 หลายเดือนก่อน
Thank you so much, I have a question: what if there are gestures that take more than 30 frames to complete? How should the model layers be adjusted, and how should those frames be processed for predictions? The sliding window can no longer be a fixed 30 frames per prediction
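One option for longer gestures, offered as a sketch and not as the video's method, is to keep the model's input length fixed and resample every recording to that length by picking evenly spaced frames. `resample_indices` and `resample_sequence` are hypothetical helper names:

```python
def resample_indices(num_frames, target_length):
    """Evenly spaced frame indices covering the first to the last frame."""
    if num_frames <= 0 or target_length <= 0:
        raise ValueError("num_frames and target_length must be positive")
    if target_length == 1:
        return [0]
    step = (num_frames - 1) / (target_length - 1)
    return [round(i * step) for i in range(target_length)]


def resample_sequence(frames, target_length=30):
    """Return target_length frames sampled evenly from a sequence of any length."""
    return [frames[i] for i in resample_indices(len(frames), target_length)]


long_gesture = list(range(59))   # stand-in for 59 frames of keypoints
short_gesture = list(range(12))  # shorter recordings get some frames repeated

print(resample_sequence(long_gesture))
print(resample_sequence(short_gesture))
```

The trade-off is that fast and slow versions of a gesture end up looking alike. The alternative is to raise the sequence length the model is built with and pad shorter recordings up to it.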
@OsazeOgedegbe ปีที่แล้ว ⁺²
Hello Nicholas, I really enjoyed this tutorial. I wanted to ask if there is a way to normalize the x, y and z coordinates so they are not dependent on their position in the frame.
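A common way to do this, sketched here under assumptions and not taken from the video, is to express every keypoint relative to a reference landmark and divide by a body-based length such as the distance between the shoulders. The landmark indices below are hypothetical placeholders:

```python
import math


def normalize_keypoints(points, origin_index, scale_a, scale_b):
    """Translate points so the origin landmark is at (0, 0, 0) and divide by
    the distance between two reference landmarks."""
    ox, oy, oz = points[origin_index]
    scale = math.dist(points[scale_a], points[scale_b])
    if scale == 0:
        raise ValueError("reference landmarks coincide; cannot scale")
    return [((x - ox) / scale, (y - oy) / scale, (z - oz) / scale)
            for x, y, z in points]


# Toy landmarks: index 0 = origin, 1 and 2 = the two "shoulders" (hypothetical).
points = [(0.5, 0.5, 0.0), (0.4, 0.6, 0.0), (0.6, 0.6, 0.0), (0.5, 0.9, 0.1)]
# Same pose, moved within the frame.
moved = [(x + 0.2, y - 0.1, z) for x, y, z in points]

a = normalize_keypoints(points, origin_index=0, scale_a=1, scale_b=2)
b = normalize_keypoints(moved, origin_index=0, scale_a=1, scale_b=2)
print(a)
print(b)
```

Both outputs match up to floating-point rounding, so the model sees the same features wherever the person stands. Dividing by the shoulder distance also reduces the effect of distance from the camera.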
@MilenaReimann 2 หลายเดือนก่อน ⁺¹
Where does the program run? Do you use anything other than your normal processor (e.g. a TPU)?
@AnhLe-hc8qm 3 ปีที่แล้ว ⁺¹
most useful video i've seen
@NicholasRenotte 3 ปีที่แล้ว
Oh thank you SOOOOO much! So glad you liked it!
@ericklasco 5 หลายเดือนก่อน
2024 and this is still useful, thank you Nicholas👍
@anirbansaha244 4 หลายเดือนก่อน
hey did you complete the project?
@ericklasco 4 หลายเดือนก่อน
@@anirbansaha244 yes
@whisplay 4 หลายเดือนก่อน
@@ericklasco can you please share which version dependence you used for importing
@siva7702 4 หลายเดือนก่อน
Please provide versions bro, like what is python version you used
@whisplay 4 หลายเดือนก่อน
@@siva7702
Python 3.12.5
and
import versions:
!pip install tensorflow opencv-python mediapipe scikit-learn matplotlib
!pip install --upgrade mediapipe
note: use jupyter notebook! with this code camera can't be accessed on google colab or kaggle like ide's, if you want to use colab you need to add an additional Javascript code.
@abhishripatil791 3 ปีที่แล้ว
Thank you for this this helped me so much with my project esp making the dataset
@lincoln169 3 ปีที่แล้ว ⁺¹
I love your videos Nicholas 🙂💙
@NicholasRenotte 3 ปีที่แล้ว
Thanks a bunch!
@entertain7 8 หลายเดือนก่อน ⁺²
Thanks for this amazing tutorial
I have a question, how do we create for the reverse ..... which means from text to sign language translator
@TheDreamsandTears 7 หลายเดือนก่อน
I want to know that too!!
@chamangupta4624 3 ปีที่แล้ว ⁺¹
Very good project, very well implemented,
