These from scratch videos & paper implementations take a lot of time for me to do, if you want to see me make more of these types of videos: please crush that like button and subscribe and I'll do it :)
Support the channel ❤️:
th-cam.com/channels/kzW5JSFwvKRjXABI-UTAkQ.htmljoin
Original paper: arxiv.org/abs/1905.11946
Paper review: th-cam.com/video/_OZsGQHB41s/w-d-xo.html
Code: bit.ly/2ORFxR9
Timestamps:
0:00 - Introduction
0:45 - Imports
1:00 - Architecture config
6:10 - Implementation Structure
7:10 - CNNBlock
10:10 - SqueezeExcitation
13:05 - InvertedResidualBlock (w. Stochastic depth)
23:44 - EfficientNet
36:22 - Running a small test case
37:55 - Ending
Masterpiece
Awesome channel, wish I found this sooner!
I've watched your videos for a while! I think they're super chill & a really nice way to get an overview of a paper
The kind words mean a lot coming from you :)
Thanks for your amazing work really ! It helps me a lot !
Great! Thanks a lot!
I was trying to implement EfficientNet myself, but got stuck on different problems three days in a row.
Then I had a look at your video and took some tips and tricks. Now it's working! Awesome!
Good video! I guess this video didn't include dataloading, but it would be great to see how that's done, as well as saving the model after training.
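Since data loading and saving were not covered in the video, here is a minimal sketch of how they could look. Everything in it is an assumption for illustration: the tiny `model` is only a stand-in for the EfficientNet class from the video, the data is random tensors instead of a real dataset, and `checkpoint.pth` is a made-up file name written to a temporary directory.

```python
import os
import tempfile

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for the EfficientNet model built in the video.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.SiLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)

# Random tensors in place of a real image dataset (assumption).
images = torch.randn(32, 3, 64, 64)
labels = torch.randint(0, 10, (32,))
loader = DataLoader(TensorDataset(images, labels), batch_size=8, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One pass over the data: forward, loss, backward, update.
model.train()
for x, y in loader:
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# Save only the weights (state_dict), then load them back.
checkpoint_path = os.path.join(tempfile.mkdtemp(), "checkpoint.pth")
torch.save(model.state_dict(), checkpoint_path)
model.load_state_dict(torch.load(checkpoint_path))
```

For real images you would swap `TensorDataset` for your own `Dataset` (or a torchvision one) and add a validation loop; the save/load part stays the same.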
It is a really, really great video for a deep learning beginner like me! Thanks for the detailed implementation in this video!
This is the best Pytorch Tutorial I've ever learned
At 16:43, shouldn't the kernel size of the expansion layer be 1x1 and not 3x3? Thanks for the video!
I think you're right on this part, could you point me to where you found this info?
@@AladdinPersson Hi, I think it is mentioned in the Table.1 of the original MobileNetv2 paper.
@@AladdinPersson at line 62, the kernel size of the expand_conv should be 1 instead of 3 because we're trying to increase the #channels using 1x1 Conv
github.com/lukemelas/EfficientNet-PyTorch/blob/7e8b0d312162f335785fb5dcfa1df29a75a1783a/efficientnet_pytorch/model.py
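To make the point in this sub-thread concrete, here is a small sketch comparing a 1x1 and a 3x3 expansion convolution. The channel count and expand ratio are illustrative values, not taken from the video's code, and the variable names are made up for this example.

```python
import torch
import torch.nn as nn

in_channels, expand_ratio = 16, 6  # illustrative values (assumption)
hidden_dim = in_channels * expand_ratio

# Pointwise (1x1) expansion: only mixes channels, spatial size is unchanged.
expand_1x1 = nn.Conv2d(in_channels, hidden_dim, kernel_size=1, bias=False)
# 3x3 expansion with padding: same output shape, but 9x the weights.
expand_3x3 = nn.Conv2d(in_channels, hidden_dim, kernel_size=3, padding=1, bias=False)

x = torch.randn(2, in_channels, 32, 32)
out = expand_1x1(x)

n1 = sum(p.numel() for p in expand_1x1.parameters())
n3 = sum(p.numel() for p in expand_3x3.parameters())
print(out.shape, n1, n3)
```

Both layers produce the same output shape, so the model runs either way; the difference only shows up in the parameter count and compute.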
I'm super impressed with the pace at which you produce these tutorials... How much time goes into creating a video like this? Do you get it right with the first try? Amazing content!
It's quite varied, this one in particular was quite smooth and took me less than a day to go through the paper and implement it. Other implementations I've done, like the Yolo from scratch tutorial took me over a week
The absolute best channel. Thanks so much!
That was hard, but I've understood most of it.
Thank you!
16:20 I wonder whether you made a mistake when using reduced_dim = int(in_channels/reduction) instead of int(hidden_dim/reduction).
I think it's a mistake as well
Great content br...so brief and clear explanations
I was waiting for this implementation!! Thanks, I would really like a video where you try to make a very good model for a computer vision Kaggle competition.
Any competitions in particular?
@@AladdinPersson to be honest there are object detection github models and we can just clone it and apply to the data; but as in kaggle we have to improve it.
So i was thinking how would you approach an object detection competition and use different methods to get a higher score.
@@AladdinPersson don't have any names in particular, but the ones which allow you to showcase a good amount of feature extraction and tricks in training. Also, a monthly one competition video or whatever that suits your routine in the long run would be a really informative playlist resource. Cheers.
@@AladdinPersson how about hubmap!.
Aladdin!! Awesome video! Very well explained, clear. Helped me understand network details that could not get from the paper video. BTW the paper video is also very helpful. :) Thank you! It would be amazing to have similar content on DenseNet
In other implementation, squeeze excitation is implemented with fully connected layers. Why are you using conv layers in your implementation?
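On the conv vs. fully connected question: after global average pooling the feature map is 1x1 spatially, and on a 1x1 map a 1x1 convolution computes the same thing as a linear layer. Here is a small sketch that checks this numerically; the sizes are arbitrary and the weight copy is only there to make the two layers comparable.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
channels, reduced = 8, 2  # arbitrary sizes for the demo

conv = nn.Conv2d(channels, reduced, kernel_size=1)
fc = nn.Linear(channels, reduced)

# Give the linear layer the same weights as the 1x1 conv.
with torch.no_grad():
    fc.weight.copy_(conv.weight.view(reduced, channels))
    fc.bias.copy_(conv.bias)

x = torch.randn(4, channels, 16, 16)
pooled = nn.AdaptiveAvgPool2d(1)(x)   # squeeze step: (4, 8, 1, 1)

out_conv = conv(pooled).flatten(1)    # (4, 2)
out_fc = fc(pooled.flatten(1))        # (4, 2)
print(torch.allclose(out_conv, out_fc, atol=1e-6))
```

So using conv layers here is mostly a convenience: the tensor keeps its (N, C, 1, 1) shape and can be multiplied back onto the feature map without reshaping.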
Great implementation, please try student-teacher architecture (training method ) basically it's knowledge distillation
Thank you good sir, this was immensely helpful
Man your videos are amazing
I wonder if I’ll ever get to this level of understanding in deep learning.
You just see the refined version when you're watching the video. There's a lot of dumb mistakes, things not working, me feeling like I'm never going to understand it behind every video. Stick with it whatever it is you're working on and you will improve 👊
Been watching videos from your channel for a while now, the concepts and implementations are very well explained!! Thanks for the amazing content! Is it possible to go through efficientdet as well (the efficientnet variant for object detection task) or DETR?
I think I want to do some more Kaggle a bit but will get back to those, which of those two do you think would be most interesting, EfficientDet or DETR? I think the idea of DETR is cool but the performance is not great
@@AladdinPersson yeah I agree maybe DETR would be more interesting I think integrating transformers into computer vision has become quite popular recently
Can you make a walk through+ model building on styleGan family as well. Good work again!
Thank you so much for all ur videos!
how about EfficientDet next?
I'll look into it ^^
Hi Aladdin,
Thank you! It is very helpful. I wanted to request if you could please do a similar video on Retina Net Algorithm? It does perform pretty well when compared to YOLO for vision systems, but there is not much out there on it. For a beginner, it will be very helpful! I appreciate the time and effort you put into these videos.
Bro when I was watching your awesome videos about Machine Learning I thought you were an invisible machine learning God, but now I finally see you have a face and are human :-D
Thanks for the good video. Can I ask something: in the base_model list, how do you know the stride? It was confusing since it is never explained in the paper. Or is there a calculation to determine the stride? Thanks
Good video! But why do you use conv in Squeeze and Excitation? As far as I know they only use convolutions instead of fully-connected as a NoSqueeze block, which performs worse than the SeBlock.
Love the Video! Thank you so much for your content
It is assigned in its parent class, nn.Module.
hello, cool video.
I have a question, how do I implement a model in EfficientNET for depth estimation.
I already have my dataset. I just need to train the model. How do I do it using this?
Thank You
Amazing video! THanks :)
I would actually love to see an implementation video of the recent Nerf Paper: HyperReel: High-Fidelity 6-DoF Video with Ray-Conditioned Sampling
Cheers and keep the good videos coming! :)👏
awesome! can you do efficientdet from scratch next?
It's a good suggestion, I added it to my list. For now I want to do some Kaggles but I will revisit this idea
While working on your videos, I try to take notes on paper, while trying to understand the code. I'm having a hard time, but it's still fun. I'm a CS student. Is it normal that I have such a hard time? Your from scratch videos, especially on the EfficientNet architecture, really challenge me. Do I have a deficiency or are these already difficult?
edit: I understand your videos and take notes of what you do. It's just that the process is difficult.
Great video!
@Aladdin Persson
Great video! I just have a small question. Can you please demonstrate how to calculate the actual model architecture for bigger models (B1, B2, etc.)? This bit is not clear (at least not to me) and the newer paper also uses similar techniques, so it would be great to know.
Other than that, awesome video! I always learn so much watching these videos.
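A rough sketch of the compound scaling idea, for anyone with the same question. The bases alpha = 1.2 (depth) and beta = 1.1 (width) are the values reported in the paper. The value of phi per variant and the rounding rule (round channels up to a multiple of 4) are assumptions for this example; as far as I know the released B1-B7 models use tabulated coefficients and their own rounding, so treat the numbers as illustrative only. `scale_stage` is a made-up helper name.

```python
from math import ceil

ALPHA, BETA = 1.2, 1.1  # depth and width bases reported in the paper


def scale_factors(phi):
    """Return (depth_factor, width_factor) for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi


def scale_stage(channels, repeats, phi):
    """Scale one stage of the B0 config.

    channels/repeats come from the baseline (B0) stage definition.
    Rounding channels up to a multiple of 4 is an illustrative choice.
    """
    depth_factor, width_factor = scale_factors(phi)
    out_channels = 4 * ceil(int(channels * width_factor) / 4)
    num_layers = ceil(repeats * depth_factor)
    return out_channels, num_layers


# A stage with 24 channels repeated 2 times, scaled with phi = 1:
print(scale_stage(24, 2, phi=1))
```

With phi = 0 the baseline config comes back unchanged (up to the rounding), and larger phi widens every stage and repeats its blocks more often.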
when I print out the parameters count it says 40 million parameters instead of 12 million like in the paper(b3 , 20 mill vs 7.8 mil for b0)
Hey Aladdin, great video. I just want to ask: I wrote your code and checked the summary of total param size, and I found (around) 14 million for b0. The paper is showing 5 million, am I missing something? Thank you, great work 👍
i also noticed it, it's a problem or what?
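If you want to track down where the extra parameters come from, a small counting helper makes it easy to compare individual layers. This is a generic sketch, not the video's code; `count_parameters` and `tiny` are made-up names. One possible contributor (unverified here) is the 3x3 expansion conv discussed in another comment, since a 3x3 kernel has 9 times the weights of a 1x1.

```python
import torch.nn as nn


def count_parameters(model):
    """Number of trainable parameters in a module."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


# Tiny example: 3x3 conv (3*32*9 weights) + batch norm (32 scales + 32 shifts).
tiny = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, bias=False),
    nn.BatchNorm2d(32),
)
print(count_parameters(tiny))

# Per-layer breakdown, useful for spotting an unexpectedly large layer.
for name, module in tiny.named_children():
    print(name, count_parameters(module))
```

Running the per-layer loop over `model.named_modules()` on the full network should show which blocks are larger than expected.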
@Aladdin Persson
Could you please do a video about implementing EfficientNetV2 from scratch in pytorch?
thank you for creating such a great content,
I am working on a 3D medical image classification,
how do i change 2D efficientnet to 3D efficientnet?
which parameters should i change?
thank you.
Instead of using conv2d, change it to conv3d; from my knowledge this is the easiest way to implement one.
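A minimal sketch of what that swap could look like for a basic conv block. `ConvBlock3d` is a hypothetical name, not the CNNBlock from the video, and the input shape is an assumed single-channel volume. The batch norm (and any pooling layers) need their 3D counterparts too.

```python
import torch
import torch.nn as nn


class ConvBlock3d(nn.Module):
    """Conv -> BatchNorm -> SiLU, using the 3D versions of each layer."""

    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1):
        super().__init__()
        self.conv = nn.Conv3d(
            in_channels,
            out_channels,
            kernel_size,
            stride,
            padding=kernel_size // 2,
            bias=False,
        )
        self.bn = nn.BatchNorm3d(out_channels)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))


# Assumed input: batch of 2 single-channel volumes, (N, C, D, H, W).
x = torch.randn(2, 1, 16, 32, 32)
out = ConvBlock3d(1, 8, stride=2)(x)
print(out.shape)
```

The squeeze-and-excitation pooling would likewise become `nn.AdaptiveAvgPool3d(1)`, and any mask used for stochastic depth needs one extra trailing dimension.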
Great video!
One question though: you said that a default 3x3 convolution is in fact a cube and that it takes 3 channels into account. But isn't it actually 3x3x(number of channels)? That's why the default value for groups is 1, which means that all channels are in that one group and are being taken into account.
You're correct, yeah my bad. Only the first convolution will be a cube (assuming RGB input)
@@AladdinPersson thanks for clarifying :)
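You can check this directly from the weight shapes, which are (out_channels, in_channels / groups, kH, kW). A quick sketch with arbitrary channel counts:

```python
import torch.nn as nn

# First layer on RGB input: each filter is 3x3x3, i.e. a cube.
first = nn.Conv2d(3, 32, kernel_size=3)
# Later layer: each filter spans all 64 input channels (groups=1 by default).
regular = nn.Conv2d(64, 128, kernel_size=3)
# Depthwise conv: groups == in_channels, so each filter sees one channel.
depthwise = nn.Conv2d(64, 64, kernel_size=3, groups=64)

print(tuple(first.weight.shape))
print(tuple(regular.weight.shape))
print(tuple(depthwise.weight.shape))
```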
Awesome 💫
can you please make a video on custom image classification using FixEfficientNet ,it would be a great help
thanks in advance
Hello, your videos have been really helpful for me. As per my request, you made the object detection playlist and bangggg!!!!! It was of great use to me.
Now here I am, stuck on another problem. I am learning TensorFlow and of course following your tutorials, but I am stuck on something. Can you please put up a video tutorial on how to do data augmentation on a tf.data dataset consisting of images and their respective masks?
Can you please mention the github link for this code.
I have a question
Why are you implementing papers using pytorch not tensorflow ?
Can you tell us the differences between both of them ?
Another thing, there is a lite version of pytorch like the tensorflow one ?
Thank you
I am more productive in Pytorch and things seem to make more sense. Personally I don't see a benefit to me using Tensorflow for now. I feel that we're seeing a trend of people moving away from Tensorflow, internally Google has started the JAX project which seems to diverge from Tensorflow. OpenAI recently said that they are going to only use Pytorch moving forward. There are things lacking with regards to deployment with Pytorch but I think they are working on it with torchserve etc, with regards to TensorFlow lite I don't think there's an equivalent in Pytorch
27:04 i thought i got rickrolled for a second 🤣
Please make a video on how to train images from this scratch code
alright alright alright
alright
@@AladdinPersson your channel and Yannic Kilcher are da best bro
@Bảo Anh Phạm Ngọc: Same favourite channel as me
Hey man, great work, but I found one minor bug: torch.randn() in stochastic_depth() sometimes returns numbers less than 0 or more than 1. This might be a problem, so try using torch.FloatTensor(x.shape[0], 1, 1, 1).uniform_(0, 1) instead, as it stays in the [0, 1] range. Hope this helps.
Are you sure I wrote torch.randn and not torch.rand? Could you give time stamp?
@@AladdinPersson sorry my bad😅😅, you wrote torch.rand, my dumbass thought it was randn
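For reference, a sketch of stochastic depth using torch.rand, which samples uniformly from [0, 1) (torch.randn samples from a standard normal and would not work as a probability). This is a standalone function written for this comment, not a copy of the method in the video; the `training` flag and default `survival_prob` are assumptions.

```python
import torch


def stochastic_depth(x, survival_prob=0.8, training=True):
    """Randomly drop whole samples of a residual branch during training."""
    if not training:
        return x
    # One uniform draw per sample, broadcast over channels and spatial dims.
    mask = torch.rand(x.shape[0], 1, 1, 1, device=x.device) < survival_prob
    # Rescale the kept samples so the expected value stays the same.
    return x / survival_prob * mask


x = torch.ones(64, 3, 4, 4)
out = stochastic_depth(x, survival_prob=0.8, training=True)
kept = int((out.flatten(1).sum(dim=1) > 0).sum())
print(f"{kept} of {x.shape[0]} samples kept")
```

Each sample in the batch is either zeroed out entirely or scaled by 1 / survival_prob; in eval mode the input passes through untouched.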
Great video. Can you implement code with EfficientNet as an encoder and UNet as a decoder for the following paper: "A robust framework for glaucoma detection using CLAHE and EfficientNet"?
module 'torch.nn' has no attribute 'SiLU'
I dont understand why I'm getting this error. can you please help me out
Update your torch version
@@AladdinPersson I updated to latest version but still it gives the same error but it worked for SELU. Can you suggest the version i should be using?
@@siddhantshete1744 I don't know what could be causing the error then.. The docs have it: pytorch.org/docs/stable/generated/torch.nn.SiLU.html
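If upgrading does not help, one workaround is to define the activation yourself, since SiLU is just x * sigmoid(x). A minimal sketch; `SiLUFallback` is a made-up name, and as far as I recall nn.SiLU was added around PyTorch 1.7, so it is worth printing `torch.__version__` in the same environment that raises the error to confirm which install is actually being used.

```python
import torch
import torch.nn as nn


class SiLUFallback(nn.Module):
    """SiLU / swish activation: x * sigmoid(x)."""

    def forward(self, x):
        return x * torch.sigmoid(x)


# Use the built-in if this torch version has it, otherwise the fallback.
SiLU = getattr(nn, "SiLU", SiLUFallback)

print(torch.__version__)
act = SiLU()
x = torch.tensor([-1.0, 0.0, 2.0])
print(act(x))
```

Then use `SiLU()` wherever the code has `nn.SiLU()`.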
Awesome!
Thank you 👊
Great work! Like crushed and subscribed! Can you also make a video for Bottleneck Transformers for Visual Recognition from Google?
Thank you :)
like button smashed
I hope you will see my comment. can you please implement EfficientDet model? I know maybe it's long but please try to implement it in a video because I want so much. And thanks for that great content.
NotImplementedError: what should i do
Nice video
Guess I am just dumb...
Awesome!!!