Hey, I came across this last week. Great tutorials. Thank you :)
Too good! So many concepts cleared.
Such elegance!
That was dope and really helped me a lot. I've just been blindly importing all those complex NNs all my life, and now it's my chance to build them myself....
I believe building things from scratch is the best thing to do if you really want to understand it. But building things from scratch also takes a lot of time, so it's not always possible
Thank you so much for making the video.
It works on my machine too
Amazing! Thanks a lot for your implementation!
Thanks! Just one suggestion: in the GoogleNet class, you created many maxpool instances. I guess it's sufficient to create only one maxpool instance, which would make the code even more compact. By the way, great tutorials!
Yeah, you're absolutely right.. not sure why I did it that way
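For illustration, a minimal sketch of that suggestion (the class and layer layout below are placeholders, not the video's exact code): since nn.MaxPool2d has no learnable parameters, a single instance can safely be reused at every downsampling point.

```python
import torch.nn as nn

# Minimal sketch of the suggestion above: nn.MaxPool2d has no learnable
# parameters, so one instance can be reused instead of maxpool1, maxpool2, ...
# The class and layer layout here are illustrative, not the video's exact code.
class GoogLeNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        # ... conv / Inception layers would be defined here ...

    def forward(self, x):
        # The same pooling module is called at every downsampling point.
        x = self.maxpool(x)
        # x = self.inception3a(x); x = self.inception3b(x); ...
        x = self.maxpool(x)
        return x
```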
Thank you for your comment, sir. I was just wondering about that.
Your channel is gold! Thanks!
I enjoy your explanation, thank you!
Thank you for this implementation :D
Kindly use some dataset to demonstrate its performance
Brilliant! Thanks.
Thanks!!
I am a Korean student studying deep learning.
I am very happy watching your YouTube videos!!!
They are very useful for me~
I subscribed and clicked the like button.
Have a happy new year, Aladdin~
Thank you... very helpful for understanding GoogLeNet...
Thanks a lot!
I have a question:
how do you implement the auxiliary classifier??
It is so difficult for me.
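For anyone wondering the same thing, here is a rough sketch of an auxiliary classifier head following the description in the GoogLeNet paper (average pool 5x5/3, 1x1 conv with 128 filters, FC-1024, 70% dropout, final linear layer). This is not code from the video; the 128 * 4 * 4 size assumes the paper's 224x224 inputs and the 14x14 feature maps where the auxiliary heads attach.

```python
import torch
import torch.nn as nn

# Rough sketch of a GoogLeNet auxiliary classifier head, following the paper's
# description (avg pool 5x5/3, 1x1 conv with 128 filters, FC-1024, dropout 0.7,
# final linear layer). Not the video's code; 128 * 4 * 4 assumes the 14x14
# feature maps at inception 4a / 4d with 224x224 inputs.
class AuxClassifierSketch(nn.Module):
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.pool = nn.AvgPool2d(kernel_size=5, stride=3)
        self.conv = nn.Conv2d(in_channels, 128, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)
        self.fc1 = nn.Linear(128 * 4 * 4, 1024)
        self.dropout = nn.Dropout(p=0.7)
        self.fc2 = nn.Linear(1024, num_classes)

    def forward(self, x):
        x = self.relu(self.conv(self.pool(x)))  # e.g. 512x14x14 -> 128x4x4
        x = torch.flatten(x, 1)
        x = self.dropout(self.relu(self.fc1(x)))
        return self.fc2(x)  # raw logits; in the paper this loss is added (weighted) to the main loss during training
```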
Thank you so much man
Hi Aladdin, thanks for sharing this awesome video! May I ask one question: what does the 4th column "depth" stand for, and where is this parameter reflected in your implementation?
That's a good question.. I'm not sure; looking at it, I'm kinda confused about it myself. I'll come back and give you a better answer if I figure it out
@@AladdinPersson Thank you a lot for the tutorial. I appreciate it a lot! I have the same question and I couldn't find the answer anywhere :(
The depth column refers to how many times each layer in the rows has been used. There are a total of 22 layers with learnable parameters (27 if we count max pooling layers); if you sum all the non-zero numbers in that column, you'll get 22.
How did you infer the value for padding when it's not given in the paper?
Great video, but you should have explained the auxiliary classifier part in the video itself.
Thanks
How are you calculating the padding values?
Let's say the image is of dimension N x N and the filters are of dimension K x K, and say that we pad the image with P zeros on each side. Then we have the following relation: output size = ((N + 2*P - K) / stride) + 1. Hope this helps...
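As a quick sanity check of that formula, here is a small helper. The specific numbers are just an example matching GoogLeNet's first 7x7/2 conv with padding 3 on a 224x224 input, not values taken from the video.

```python
# Worked example of the formula above (integer division, since PyTorch floors).
def conv_output_size(n, k, stride, padding):
    """Output size = ((N + 2*P - K) / stride) + 1."""
    return (n + 2 * padding - k) // stride + 1

# 224x224 input, 7x7 kernel, stride 2, padding 3 -> 112x112 feature map.
print(conv_output_size(224, k=7, stride=2, padding=3))  # 112
```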
It was a great video for me.
Thanks for your effort and time, keep going deeper :))
Hi, this was an insanely helpful video. Do you have any general hyperparameter tips for face verification / classification (learning rates, optimizer, etc.)?
I really appreciate the kind words. I don't have any experience with face verification, so I don't have any tips there, unfortunately
Hi, I'm getting a "NotImplementedError" on the print(model(x).shape) line; any help is appreciated. I'm trying to understand the GoogLeNet
Thanks! :)
Hi, I wanted to ask why there was no ReLU activation function in the forward pass. Is this something from the paper's design?
In the __init__ function, why don't you use nn.Sequential and pass all those layers?
Excellent work!!
I have learnt PyTorch from you!!
Thank you very much...
Hey! Why didn't you use a Softmax layer at the end?
It's included in CrossEntropyLoss
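A quick illustration of that point (the shapes here are just an example): nn.CrossEntropyLoss applies log-softmax internally, so the network should output raw logits.

```python
import torch
import torch.nn as nn

# nn.CrossEntropyLoss = log-softmax + NLL loss, so no Softmax layer is needed
# at the end of the network. Batch of 4 samples, 10 classes, as an example.
logits = torch.randn(4, 10)              # raw network outputs
targets = torch.tensor([1, 0, 3, 9])

loss = nn.CrossEntropyLoss()(logits, targets)
same = nn.NLLLoss()(torch.log_softmax(logits, dim=1), targets)
print(torch.allclose(loss, same))        # True
```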
Hi Aladdin,
I did not quite follow the Inception block. First, we create a branch1 layer that contains only 1 conv layer, and then we add branch2 and branch3 at the end, after branch1. Am I wrong?
That sounds right, we did branch1 which was a single conv block consisting of 1 conv layer (and batchnorm, relu). The Inception block creates these four branches and concatenates all of them at the end, so branch1, branch2, branch3 and branch4 get concatenated together (where each branch is using these conv blocks), and this is the final output of the Inception block.
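To make that concrete, here is a minimal sketch of such a four-branch Inception block. ConvBlock below stands in for the conv + batchnorm + ReLU helper mentioned above, and the channel arguments are placeholders rather than the paper's exact numbers.

```python
import torch
import torch.nn as nn

# ConvBlock is an assumed conv + batchnorm + ReLU helper (a stand-in for the
# conv block described in the reply above), not the video's exact code.
class ConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch, **kwargs):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, bias=False, **kwargs)
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))


# Four branches run on the same input; their outputs are concatenated along the
# channel dimension. The channel arguments are placeholders, not the paper's table.
class InceptionSketch(nn.Module):
    def __init__(self, in_ch, out_1x1, red_3x3, out_3x3, red_5x5, out_5x5, out_pool):
        super().__init__()
        self.branch1 = ConvBlock(in_ch, out_1x1, kernel_size=1)
        self.branch2 = nn.Sequential(
            ConvBlock(in_ch, red_3x3, kernel_size=1),
            ConvBlock(red_3x3, out_3x3, kernel_size=3, padding=1),
        )
        self.branch3 = nn.Sequential(
            ConvBlock(in_ch, red_5x5, kernel_size=1),
            ConvBlock(red_5x5, out_5x5, kernel_size=5, padding=2),
        )
        self.branch4 = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            ConvBlock(in_ch, out_pool, kernel_size=1),
        )

    def forward(self, x):
        # Concatenate along dim=1 (channels) for NCHW tensors.
        return torch.cat(
            [self.branch1(x), self.branch2(x), self.branch3(x), self.branch4(x)],
            dim=1,
        )
```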
Hi,
I'm a beginner.
I understand you implemented the GoogLeNet,
but I don't understand what I have to do to use it on a dataset
Hey, why don't you watch the previous videos of the playlist? He has explained how to do data augmentation, create data loaders and evaluate the network.
Hi
I did watch most videos in the meantime
I made my custom dataset, used the GoogLeNet model from this video and the code from his basic_Cnn video, but replaced the CNN with the GoogLeNet. I made a custom dataset with images as he showed in his other video, but got an accuracy of 60 maximum.
Idk what to do to raise accuracy.
I'm a beginner, and I have a training set with 200 pictures of snakes of 2 species, and a test set of 40 pictures, 20 of one species and 20 of the other.
Learning rate 0.0001
Batch 32
Like 60 epochs
But I got similar results with 5 epochs, just worse.
And idk if the number of pictures for the train set is too small (maybe it will be better when I add more pictures, or maybe worse), or if the GoogLeNet model is not OK to use with that script he had in the basic CNN video. I'm looking for answers now: why is the result only 60? The parameters, too few pictures, or the 2 scripts not really being compatible?
Plus I will work with like 400,000 pictures, so Albumentations will only make it 2 million.
So I am not sure it's a good idea to do that, but if I'm wrong, correct me
@@abhishek-shrm look at the replies pls, I didn't know how to tag you
@@ikilledaguy6720 GoogLeNet has a lot of trainable parameters and 200 images are very few to train the model. You should increase your training data. Also, GoogLeNet is not fully implemented in the video.
Thanks man.. great