Kapil Sachdeva
  • 41
  • 362 130
Eliminate Grid Sensitivity | Bag of Freebies (Yolov4) | Essentials of Object Detection
This tutorial explains a training technique that helps with objects whose centers lie on the boundaries of a grid cell in the feature map.
This technique falls under the "Bag of Freebies" category because it adds almost no FLOPs (additional computation) while achieving higher accuracy at test time.
Prerequisite:
Bounding Box Prediction
th-cam.com/video/-nLJyxhl8bY/w-d-xo.htmlsi=Fv7Bfgxd1I-atZF0
Important links:
Paper - arxiv.org/abs/2004.10934
Threads with a lot of discussion on this subject:
github.com/AlexeyAB/darknet/issues/3293
github.com/ultralytics/yolov5/issues/528
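As a rough sketch of the idea described above (not the code from the video): with the usual decoding, the center offset inside a cell is sigmoid(t), which only approaches 0 or 1 for very large |t|. Scaling the sigmoid by a factor slightly above 1 and re-centering it lets the decoded center reach, and slightly cross, the cell boundary with moderate values of t. The function name `decode_center` and the scale value 1.2 are illustrative assumptions.

```python
import math


def sigmoid(t: float) -> float:
    return 1.0 / (1.0 + math.exp(-t))


def decode_center(t: float, cell_index: int, scale: float = 1.0) -> float:
    """Decode one center coordinate (in grid units) from the raw prediction t.

    scale == 1.0 is the plain sigmoid decoding.
    scale > 1.0 stretches the offset range symmetrically around 0.5,
    so offsets of exactly 0 or 1 become reachable with finite t.
    """
    offset = scale * sigmoid(t) - 0.5 * (scale - 1.0)
    return cell_index + offset


if __name__ == "__main__":
    for t in (-4.0, 0.0, 4.0):
        plain = decode_center(t, cell_index=3, scale=1.0)
        scaled = decode_center(t, cell_index=3, scale=1.2)  # 1.2 is an assumed value
        print(f"t={t:+.1f}  plain={plain:.3f}  scaled={scaled:.3f}")
```

With scale 1.0 the decoded center stays strictly inside the cell; with a scale above 1.0 it can land on or just past the boundary.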
Views: 915

Videos

GIoU vs DIoU vs CIoU | Losses | Essentials of Object Detection
3.4K views, 10 months ago
This tutorial provides an in-depth, visual explanation of the three bounding box loss functions. Beyond the loss functions, you will also learn how to compute per-sample gradients using the new PyTorch API. Resources: Colab notebook: colab.research.google.com/drive/1GAXn6tbd7rKZ1iuUK1pIom_R9rTH1eVU?usp=sharing Repo with results of training using different loss functions: github.com/...
Feature Pyramid Network | Neck | Essentials of Object Detection
11K views, 1 year ago
This tutorial explains the purpose of the neck component in object detection neural networks. In this video, I explain the architecture specified in the Feature Pyramid Network paper. Link to the paper [Feature Pyramid Network for object detection]: arxiv.org/abs/1612.03144 The code snippets and full module implementation can be found in this Colab notebook: colab.research.google.com/dr...
Bounding Box Prediction | Yolo | Essentials of Object Detection
8K views, 1 year ago
This tutorial explains finer details about the bounding box coordinate predictions using visual cues.
Anchor Boxes | Essentials of Object Detection
9K views, 1 year ago
This tutorial highlights challenges in object detection training, especially how to associate a predicted box with the ground truth box. It then shows and explains the need for injecting some domain/human knowledge as a starting point for the predicted box.
Intersection Over Union (IoU) | Essentials of Object Detection
3.4K views, 1 year ago
This tutorial explains how to compute the similarity between two bounding boxes using the Jaccard index, commonly known as Intersection over Union in the field of object detection.
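A minimal sketch of the computation, assuming boxes are given as (x_min, y_min, x_max, y_max) tuples; the function name `iou` and that box layout are choices made for this example, not taken from the video.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max).
    """
    # Corners of the intersection rectangle.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width/height are clamped at 0 when the boxes do not overlap.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # intersection 1, union 7
```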
A Better Detection Head | Essentials of Object Detection
1.9K views, 1 year ago
This is a continuation of the Detection Head tutorial that explains how to write the code so that you can avoid ugly indexing into the tensors and have more maintainable and extensible components. It would be beneficial to first watch the DetectionHead tutorial. Link to the DetectionHead tutorial: th-cam.com/video/U6rpkdVm21E/w-d-xo.html Link to the Google Colab notebook: colab.research.goog...
Detection Head | Essentials of Object Detection
4.5K views, 1 year ago
This tutorial shows you how to make the detection head(s) that take features from the backbone or the neck. Link to the Google Colab notebook: colab.research.google.com/drive/1KwmWRAsZPBK6G4zQ6JPAbfWEFulVTtRI?usp=sharing
Reshape, Permute, Squeeze, Unsqueeze made simple using einops | The Gems
4.2K views, 1 year ago
This tutorial introduces a fantastic library called einops. Einops provides a consistent API for reshape, permute, squeeze, and unsqueeze, and enhances the readability of your tensor operations. einops.rocks/ Google Colab notebook with the examples shown in the tutorial: colab.research.google.com/drive/1aWZpF11z28KlgJZRz8-yE0kfdLCcY2d3?usp=sharing
Image & Bounding Box Augmentation using Albumentations | Essentials of Object Detection
6K views, 1 year ago
This tutorial explains how to do image pre-processing and data augmentation using the Albumentations library. Google Colab notebook: colab.research.google.com/drive/1FoQKHuYuuKNyDLJD35-diXW4435DTbJp?usp=sharing
Bounding Box Formats | Essentials of Object Detection
5K views, 1 year ago
This tutorial goes over various bounding box formats used in object detection. Link to the Google Colab notebook: colab.research.google.com/drive/1GQTmjBuixxo_67WbvwNp2PdCEEsheE9s?usp=sharing
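To make the idea of formats concrete, here is a small sketch converting between three commonly used layouts: corner format (x_min, y_min, x_max, y_max), top-left plus size (x_min, y_min, width, height), and center plus size (cx, cy, width, height). The function names are hypothetical, and the video may cover these or other formats (for example, normalized coordinates, which are not handled here).

```python
def xyxy_to_xywh(box):
    """(x_min, y_min, x_max, y_max) -> (x_min, y_min, width, height)."""
    x_min, y_min, x_max, y_max = box
    return (x_min, y_min, x_max - x_min, y_max - y_min)


def xyxy_to_cxcywh(box):
    """(x_min, y_min, x_max, y_max) -> (center_x, center_y, width, height)."""
    x_min, y_min, x_max, y_max = box
    width, height = x_max - x_min, y_max - y_min
    return (x_min + width / 2, y_min + height / 2, width, height)


def cxcywh_to_xyxy(box):
    """(center_x, center_y, width, height) -> (x_min, y_min, x_max, y_max)."""
    cx, cy, width, height = box
    return (cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2)


if __name__ == "__main__":
    corners = (10, 20, 50, 80)
    print(xyxy_to_xywh(corners))
    print(xyxy_to_cxcywh(corners))
    print(cxcywh_to_xyxy(xyxy_to_cxcywh(corners)))
```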
Object Detection introduction and an overview | Essentials of Object Detection
7K views, 1 year ago
This is an introductory video on object detection, a computer vision task that localizes and identifies objects in images. Notes - * I have intentionally not talked about 2-stage detectors. * There will be follow-up tutorials dedicated to individual concepts.
Softmax (with Temperature) | Essentials of ML
3.3K views, 2 years ago
A visual explanation of the why, what, and how of the softmax function. As a bonus, the notion of temperature is explained as well.
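A minimal sketch of softmax with temperature, assuming the common convention of dividing the logits by the temperature before exponentiating; the function name and the example logits are made up for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, with an optional temperature.

    temperature > 1 flattens the distribution;
    temperature < 1 sharpens it toward the largest logit.
    """
    scaled = [z / temperature for z in logits]
    # Subtract the max for numerical stability; the result is unchanged.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]
    for temp in (0.5, 1.0, 5.0):
        probs = softmax(logits, temperature=temp)
        print(temp, [round(p, 3) for p in probs])
```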
Grouped Convolution - Visually Explained + PyTorch/numpy code | Essentials of ML
4.3K views, 2 years ago
In this tutorial, the need for and mechanics of Grouped Convolution are explained with visual cues. The understanding is then validated by looking at the weights generated by the PyTorch Conv layer and by performing the operations manually using NumPy. Google Colab notebook: colab.research.google.com/drive/1AUrTK622287NaKHij0YqOCvcdi6gVxhc?usp=sharing Playlist: th-cam.com/video/6SizUUfY3Qo/w-d-xo....
Convolution, Kernels and Filters - Visually Explained + PyTorch/numpy code | Essentials of ML
1.9K views, 2 years ago
This tutorial explains (and demonstrates with code) the components & operations in a convolutional layer in neural networks. The difference between Kernel and Filter is clarified as well. The tutorial also points out that not all kernels convolve/correlate with all input channels. This seems to be a common misunderstanding for many people. Hopefully, this visual and code example can help show th...
Matching patterns using Cross-Correlation | Essentials of ML
1K views, 2 years ago
Let's make the Correlation Machine | Essentials of ML
1.6K views, 2 years ago
Reparameterization Trick - WHY & BUILDING BLOCKS EXPLAINED!
10K views, 2 years ago
Variational Autoencoder - VISUALLY EXPLAINED!
11K views, 2 years ago
Probabilistic Programming - FOUNDATIONS & COMPREHENSIVE REVIEW!
4.6K views, 2 years ago
Metropolis-Hastings - VISUALLY EXPLAINED!
30K views, 2 years ago
Markov Chains - VISUALLY EXPLAINED + History!
11K views, 2 years ago
Monte Carlo Methods - VISUALLY EXPLAINED!
4K views, 2 years ago
Conjugate Prior - Use & Limitations CLEARLY EXPLAINED!
2.9K views, 2 years ago
How to Read & Make Graphical Models?
2.8K views, 2 years ago
Posterior Predictive Distribution - Proper Bayesian Treatment!
5K views, 2 years ago
Sum Rule, Product Rule, Joint & Marginal Probability - CLEARLY EXPLAINED with EXAMPLES!
5K views, 2 years ago
Noise-Contrastive Estimation - CLEARLY EXPLAINED!
10K views, 3 years ago
Bayesian Curve Fitting - Your First Baby Steps!
6K views, 3 years ago
Maximum Likelihood Estimation - THINK PROBABILITY FIRST!
6K views, 3 years ago

Comments

  • @sumanpaudel1997
    @sumanpaudel1997 1 day ago

    Hi, I have the Microsoft Form Recognizer API, which gives a bounding box of 8 coordinates for a given class. How do I draw a bounding box using that? E.g.: bounding_regions=[BoundingRegion(page_number=1, polygon=[Point(x=33.0, y=496.0), Point(x=169.0, y=496.0), Point(x=168.0, y=532.0), Point(x=33.0, y=532.0)])] They haven't covered this in the documentation either; if you could help, I would appreciate it. I have converted it into a list like this: [33.0, 496.0, 169.0, 496.0, 168.0, 532.0, 33.0, 532.0], but I don't know how to plot it.

  • @manueljohnson1354
    @manueljohnson1354 1 day ago

    Excellent

  • @hugobertrand7348
    @hugobertrand7348 1 day ago

    Thank you for these very clear and visually efficient explanations. I'll make sure to use these concepts in my PhD work!

  • @zhoudan4387
    @zhoudan4387 5 days ago

    I thought temperature was like getting a fever and saying random things :)

    • @KapilSachdeva
      @KapilSachdeva 5 days ago

      Depends on the context. Here it is about logits. In LLM APIs it is used to control the stochasticity/randomness.

  • @tuna5287
    @tuna5287 5 days ago

    The best! Thank you sir

  • @AdityaPrakash-nk9gc
    @AdityaPrakash-nk9gc 8 days ago

    At 5:01, could you please explain why it is [1,5] and not [5,1]? Shouldn't the coordinates be in (x,y) format?

    • @KapilSachdeva
      @KapilSachdeva 8 days ago

      No, the coordinates are in [y,x] … nothing specific about it as such, just a convention used in all object detection models.

  • @sashalyuklyan5195
    @sashalyuklyan5195 9 days ago

    Thank you a lot for your videos! Selection of subjects in your series is excellent, every tutorial offers very interesting information.

  • @Daydream_Dynamo
    @Daydream_Dynamo 9 days ago

    One stupid question here: why were we interested in finding the max joint probability in the first place? Was there any other way to find w and beta?

  • @Daydream_Dynamo
    @Daydream_Dynamo 10 days ago

    It learns the parameter right?

  • @somasundaramsankaranarayan4592
    @somasundaramsankaranarayan4592 20 days ago

    At 6:39, the distribution p_\theta(x|z) cannot have mean mu and stddev sigma as the mean and std dev live in the latent space (the space of z) and x lives in the input space.

  • @yli6050
    @yli6050 22 days ago

    Watching your videos keeps reminding me of the phrase “a picture is worth a thousand words”, to which I want to add “a great picture is worth thousands in gold”. Many times I had to freeze the video to let a particular moment sink in, because I couldn’t believe the insight that picture brings out. ❤❤❤

  • @yli6050
    @yli6050 22 days ago

    I am grateful for your lectures ❤ What a wonderful service you’ve done for all the learners.

  • @dhairy-kumar-learn
    @dhairy-kumar-learn 22 days ago

    So in-depth, and with those visualizations it is a great learning experience.

  • @user-lr6xs8dn8k
    @user-lr6xs8dn8k 22 days ago

    In real life we don't know the target distribution f(x). How did you calculate alpha, f(Xt+1)/f(Xt), for the various sample points?

  • @kask198
    @kask198 25 days ago

    We are okay with imperfections as long as they are useful to us ... great wisdom🙏

  • @desmondteo855
    @desmondteo855 1 month ago

    Amazing. Thanks for posting.

  • @kadrimufti4295
    @kadrimufti4295 1 month ago

    At the 4:45 mark, how did you expand the third term Expectation into its integral form in that way? How is it an "expectation with respect to z" when there is no z but only x?

  • @SanjaliRoy
    @SanjaliRoy 1 month ago

    My feedback is that this is amazing!!! I wish my ML prof taught this :( This truly is one of the few videos that breaks down the sum rule, product rule, and joint and marginal probability so well.

  • @HellDevRisen
    @HellDevRisen 1 month ago

    Great video; thank you :)

  • @jiahao2709
    @jiahao2709 1 month ago

    How do you plot this? Which software are you using?

  • @user-mu2ml3zs4v
    @user-mu2ml3zs4v 1 month ago

    THANK YOU SO MUCH

  • @technicallittlemaster8793
    @technicallittlemaster8793 1 month ago

    I am worried that he hasn't uploaded videos in 7 months.... is everything alright?

    • @KapilSachdeva
      @KapilSachdeva 1 month ago

      He is really sorry about it and feels miserable that he is not being of service to others; … some diagnosis reveals that he is suffering from overthinking and laziness. 😢

  • @medihazukic5382
    @medihazukic5382 1 month ago

    Amazing lectures, thank you so much! I was wondering at 12:07 the partial derivatives of E wrt w_i (in the right panel), shouldn't those be the partial derivatives of ys and not Es?

    • @KapilSachdeva
      @KapilSachdeva 1 month ago

      No, it is E, the error function. Our goal is to minimize the error.

  • @kask198
    @kask198 1 month ago

    Thank you, very good overview. You must have a thorough understanding of many object detection models to deliver this kind of overview. I have one question (only for discussion): How is it "clear" (1:22) that object detection is a difficult task for machines? I think it is important to mention why the problem is difficult (challenges) to solve from a computer vision point of view. You did mention a couple of challenges at 10:40, but these are w.r.t. the DL approach.

    • @KapilSachdeva
      @KapilSachdeva 1 month ago

      It is difficult if you compare it to a classification problem, where an image belongs to either class 1 or class x. I called it difficult for 3 reasons: you have to do both localization and classification, and the number of objects is variable.

  • @Kn1ghtCh4ser
    @Kn1ghtCh4ser 1 month ago

    You just saved a college student living in South Korea! Thanks for the amazing visualizations and explanations!

  • @user-px9zz3fo9o
    @user-px9zz3fo9o 1 month ago

    best MCMC video I have seen

  • @arpansrivastava6405
    @arpansrivastava6405 1 month ago

    1:31 How do we not know how to sample when the distribution function is already given and you also plotted it?

    • @KapilSachdeva
      @KapilSachdeva 1 month ago

      Somehow this is a common confusion for many. Knowing a distribution function only helps you find the probability of a sample. Sampling from a distribution is a different task. Sampling means asking your computer to generate a sample, and it makes use of a random number generator. That random number generator (algorithm) has to behave in such a way that the samples (random numbers) are generated in accordance with their probability distribution: samples with high probability should appear more often and the others less often. This is why various sampling algorithms/techniques were created. By default, your computer can only give you uniform random numbers. Most of the algorithms directly or indirectly manipulate the output of a uniform random number generator.
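One concrete way to "manipulate the output of a uniform random number generator" is inverse transform sampling. The sketch below is an illustration added here, not the method shown in the video: it draws exponential samples by pushing uniform numbers through the inverse of the exponential CDF. The function name `sample_exponential` and the rate value are assumptions for the example.

```python
import math
import random


def sample_exponential(rate, rng):
    """Draw one sample from Exponential(rate) using only a uniform number.

    The CDF is F(x) = 1 - exp(-rate * x), so its inverse is
    F^-1(u) = -ln(1 - u) / rate for u in [0, 1).
    """
    u = rng.random()  # uniform in [0, 1)
    return -math.log(1.0 - u) / rate


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    rate = 2.0
    samples = [sample_exponential(rate, rng) for _ in range(100_000)]
    # The sample mean should be close to the theoretical mean 1 / rate.
    print(sum(samples) / len(samples), 1.0 / rate)
```

This only works when the inverse CDF is available in closed form; when it is not, methods such as rejection sampling or Metropolis-Hastings are used instead.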

  • @ajazhussain2919
    @ajazhussain2919 1 month ago

    thanks

  • @sanjaykrish8719
    @sanjaykrish8719 1 month ago

    This is pure gold ❤

  • @RalphDratman
    @RalphDratman 2 months ago

    5:35 Bayes Rule ... intractable computation.

  • @Daily_language
    @Daily_language 2 months ago

    Clear and easy to understand compared to other videos that throw lots of math formulas at you at the beginning. Great work! Subscribed to your channel.

  • @ilke3395
    @ilke3395 2 months ago

    Thank you so much sir for the amazing explanation and visualization.

  • @yogeshwarshendye4857
    @yogeshwarshendye4857 2 months ago

    If done with UNet, it won't require upsampling as we concatenate the layers, right?

  • @rahulbirari401
    @rahulbirari401 2 months ago

    Just mind-blowing. I loved the way you explained the concepts and slowly built on them; that's how inventions and the human mind work, and mathematics is just a tool to realize/record complex ideas. Other lectures jump directly into the math without explaining the idea.

  • @abdulwasaye8511
    @abdulwasaye8511 2 months ago

    You still didn’t explain the need for the CPDF. Also, you said that we want to get rid of evaluating the integral, and therefore we want something…? What, the CPDF? But the CPDF is the integral of the PDF.

  • @srinathkumar1452
    @srinathkumar1452 2 months ago

    Spectacular 👏

  • @josemariapereziquierdo5939
    @josemariapereziquierdo5939 2 months ago

    Thanks a lot for this clarifying video.

  • @andrzejreinke
    @andrzejreinke 2 months ago

    this explanation was amazing! thanks! sub!

  • @vincentpelletier1246
    @vincentpelletier1246 2 months ago

    I don't know if I got this wrong but if I take a 1x64x26x26 feature through a convolution that has a K=3 and S=1, I will definitely not end up with a 1x64x26x26, but with a 1x64x24x24. To achieve the desired shape would require a P=1. If I'm not correct, would someone please explain how the dimensions would work in this case?
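The arithmetic in this question can be checked with the standard output-size formula for a convolution without dilation, out = floor((n + 2P - K) / S) + 1. The helper name `conv_output_size` is made up for this sketch.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution (no dilation).

    out = floor((n + 2 * padding - kernel) / stride) + 1
    """
    return (n + 2 * padding - kernel) // stride + 1


if __name__ == "__main__":
    # 26x26 input, K=3, S=1: without padding the map shrinks to 24x24.
    print(conv_output_size(26, kernel=3, stride=1, padding=0))
    # With P=1 the 26x26 size is preserved.
    print(conv_output_size(26, kernel=3, stride=1, padding=1))
```

So the shapes quoted in the comment are consistent: K=3, S=1 preserves a 26x26 map only when P=1.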

  • @vero811
    @vero811 2 months ago

    I didn't get a headache!

  • @haihoangthanh8949
    @haihoangthanh8949 2 months ago

    Thank you for your detailed and understandable explanation. The video animations are great and impressive.

  • @user-tu5kn6ck2y
    @user-tu5kn6ck2y 2 months ago

    Kapil, I will be forever grateful to you for this lecture series.

  • @anilsarode6164
    @anilsarode6164 2 months ago

    Great video!! Thanks.

  • @Ashish-sp4hw
    @Ashish-sp4hw 2 months ago

    Great content. Would you be able to give a quick walkthrough of any one of the codebases, like yolov7, 8, 9, etc.?

  • @kalinduSekara
    @kalinduSekara 2 months ago

    Great explanation

  • @dinoyjohny5211
    @dinoyjohny5211 2 months ago

    Thank you so much

  • @Ashish-sp4hw
    @Ashish-sp4hw 2 months ago

    Well explained.

  • @anshumansinha5874
    @anshumansinha5874 2 months ago

    Thank you for the content. I am a bit confused about the motivation, is it sampling? like being able to sample from a distribution? or estimating probability distribution? Because, for sampling, we would first need a distribution. On the other hand for density estimation, we would need to sample (like in VAE). So I am confused about what comes first? Or are these the same problems described in different ways? Thanks.

  • @zlatasmolyaninova1140
    @zlatasmolyaninova1140 2 months ago

    This is the clearest explanation among all I have read/watched! Thank you very much.

  • @dave11a82
    @dave11a82 2 months ago

    Really well explained. Thanks for this tutorial!