A Visual Introduction to Hoeffding's Inequality - Statistical Learning Theory

  • Published 21 Oct 2024

Comments • 24

  • @Kartik_C
    @Kartik_C  1 year ago +6

    *CORRECTION*
    I used to think that the "population error" of a hypothesis (i.e. the expected value of the "training error" of that hypothesis) should be a constant, but it is *NOT!* The population literally changes every 1000 years.
    (the universe is changing and everything is a dynamical system! )
    We should be *extremely* careful before assuming a "true population distribution" of the categories we are trying to study. Because false assumptions will perpetuate stereotypes.
    Please read more here:
    negativefeedbackloops.github.io/

  • @FsimulatorX
    @FsimulatorX 2 years ago +7

    Hey Kartik, I just watched your video on Manifold learning and I think you have a special talent for representing abstract mathematical ideas using visual aid. Your videos get me more excited to learn about these topics in-depth. I look forward to seeing more content from you :)

    • @Kartik_C
      @Kartik_C  2 years ago +3

      I'm very happy to hear that! Thank you :)

  • @Hoe-ssain
    @Hoe-ssain 9 months ago +2

    Great video. Always appreciate content that can easily and visually explain abstract mathematical concepts. Looking forward to seeing more.

    • @Kartik_C
      @Kartik_C  9 months ago

      Thank you!!

  • @pablos1609
    @pablos1609 22 days ago +1

    Great video, also thanks for leaving some resources in the description :)

    • @Kartik_C
      @Kartik_C  21 days ago

      thank you!

  • @magnus.discipulus
    @magnus.discipulus 4 months ago +1

    Hi Kartik! I have a question: when you talk about and visually represent "4 hypotheses" (6:37), are we talking about "4 distinct sets of hyperparameters"?
    The thing that throws me off a little is when you multiply the bound of Hoeffding's inequality by 4. Do you multiply by 4 to compute the "probability that any of these hypotheses leads to an error greater than epsilon"? But why do that? That is, why "put them together"? Don't we generally evaluate hypotheses independently of one another?

    • @Kartik_C
      @Kartik_C  4 months ago

      Hi! I think we put them together to do a worst-case analysis (the union bound).
      When we train a model using gradient descent, for example, we search for the "right hypothesis" in a hypothesis class (our hypothesis class has only 4 hypotheses in this case).
      So when we calculate the worst-case scenario, we need to do it for the hypothesis class, and not for each hypothesis individually. In other words, even though we evaluate each hypothesis independently, we want guarantees about the hypothesis class we are searching in.
      To add some context: the choice of model (CNN, RNN, LSTM, linear regression, etc.) decides which hypothesis class we will search in. In this toy example the hypothesis class has only 4 hypotheses.
      The union bound is a very crude way of doing this, where we just add up the probability of "getting a bad dataset" for each of the 4 hypotheses.
      Hope this helps, I too found this part a bit tricky to understand.
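
      The adding-up described above can be sketched numerically (a minimal illustration; the values of n and epsilon and the helper names `hoeffding_bound` and `union_bound` are made up, not from the video):

```python
import math

def hoeffding_bound(n, eps):
    # Hoeffding: P(|E_train(h) - E_true(h)| > eps) <= 2 * exp(-2 * n * eps^2)
    # for a single fixed hypothesis h, with errors bounded in [0, 1].
    return 2 * math.exp(-2 * n * eps ** 2)

def union_bound(num_hypotheses, n, eps):
    # Worst case over the whole hypothesis class: assume the "bad
    # dataset" events never overlap, so their probabilities just add.
    # A probability can never exceed 1, so cap the result there.
    return min(1.0, num_hypotheses * hoeffding_bound(n, eps))

single = hoeffding_bound(200, 0.1)       # guarantee for one hypothesis
whole_class = union_bound(4, 200, 0.1)   # guarantee for a class of 4
```

      Note that the class-level bound is exactly 4 times the single-hypothesis bound (until it saturates at 1), which is why a larger hypothesis class needs more data for the same guarantee.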

  • @ninetailedkitsune324
    @ninetailedkitsune324 6 months ago

    This was extremely useful, thank you for your amazing video

    • @Kartik_C
      @Kartik_C  6 months ago

      Thank you so much! :)

  • @conceptualprogress
    @conceptualprogress 6 months ago

    Super hyper informative! Thank you!!

    • @Kartik_C
      @Kartik_C  6 months ago

      Thank you very much! :)

  • @71sephiroth
    @71sephiroth 2 years ago +3

    Brilliant video! Mind if I ask how you calculated Etrue(h) = 0.6302 (1:19)? Could you explain how the Union Bound works (8:34)? And did you mean the 'best case scenario' (10:30)?

    • @Kartik_C
      @Kartik_C  2 years ago +2

      Thanks!!
      The Etrue(h) shown is calculated by using 40000 i.i.d.-sampled data points as a proxy for the "true toy distribution". It is not the "real" Etrue(h).
      The union bound is the "worst case scenario": we assume the worst, that the "4 red areas" do not overlap, even though they do (which we see later in the video, as we have the benefit of knowing the "true" distribution).
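
      The Monte Carlo proxy for Etrue(h) mentioned above can be sketched like this (with a toy distribution and hypothesis invented for illustration, not the video's actual setup):

```python
import random

random.seed(0)  # reproducible sketch

# Toy setup (hypothetical): x ~ Uniform(0, 1), the true label is 1
# when x > 0.5, and the hypothesis h predicts 1 when x > 0.3.
def true_label(x):
    return 1 if x > 0.5 else 0

def h(x):
    return 1 if x > 0.3 else 0

# Proxy for E_true(h): average 0-1 loss over 40000 i.i.d. samples.
# Here h is wrong exactly when 0.3 < x <= 0.5, so the true error is 0.2.
n = 40000
mistakes = sum(true_label(x) != h(x) for x in (random.random() for _ in range(n)))
e_true_proxy = mistakes / n  # lands near 0.2, but is itself only an estimate
```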

    • @71sephiroth
      @71sephiroth 2 years ago +1

      @@Kartik_C Thank you!

  • @tsepten7930
    @tsepten7930 1 year ago

    Sound effects are really good

    • @Kartik_C
      @Kartik_C  1 year ago

      thank you! :)

  • @mestresplinter2467
    @mestresplinter2467 5 months ago

    Excellent explanation. Subscribed.

    • @Kartik_C
      @Kartik_C  5 months ago

      Thank you so much!

  • @manuelsebastianriosbeltran972
    @manuelsebastianriosbeltran972 1 year ago +1

    Your videos are great!
    Keep it up

    • @Kartik_C
      @Kartik_C  1 year ago

      Thank you :) 🙏

  • @samarthmotka4578
    @samarthmotka4578 5 months ago

    nice video

    • @Kartik_C
      @Kartik_C  5 months ago

      Thank you very much!