Active Learning. The Secret of Training Models Without Labels.

  • Published on Oct 20, 2024

Comments • 54

  • @thecouchman2112
    @thecouchman2112 2 years ago +12

    Really helpful video, thanks. One small thing though, the sound effects on the title screens were a bit loud imo :)

    • @underfitted
      @underfitted 2 years ago +2

      Noted! Thanks for the feedback!

    • @underfitted
      @underfitted 2 years ago

      GOOD ONE!

    • @emeebritto
      @emeebritto 4 months ago

      yaa... >.

  • @miguelduqueb7065
    @miguelduqueb7065 2 years ago +3

    Nice video!
    You can also use a similar approach to compare models and stay with the one that performs best. Here is how:
    A few years ago I was collecting data in the chemistry lab in order to fit some models. Each experiment took 1 day to complete, so I started with a simple factorial design, fitted all models to the initial data set, and then predicted the point of maximum divergence between all models. That point was used as the next experiment, and the models were refitted thereafter. This procedure was repeated several times.
    Computing uncertainty in your predictions is similar, but only with one model.
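
The selection step described in this comment (pick the next experiment where the candidate models disagree most) can be sketched roughly as follows. This is only an illustration: the three model functions are hypothetical stand-ins for already-fitted models, and measuring "divergence" as the variance of the predictions is an assumption, since the comment does not say which measure was used.

```python
from statistics import pvariance


def most_divergent_point(models, candidates):
    """Return the candidate input where the models' predictions spread the most."""
    def spread(x):
        predictions = [model(x) for model in models]
        # Assumption: divergence is measured as the variance across models.
        return pvariance(predictions)

    return max(candidates, key=spread)


# Hypothetical, already-fitted models (placeholders, not from the comment).
def linear_model(x):
    return 2.0 * x + 1.0


def quadratic_model(x):
    return 0.5 * x * x + 1.0


def saturating_model(x):
    return 1.0 + 4.0 * x / (1.0 + x)


# Candidate experimental settings: 0.0, 0.5, ..., 10.0
candidates = [i * 0.5 for i in range(21)]

next_experiment = most_divergent_point(
    [linear_model, quadratic_model, saturating_model], candidates
)
print(next_experiment)
```

After running the chosen experiment, each model would be refitted on the enlarged data set and the selection repeated.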

  • @fikriansyahadzaka6647
    @fikriansyahadzaka6647 1 year ago +2

    Nice video! Could you also explain semi-supervised learning? There are not many videos that clearly explain the progress so far in semi-supervised learning, even though the topic has become more popular nowadays.

  • @sahanakaweraniyagoda9866
    @sahanakaweraniyagoda9866 2 years ago +3

    This is lit 🔥. Love this practical approach to Machine learning. Keep doing the amazing work 👏👏

    • @underfitted
      @underfitted 2 years ago +1

      Thanks! Much more coming!

  • @hasanx8317
    @hasanx8317 3 months ago

    Duplicated records in the data have a significant meaning. A record that appeared repeatedly in the past will probably appear repeatedly in the future; it is a VIP record, and handling it well means you have succeeded at a high percentage of what you are supposed to do. So having duplicate data should eventually make the model very accurate at predicting the related label, more accurate than for unique records.

  • @knutjagersberg381
    @knutjagersberg381 2 years ago +5

    Love it, world class content! Also agree. A thought: Why not start with few shot or zero shot learning before active learning?

    • @underfitted
      @underfitted 2 years ago +2

      If you have a model capable of zero-shot, absolutely!

  • @Param3021
    @Param3021 2 years ago +2

    Another nice video!
    Learned a new concept - *Active Learning*

    • @underfitted
      @underfitted 2 years ago +1

      Glad to hear that!

  • @tecbrain
    @tecbrain 3 months ago

    Fantastic video. I'm actually going to work through the code now to understand it. Thanks for the work you do to help us.

  • @JoaquinRevello
    @JoaquinRevello 1 year ago

    Excellent Video. This channel is going to be huge soon

  • @maheshBasavaraju
    @maheshBasavaraju 1 year ago

    Loved the idea of smart labelling. Very cool.

  • @lorenzoleongutierrez7927
    @lorenzoleongutierrez7927 1 year ago

    Great explanation, thanks! Do you have any examples of labeling services that provide this approach? Greetings!

  • @erdi749
    @erdi749 1 year ago +1

    I love your videos, nice and extremely informative! Just a quick comment: is it possible not to have those "bommmm!" sounds? (: They make it impossible to listen to your videos in a car or with headphones. Thank you!

    • @underfitted
      @underfitted 1 year ago

      Thanks, Erdi! Yes, if you watch my last few videos, I’ve improved the audio, including removing that particular sound 😏

  • @jayantghadge4027
    @jayantghadge4027 1 year ago

    This method to me seems a little bit like boosting. I might be wrong though, but boosting is what came to my mind after watching the video.

  • @jubakala
    @jubakala 1 year ago

    Thanks! This was exactly what I needed at the moment! (:

  • @fobaogunkeye3551
    @fobaogunkeye3551 2 years ago

    Lovely video, Santiago! Quick question: how do we label the low-confidence data that the model initially had a hard time predicting, since we also didn't know what the label was in the first place? How do we know which label/class to use for that low-confidence data when we retrain?

    • @underfitted
      @underfitted 2 years ago

      We will start by labeling some of the data manually. The goal is to seed the process to start generating automatic labels.

  • @roshanaryal7786
    @roshanaryal7786 2 years ago +1

    Hi, Santiago! Love your content!
    Could you please make a video on how to start machine learning as a beginner with some programming experience? I've been doing web dev but want to transition into ML. I would appreciate your response 😊

  • @123arskas
    @123arskas 2 years ago

    I have some questions. There's no proper practical application of it, is there? The paper talks about the proposed methods along with practical issues.
    Since your videos are straight to the point and you try to keep things simple, I just want to know if you've found a practical implementation of it in Python etc. If so, please link to it in the description. Thank you

    • @underfitted
      @underfitted 2 years ago +2

      Yeah, I've personally used Active Learning multiple times. It's a very practical way to decide how to label a dataset.

  • @mahendrakumargohil6384
    @mahendrakumargohil6384 1 year ago

    Excellent Information 👍👍

  • @vidyachandran944
    @vidyachandran944 1 year ago

    Great content! Thank you :)

  • @jainamshroff4998
    @jainamshroff4998 1 year ago

    A Very good video!

  • @brunoras
    @brunoras 2 years ago

    Super insightful, I'm using these ideas right now!

    • @123arskas
      @123arskas 2 years ago +1

      If you've made it public (for smaller scale projects) please give the link to its repo. Thank you

    • @underfitted
      @underfitted 2 years ago +1

      Wonderful!

  • @kutkut310
    @kutkut310 2 months ago

    Great Santiago, real data has never been so easy! LoL

  • @CarlosBCU
    @CarlosBCU 2 years ago

    Hi, maybe a silly question, but how do you calculate the confidence after step 2?

    • @underfitted
      @underfitted 2 years ago +1

      Assuming you are using a classification model, for example, that will be the confidence (probability) returned by the model. More specifically, the softmax value corresponding to the highest predicted class.
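
A minimal sketch of the confidence described in this reply: the softmax value of the top predicted class, with the least confident samples sent for labeling. The logits and the helper names (`softmax`, `confidence`, `least_confident`) are made up for illustration and are not taken from the video.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence(probabilities):
    """Confidence = probability of the highest-scoring class."""
    return max(probabilities)


def least_confident(prob_rows, k):
    """Indices of the k samples whose top-class probability is lowest."""
    order = sorted(range(len(prob_rows)), key=lambda i: confidence(prob_rows[i]))
    return order[:k]


# Hypothetical model outputs (logits) for three unlabeled samples.
logits = [
    [4.0, 0.5, 0.1],  # clearly class 0
    [1.0, 0.9, 0.8],  # nearly uniform: the model is unsure
    [0.2, 3.0, 0.1],  # clearly class 1
]
probs = [softmax(row) for row in logits]
to_label = least_confident(probs, k=1)
print(to_label)
```

Here the second sample has an almost flat distribution, so it is the one that would be handed to a human annotator first.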

    • @CarlosBCU
      @CarlosBCU 2 years ago

      @@underfitted many thanks for your answer! What if we are running a regression?

    • @modakad
      @modakad 3 months ago

      @@underfitted Answering CarlosBCU's question on confidence: I don't think your answer sufficiently clarifies the approach. Let's take an example. Suppose we have two classes, class 0 and class 1. For observation 1 the softmax vector is [0.92, 0.08], and for observation 2 it is [0.60, 0.40] (remember, softmax gives a vector of values that add up to 1). Which observation should we pick? Not obs1. Obs2 is where the model has low confidence: its two class probabilities are separated by only 0.2 (abs(0.6 - 0.4)), while in obs1 the separation is higher (0.84).
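
The comparison in this comment can be written as a small margin calculation: the gap between the two largest class probabilities, with the smallest gap picked for labeling. The probabilities are the ones from the comment; the function and variable names are illustrative only.

```python
def margin(probabilities):
    """Gap between the two highest class probabilities."""
    top_two = sorted(probabilities, reverse=True)[:2]
    return top_two[0] - top_two[1]


# Softmax vectors from the example in the comment.
observations = {
    "obs1": [0.92, 0.08],
    "obs2": [0.60, 0.40],
}

margins = {name: margin(p) for name, p in observations.items()}

# The observation with the smallest margin is where the model is least sure.
pick = min(margins, key=margins.get)
print(pick, margins)
```

With only two classes, ranking by smallest margin and ranking by lowest top-class probability select the same observation; the two can differ once there are three or more classes.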

    • @modakad
      @modakad 3 months ago

      @@CarlosBCU I think the answer would be - choose the observations with higher error (RMSE, MSE etc.)

    • @modakad
      @modakad 3 months ago

      If you are using sigmoid loss function, then it would be trickier.

  • @Param3021
    @Param3021 2 years ago +1

    1:03 - We need to Build a Model to Label the data we need, to Build a Model 🤯

  • @kemalariboga
    @kemalariboga 2 years ago +1

    Great content!

  • @dimasveliz6745
    @dimasveliz6745 2 years ago

    dynamic! Liked it more!

  • @juan.o.p.
    @juan.o.p. 2 years ago

    Very interesting

  • @sodipepaul9370
    @sodipepaul9370 2 years ago +1

    Wow.

  • @mateeurrehman-l6i
    @mateeurrehman-l6i 2 months ago

    Love the content. Could you please make a video about the role of entropy in this process? I just jumped over from another video that covered the concept.
    The video tutorial that I just watched: th-cam.com/video/6O5j7OhfQWE/w-d-xo.html
    I am basically curious to learn how entropy plays its part and how it can be improved by addressing multiple factors.
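
For context on the entropy question, one common way entropy is used in uncertainty sampling is to score each prediction by the Shannon entropy of its class probabilities and label the highest-scoring samples first. The sketch below assumes that usage; it is not taken from either video, and the probability vectors are made-up examples.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical predicted class probabilities for three unlabeled samples.
predictions = [
    [0.92, 0.08],
    [0.60, 0.40],
    [0.50, 0.50],
]

scores = [entropy(p) for p in predictions]

# The sample with the highest entropy is the one the model is least sure about.
most_uncertain = max(range(len(predictions)), key=lambda i: scores[i])
print(most_uncertain, scores)
```

Unlike the top-class probability alone, entropy takes the whole distribution into account, which matters more when there are many classes.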