Don't Replace Missing Values In Your Dataset.

  • Published on 20 Jan 2025

Comments • 57

  • @kemalariboga 2 years ago +9

    Your posts (Twitter + YouTube) are more helpful than any other content for gaining intuition about data. Brief and excellent! Thank you, Santiago!

  • @ashwinshetgaonkar6329 2 years ago +11

    this channel will be a gem in times to come

    • @underfitted 2 years ago

      Thank you, Ashwin! Let's see what happens. Working hard on it!

  • @dimasveliz6745 2 years ago +4

    You're so brave, man! For real, well done! Keep it up, we will follow!

    • @underfitted 2 years ago +1

      Thank you so much!

  • @hanweiz84 2 years ago +1

    Followed you on Twitter, signed up for bnomial once it was launched, and now I am in love with your channel :) Thank you for the value you are creating.

    • @underfitted 2 years ago +1

      Thanks so much for the support!

  • @adkh2112 2 years ago +2

    Great content, clear, intuitive and to the point. Refreshing to see this kind of content not 30mins long...

  • @Fransphoenix 1 year ago

    Great video. Always telling my students this and really hoping they stay aware of this in the future!

  • @sandeeptuluri5996 2 years ago +2

    That's a great point I learned today.
    Thank you, man.

  • @sahanakaweraniyagoda9866 2 years ago +1

    Super stuff 🔥🔥. Keep this thing rolling

  • @123arskas 2 years ago +2

    Awesome content. I have a suggestion. In your bnomial series sometimes the readers haven't covered a certain topic so it'd be helpful if after giving them the feedback you could link them to a good resource that explains that concept or may link them to one of these videos. It'd be a great help.

    • @underfitted 2 years ago

      Great suggestion!

    • @123arskas 2 years ago

      @underfitted
      Sorry, I know you already provide references in your feedback. What I meant was references to interactive, easy-to-grasp resources, or to videos like the ones on this channel, where the concepts are easy to understand.
      Thank you

  • @dinosaursarecoolinnit5642 14 days ago

    Thanks, I was so confused by the articles online. I did not understand what they meant by flagging data. I am opting to use a gradient boosted tree model, and I think it has built-in methods to handle missing data, but is that the same as the flagging you described?
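
    On the question above: built-in handling and flagging are related but not identical. Several gradient boosted tree libraries (XGBoost, LightGBM, and scikit-learn's HistGradientBoosting estimators, for example) accept missing values directly and learn, at each split, which branch the missing rows should follow. The sketch below imitates that idea for a single split in plain Python. The data, the threshold, and the helper names are hypothetical, and real libraries use their own gain formulas rather than this simple squared-error comparison.

```python
def sse(values):
    """Sum of squared errors around the mean (0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_default_direction(xs, ys, threshold):
    """Try routing missing x values left, then right, and keep the better one."""
    losses = {}
    for direction in ("left", "right"):
        left, right = [], []
        for x, y in zip(xs, ys):
            if x is None:
                goes_left = direction == "left"   # missing rows follow the default
            else:
                goes_left = x < threshold         # observed rows follow the split
            (left if goes_left else right).append(y)
        losses[direction] = sse(left) + sse(right)
    return min(losses, key=losses.get), losses


# Hypothetical data: rows with a missing x happen to have high targets.
xs = [1.0, 2.0, None, 8.0, 9.0, None]
ys = [10.0, 11.0, 30.0, 29.0, 31.0, 32.0]

direction, losses = best_default_direction(xs, ys, threshold=5.0)
print(direction, losses)
```

    In this toy case the missing rows are routed with the high-target group, so the model does pick up on missingness without an explicit flag column. Whether an extra flag still helps likely depends on the library and the problem, so it is worth testing both.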

  • @samarthsaxena1027 2 years ago +1

    Incredibly insightful. Can counting the number of unanswered questions (e.g. 3, 0, 0, 1, 0, 2...) work too?

    • @underfitted 2 years ago

      It definitely could! It depends on the specific problem and what information could help solve it.
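
      A minimal sketch of the counting idea from this exchange, assuming survey answers stored as Python lists with None for an unanswered question. The data and the count_missing helper are made up for illustration.

```python
# Hypothetical survey responses: one row per respondent, None = unanswered.
responses = [
    [4, None, None, None, 2],
    [5, 3, 4, 1, 2],
    [None, 2, 3, 4, 5],
]


def count_missing(row):
    """Number of unanswered questions in one response."""
    return sum(value is None for value in row)


# One extra feature per respondent: how many questions they skipped.
missing_counts = [count_missing(row) for row in responses]
print(missing_counts)  # [3, 0, 1]
```

      With pandas, the same feature could be computed as df.isna().sum(axis=1).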

  • @itsnaledixo3842 2 months ago

    How would adding another column help with missing values? Please explain further.
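
    One way to picture the answer, as a hedged sketch rather than the exact method from the video: once a gap is filled in, the filled value looks like any other value, so the fact that it was missing is lost unless a separate column records it. The column name and numbers below are hypothetical.

```python
import statistics

# Hypothetical survey column: None marks a question the respondent skipped.
incomes = [52000, None, 61000, None, 48000]

# 1. Record missingness BEFORE filling anything in.
income_missing = [1 if value is None else 0 for value in incomes]

# 2. Fill the gaps (the median is just one possible choice).
median_income = statistics.median(v for v in incomes if v is not None)
income_filled = [median_income if v is None else v for v in incomes]

# Each row now carries both the (possibly imputed) value and the flag.
rows = list(zip(income_filled, income_missing))
print(rows)
```

    The first two rows end up with the same income value, and only the flag tells the model that one was reported and the other was skipped.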

  • @edmundfreeman7203 2 years ago

    I had a survey I was working with that had a bunch of check boxes, and the data was 1 or missing. This example pretty much blows up all standard methods.

  • @greyhat_gaming 2 years ago +2

    Superb insight!

    • @underfitted 2 years ago

      Glad it was helpful!

  • @MountainRaven1960 11 months ago +3

    Missing data is still data.

  • @orochoYT 1 year ago

    I really loved your channel man

  • @rinogrego9262 2 years ago

    Thank you. This is the rarest kind of advice I have ever gotten in the ML field (not that I have been in the field for long; I'm still an undergrad student). I have questions, though. Does that mean that, given a table with N columns, the maximum possible number of columns is 2N? Also, what if we just replace the missing values of categorical columns with a new category? Do you think the idea/intuition still works? I ask because I think that adding columns might increase the cost, especially in a very large table with massive numbers of both rows and columns.

    • @underfitted 2 years ago

      Rare advice is good. It means it makes you think :)
      I'm not sure I follow the idea with the 2N columns.
      The idea of the video is to avoid losing what could be important information: the absence of a value might be as important as the value itself.
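
      A sketch of how the two ideas in this thread could look together, under assumptions that are not from the video: a flag column is added only for numeric columns that actually contain missing values (so the table grows to at most 2N columns, usually fewer), and categorical columns get a sentinel category instead of an extra column. The table, the column names, and the "MISSING" label are all hypothetical.

```python
# Hypothetical table stored column-wise; None marks a missing entry.
table = {
    "age": [34, None, 29],
    "city": ["Lima", None, "Quito"],  # categorical
    "clicks": [3, 7, 1],              # no missing values
}
categorical = {"city"}

augmented = {}
for name, values in table.items():
    has_missing = any(v is None for v in values)
    if name in categorical:
        # A sentinel category keeps the "missing" signal without a new column.
        augmented[name] = ["MISSING" if v is None else v for v in values]
    else:
        # Numeric values are left untouched here; only the flag is added.
        augmented[name] = values
        if has_missing:
            augmented[name + "_missing"] = [int(v is None) for v in values]

print(list(augmented))  # ['age', 'age_missing', 'city', 'clicks']
```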

  • @Param3021 2 years ago +1

    Another nice video!

    • @underfitted 2 years ago +1

      I think I answered this on Twitter. Here is what I said there:
      It depends on the problem. Sometimes, the best you can do is keep the missing values. Sometimes, replacing them is a better approach. Mean/Median/Mode is just one way to approach this problem.

    • @Param3021 2 years ago

      @underfitted Yeah, I was about to edit it.
      Thanks for answering 🙂
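
      For the mean/median/mode option mentioned in this thread, a small sketch on a made-up, skewed column shows how different the fill values can be. The numbers are hypothetical and only meant to illustrate why the choice depends on the problem.

```python
import statistics

# Hypothetical skewed column; None marks a missing value.
values = [1, 2, 2, 3, 100, None]
observed = [v for v in values if v is not None]

# Candidate fill values, one per strategy.
fills = {
    "mean": statistics.mean(observed),
    "median": statistics.median(observed),
    "mode": statistics.mode(observed),
}

# One filled copy of the column per strategy.
filled = {
    name: [fill if v is None else v for v in values]
    for name, fill in fills.items()
}
print(fills)
```

      Here the single outlier pulls the mean far away from the median and mode, so the three strategies would fill the gap with quite different numbers.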

  • @theDrewDag 2 years ago +1

    What's that keyboard? :D Btw man this content rocks. Don't stop.

    • @underfitted 2 years ago +1

      Thanks man! Really appreciate the comment!
      The keyboard is the MX Keys Mechanical Keyboard. They just released it.

    • @theDrewDag 2 years ago

      @underfitted How's your experience with programming on it all day? Really looking into buying it!

  • @Arjun147gtk 2 years ago

    How much time do you spend to understand the data?

  • @ifeoluwaosasona7057 2 years ago +1

    This is soo good.

  • @shivu.sonwane4429 2 years ago +1

    Awesome 😎 as always
    from santiago import information

  • @diegofabianledesmamotta5139 2 years ago

    How can I find you on Twitter?

  • @Aldotronix 5 months ago

    in summary: add a missing indicator

  • @itsm0saan 2 years ago +2

    Cool!!

  • @abrahamowos 2 years ago +2

    And he said his videos aren't cool 🙄🙄

    • @underfitted 2 years ago +2

      I'm going to take this as a nice compliment :) Thanks!

  • @hasanx8317 6 months ago

    Are you Italian?!

    • @underfitted 6 months ago +1

      Nope. Cuban origin

  • @jesuslopez3306 2 years ago +1

    Like for the fake takes! 🤣

    • @underfitted 2 years ago +1

      Ha ha, yeah... I have a ton. I enjoy looking at them, so I will keep adding them to the videos.

  • @premierjoseph9871 2 years ago +1

    I hope this channel never ends and keeps spreading happiness about data science and machine learning concepts 🤍🙏🏻 GO GO SANTIAGO 🌟🌟🌟🌟🌟

    • @underfitted 2 years ago +1

      That's the plan! Thanks!