Dealing with Missing Data in R

  • Published on 22 Aug 2024

Comments • 8

  • @helena22hh, days ago

    Fantastic video! Really really helpful and informative! I recommend! Thanks for your video!

  • @mangalahegde3805, 2 years ago

    Wow, this is wonderful. Thank you for creating and sharing informative videos.

  • @Philantrope, 5 months ago

    Thanks for this thorough demonstration! I wonder what percentage of missing values you think is acceptable for imputation. The number of available complete cases might also be important. E.g. if I have 3,000 complete cases, is it okay to impute 12,000 missing values in the other cases? Information on these considerations is rarely to be found.

  • @haraldurkarlsson1147, months ago

    It would be nice to know where some of the functions you are using come from (without having to visit GitHub). I cannot find locf, nobc or forbak in nomemica. I checked the zoo package. It does not have those, but it has a similar one (na.locf, which covers both LOCF and NOBC).
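    For anyone without the video's helper functions, here is a minimal base R sketch of the two fills this comment mentions. The name `locf_base` and the sample vector are hypothetical and are not the functions used in the video. The backward fill (called NOBC above, often written NOCB for "next observation carried backward") is just the forward fill applied to the reversed vector.

```r
# Hypothetical helper: last observation carried forward (LOCF) in base R.
# Leading NAs stay NA because there is nothing earlier to carry forward.
locf_base <- function(x) {
  idx <- cumsum(!is.na(x))          # how many non-missing values seen so far
  c(NA, x[!is.na(x)])[idx + 1]      # position 1 is the NA used before any value
}

x <- c(NA, 1, NA, NA, 4, NA)        # made-up example vector

forward  <- locf_base(x)            # NA 1 1 1 4 4
backward <- rev(locf_base(rev(x)))  # 1 1 4 4 4 NA

# With the zoo package installed, the equivalent calls would be
# zoo::na.locf(x, na.rm = FALSE) and
# zoo::na.locf(x, na.rm = FALSE, fromLast = TRUE).
```

    Combining the two (forward first, then backward for whatever is still missing) is one plausible reading of what a "forbak" helper does, but that is a guess.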

  • @haraldurkarlsson1147, months ago

    Nice presentation. However, I find it difficult to find a good account of the difference between the classes of missingness (MCAR, MAR, MNAR). After reading the descriptions of these classes by different YouTubers, I am just left at a loss. Perhaps no one can explain these things?

  • @warmtaylor, 11 months ago

    Thank you for your informative video! At 15:03, I was wondering if you could explain why the data need to be normalised before applying KNN imputation. What would the consequences be if the actual values were used for KNN imputation directly? Also, are there quantitative methods that could be used to assess the accuracy of the imputation, rather than visualisation? My data contains more than three thousand rows, so it is hard to assess the accuracy using the three types of plot described in the video.

    • @haraldurkarlsson1147, months ago

      I believe that if you have variables with different ranges (say 0 to 1 and 0 to 100), then you need to scale or normalize them before running kNN, or one variable might dominate the other.
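      The point about one variable dominating can be checked with a small base R sketch. The data frame below is made up for illustration (`ratio` and `score` are hypothetical column names), and it only compares Euclidean distances, which is what kNN relies on. It is not the imputation routine from the video.

```r
# Made-up data: one variable on roughly 0 to 1, the other on roughly 0 to 100
df <- data.frame(
  ratio = c(0.10, 0.95, 0.12, 0.50),
  score = c(50, 52, 58, 90)
)

# Euclidean distances from row 1 to rows 2, 3 and 4 on the raw values
raw_d <- as.matrix(dist(df))[1, 2:4]

# The same distances after centring each column and scaling it to unit variance
scaled_d <- as.matrix(dist(scale(df)))[1, 2:4]

# Nearest neighbour of row 1 (+1 because position 1 corresponds to row 2)
raw_nn    <- unname(which.min(raw_d)) + 1L     # row 2: closest on score only
scaled_nn <- unname(which.min(scaled_d)) + 1L  # row 3: close on both variables
```

      On the raw values, row 2 counts as the nearest neighbour of row 1 even though its `ratio` is at the opposite end of the range, because differences in `score` swamp everything else. After scaling, row 3 becomes the nearest neighbour, so a kNN imputation would borrow values from a different donor.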

  • @abdulbouraa4529, 10 months ago

    How can I check the quality of an imputation done with mice?