Data Pre-processing in R: Handling Missing Data

  • Published on 11 Sep 2024

Comments • 66

  • @DataProfessor 4 years ago +5

    🤔QUESTION OF THE DAY: How do you handle missing data? Comments down below! 😃
    💗Help support this YouTube channel by hitting the Subscribe button, Like button and Comment down below! 👇

  • @HamedMorady-bu3ey 4 months ago

    Your detailed and relaxing method of teaching helps me a lot, Thanks a bunch

  • @szymonk.7237 4 years ago +2

    Thank you very much for your channel Professor Chanin. And this series will be really useful. Greetings from Poland !

    • @DataProfessor 4 years ago

      Hi Szymon, It’s a pleasure! Thanks for your support and comment!

  • @douglaspiresmartins2955 4 years ago +1

    Your vids are completely amazing. Please keep posting it, it helps me a lot. Greetings from Brazil!

    • @DataProfessor 4 years ago

      Glad you like them! Appreciate your kind words 😃

  • @ruhinehri5607 3 years ago

    Very Nice... It was very good with basic knowledge covered so simply that all concepts cleared in one go.. Thanks a lot

  • @kanimozhipanneerselvam3017 4 years ago +1

    Crystal Clear Explanation!!!

    • @DataProfessor 4 years ago

      Thanks for the kind words 😃

  • @deeppatel3454 4 years ago +7

    Sir, can you make another video on missing numerical data, covering some advanced techniques for cases where the mean and median do not work?
    You are doing the right thing, keep it up and make more videos on R.

    • @DataProfessor 4 years ago +1

      Hi Deep patel, you got it. The next episodes of the ‘Data pre-processing series’ will cover more advanced topics in handling missing data. Please subscribe and hit the notification bell for updates on the latest videos.

    • @Mukesh_Sablani 4 years ago

      very nice session sir its really helpful for me completing my project

    • @JYvman 3 years ago

      @@DataProfessor You lied.

    • @BORoundxbox 3 years ago

      @@JYvman 😂😂

  • @marcofestu 4 years ago +1

    Thanks for your video Chanin, I really appreciated it!! Eager to watch more of this^^

    • @DataProfessor 4 years ago +1

      Marco Festugato It’s a pleasure, more videos on data pre-processing are in the making. Please stay tuned for more. P.S. I also referenced your name in the video description.

    • @marcofestu 4 years ago +1

      @@DataProfessor thank you very much, I appreciated it a lot! I've started studying R, there should be more youtube channels like yours, I really like it. Don't stop please! And consider writing me if you come to visit Italy :)

    • @DataProfessor 4 years ago +1

      Marco Festugato Thanks for your kind words of inspiration and hope your job interview is successful!

  • @thanghoang1944 3 years ago +1

    THIS IS SO HELPFUL!

  • @lokesh542 4 years ago +1

    Loved your way of explanation of things step by step it really helped :)
    Thanks

    • @DataProfessor 4 years ago +1

      Thank you, Glad to hear that!

  • @wwpharmacist 1 year ago

    Thank you for your time

  • @bazi4517 4 years ago +2

    Can you do a video regarding some of the important algorithms to master, like regression, classification etc? Trying to find information online can sometimes be overwhelming because they throw a million things at you at once. Thanks in advance.

    • @DataProfessor 4 years ago +1

      Hi Bazi, thanks for your suggestion. I'll put this into the to-do list. In the meantime, please subscribe and hit the notification bell for updates of new videos.

    • @DataProfessor 4 years ago +3

      Hi Bazi, following up on your suggestion, I've just released a video Data Science 101: Overview of Machine Learning Model Building Process at th-cam.com/video/BOk1hlCPW0c/w-d-xo.html
      I also referenced your name in the video description, check it out!

  • @desmondojei3868 4 years ago +2

    Once again, amazing video! However, at the end of the video you suggested you would upload other videos on how to deal with missing values for non-numerical data types such as factor, categorical and ordinal. Are you still going to upload these videos? Thanks

    • @DataProfessor 4 years ago +2

      Thanks Desmond for the comment and kind words. Yes, definitely, and thanks for the reminder; I will put this on the priority to-do list. If there is any other topic you would like to see, feel free to suggest it.

  • @cma9744 4 years ago +1

    Thanks for the video. Would it be possible to cover how to impute left-censored data (with multiple limits of detection)?

  • @dakshaudawatta8219 4 years ago +1

    It was really helpful sir. Thanks a lot ♥️

    • @DataProfessor 4 years ago

      Thanks Daksha for the kind words! 😃

  • @gamalucianogen 3 years ago

    How can I replace missing data with character data?

  • @Katerina111111000 3 years ago

    great video thank u!

  • @jeffersonjones7863 3 years ago +2

    Thanks for the great video! This problem just occurred to me ;)
    How do you deal with missing data for classification problems though?

    • @DataProfessor 3 years ago +2

      Great question! The mice R package can impute both quantitative and qualitative missing values. The original paper describing the package is available at www.jstatsoft.org/article/view/v045i03

    • @jeffersonjones7863 3 years ago +1

      @@DataProfessor Amazing, I'll have a look into that. thanks for the fast reply :)

    • @DataProfessor 3 years ago

      @@jeffersonjones7863 😊
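
      As a rough illustration of the mice suggestion in this thread, the sketch below imputes a dataset that mixes numeric columns and factors. It assumes the mice package is installed and uses the nhanes2 example data that ships with it; the choice of m = 5 and the seed are arbitrary, and the imputation methods are left at the package defaults rather than tuned.

```r
# Minimal sketch, assuming the mice package is installed.
# nhanes2 ships with mice and mixes numeric columns and factors.
if (requireNamespace("mice", quietly = TRUE)) {
  library(mice)

  # Missing values per column before imputation
  print(colSums(is.na(nhanes2)))

  # Create 5 imputed datasets; mice picks a default method per column type
  imp <- mice(nhanes2, m = 5, printFlag = FALSE, seed = 123)

  # Extract the first completed dataset
  completed <- mice::complete(imp, 1)

  # Missing values per column after imputation
  print(colSums(is.na(completed)))
}
```

      In practice an analysis would be run on all imputed datasets and the results pooled, rather than relying on a single completed dataset as shown here.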

  • @jiaweihu7928 4 years ago

    Hello Professor. I have a question related to missing data. I have a dataset, cars_missing, with two missing values. When I run sum(is.na(cars_missing)), it only shows 1. Also, when I run view(cars_missing), the column "cubicinches" shows "NA", but the column "brand" only shows a blank without NA. I think that's why sum(is.na) only shows 1. Can you explain why?
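
    The behaviour described in this question can be reproduced with a small sketch. The data below is a hypothetical stand-in for cars_missing (the real file is not shown in the thread). When reading a CSV, R treats a blank field in a text column as an empty string rather than NA, so is.na() does not count it unless "" is listed in na.strings or the blanks are converted afterwards.

```r
# Hypothetical three-row stand-in for the cars_missing data
csv_text <- "cubicinches,brand
350,US
NA,Europe
97,
"

# Default import: the blank brand is read as "" (not NA)
raw <- read.csv(text = csv_text)
n_default <- sum(is.na(raw))
print(n_default)   # 1

# Option 1: tell read.csv to treat empty strings as missing too
fixed <- read.csv(text = csv_text, na.strings = c("NA", ""))
n_fixed <- sum(is.na(fixed))
print(n_fixed)     # 2

# Option 2: convert blanks to NA after the data is already loaded
patched <- raw
patched$brand[patched$brand == ""] <- NA
n_patched <- sum(is.na(patched))
print(n_patched)   # 2
```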

  • @utkarshgupta2778 3 years ago +1

    really nice video . but sir i am getting error while loading data
    Error in function (type, msg, asError = TRUE) :
    error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version

  • @ahmed007Jaber 2 years ago

    Thank you for this. How do I get the date of the previous working day? The week starts on Sunday and ends on Thursday, from 1 Sep 2020 to date.

  • @marcofestu 4 years ago +1

    When is it better to replace NAs with mean? When with median?

    • @DataProfessor 4 years ago +2

      It is difficult to say which is better. Mean imputation preserves the mean (the mean of the completed data equals the mean of the observed values) but can shift the median, and vice versa for median imputation. It should also be mentioned that both mean and median imputation ignore the relationships with the other variables in the dataset. A better approach is regression imputation, which takes those relationships into account. Note that all of this applies to numeric missing values. For categorical values, mean/median imputation is not valid, so logistic regression is preferred for predicting the missing categories (in a similar fashion to regression imputation for numeric values).

    • @marcofestu 4 years ago +1

      @@DataProfessor thank you for your professional answer :)

    • @DataProfessor 4 years ago +2

      Marco Festugato You’re welcome Marco, I am already planning to make a video on regression and logistic imputation. Please stay tuned!
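
      The trade-off discussed in this thread can be checked on a toy vector. The sketch below uses made-up numbers, not data from the video: it compares mean and median imputation, then shows a simple regression imputation with lm(), where the predictor x and the response y are hypothetical.

```r
# Toy numeric vector with two missing values (made-up data)
x <- c(1, 2, 3, 10, NA, NA)

obs_mean   <- mean(x, na.rm = TRUE)    # mean of the observed values
obs_median <- median(x, na.rm = TRUE)  # median of the observed values

# Fill the gaps with the observed mean or the observed median
x_mean   <- ifelse(is.na(x), obs_mean, x)
x_median <- ifelse(is.na(x), obs_median, x)

# Mean imputation keeps the mean but moves the median
print(c(mean = mean(x_mean), median = median(x_mean)))      # 4.0, 3.5
# Median imputation keeps the median but moves the mean
print(c(mean = mean(x_median), median = median(x_median)))  # 3.5, 2.5

# Regression imputation: predict missing y from a related variable x
df <- data.frame(x = 1:6, y = c(2.1, 3.9, NA, 8.2, NA, 12.0))
fit  <- lm(y ~ x, data = df)   # rows with NA in y are dropped when fitting
miss <- is.na(df$y)
df$y[miss] <- predict(fit, newdata = df[miss, , drop = FALSE])
print(df)
```

      Unlike the mean and median fills, the regression fill gives each missing row a different value that depends on its x, which is the sense in which it uses the relationship with other variables.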

  • @dikshyasurvi6869 4 years ago +1

    I didn't get the colsums part. Where did it come from? Please reply fast, I have a project assignment.

    • @DataProfessor 4 years ago

      colSums() computes the sum of each column.

    • @dikshyasurvi6869 4 years ago

      It's not working in my R. What do I do?
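
      The thread does not include the error message, so the cause is a guess. One common reason is capitalization: R is case-sensitive and the base function is colSums() with a capital S, so colsums() fails with 'could not find function "colsums"'. A minimal sketch on a made-up data frame:

```r
# Made-up data frame with missing values in every column
df <- data.frame(
  a = c(1, NA, 3),
  b = c(NA, NA, 6),
  c = c("x", "y", NA)
)

# is.na(df) gives a TRUE/FALSE matrix; colSums() counts the TRUEs per column
na_per_column <- colSums(is.na(df))
print(na_per_column)   # a = 1, b = 2, c = 1

# Total number of missing values in the whole data frame
na_total <- sum(is.na(df))
print(na_total)        # 4
```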

  • @andrewm6935 3 years ago +1

    Brilliant video! Clear and simple explanations for those just starting out in this area :)
    Did you ever make videos on how to deal with ordinal/categorical missing data? I can't seem to find them on your channel?

  • @govindhasamyc6460 4 years ago +1

    Good and helpful

    • @DataProfessor 4 years ago

      Thanks Govindha for the kind words and motivation 😃

  • @puremetalcore 4 years ago +1

    thank you!

    • @DataProfessor 4 years ago

      You're welcome! Thank you for watching 😃

  • @kevinalejandro3121 3 years ago +1

    Which is better for data cleaning, Python or R?

    • @DataProfessor 3 years ago

      Both are equally good and capable for data cleaning.

  • @sahithisahithi5470 3 years ago

    HOW CAN I ATTACH DATA SET

  • @herymoel6922 3 years ago

    How do you handle missing values in categorical data?

  • @abhipsatripathy3934 4 years ago

    Hello Prof, I am unable to find data pre-processing series... Please upload videos on how to clean data when it has missing values in factors.

    • @DataProfessor 4 years ago

      Hi, for imputing categorical data I would recommend the mice R package. For more information, please see www.coursera.org/lecture/missing-data/mice-r-package-N6STE

    • @abhipsatripathy3934 4 years ago

      @@DataProfessor Okay, thank you. I'll refer to it and get back to you if any problem arises.

  • @traveltune9340 3 years ago

    Why introduce missing values into the data when it is already clean? @Data professor

  • @dreambeautydreamus 4 years ago +1

    Hi there I am your new friend
    It's very helpful
    Let's grow together..

  • @TarekFansa 3 years ago +1

    Very good explanation, and a very nice YouTuber!
    I am a Mathe Tutor on YouTube.
    Much love from Germany!

    • @DataProfessor 3 years ago

      Thanks! You have a great channel by the way.

    • @TarekFansa 3 years ago +1

      Thanks a lot for your kindness !