Hands-on Handling missing value using Mean Median mode with Python | Data Cleaning Tutorial 8

Atul Patel

Views: 12,091

Published on 8 Sep 2024
During data cleaning for machine learning, you will often need to check whether the dataset has missing values and, if so, decide how to deal with them. In this video, I demonstrate how to handle missing values statistically, using the mean, median, and mode. This video covers only the hands-on explanation in Python:
1. We impute missing data with the mean or median for a quantitative attribute and with the mode for a qualitative attribute.
2. Generalized imputation: we calculate the mean or median of all non-missing values of the variable, then replace each missing value with that mean or median.
3. Similar-case imputation: we calculate the mean of the non-missing values separately for each group defined by another variable, then replace each missing value with the mean of its group.
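The first two points of the list above can be sketched in pandas roughly as follows. This is a minimal illustration, not the code from the video's notebook: the DataFrame and its column names (age, salary, city) are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are illustrative assumptions.
df = pd.DataFrame({
    "age": [25.0, np.nan, 35.0, 40.0],
    "salary": [30000.0, 45000.0, np.nan, 60000.0],
    "city": ["Pune", "Delhi", np.nan, "Delhi"],
})

# Generalized imputation for quantitative columns: compute one statistic
# from all non-missing values, then fill every gap with it.
df["age"] = df["age"].fillna(df["age"].mean())
df["salary"] = df["salary"].fillna(df["salary"].median())

# Qualitative column: fill gaps with the most frequent value.
# mode() returns a Series, so [0] picks a single value from it.
df["city"] = df["city"].fillna(df["city"].mode()[0])

print(df)
```

The median is usually the safer choice when a column has outliers, since a few extreme values can pull the mean away from the typical case.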
Python Notebook : github.com/atu...

Comments • 13

@chiomaobiajulu4363 8 months ago
This is a very helpful video, I must admit. Nice work. I'd love to ask, though: what do we do with the NaN values we get after using the groupby function? I mean, how can we replace them with a reasonable value afterwards?
@sakinaaali2696 1 year ago
It really helped me. Thank you.
@phuocnguyenngoc2197 2 years ago
Thanks a lot.
@AtulPatelds 2 years ago
Thank you.
@sreeramsaravanan8132 3 years ago ⁺¹
How do we use group-by imputation if we don't have domain knowledge of that particular dataset?
@AtulPatelds 3 years ago ⁺¹
Hi, in reality data science is not as easy as it seems, or as the hype created by many bloggers and training institutes suggests. Real-world data is very messy. Without domain knowledge you can still build a model, but the probability that it will work in production is very low, because everything depends on how well you understand your data and how much good information you can extract from it. If you don't have domain knowledge, you can only use a trial-and-error strategy, which takes a lot of time and never leaves you sure that you are going in the right direction. I have seen many production deployments fail because of a lack of domain knowledge in data science projects. We spend about 70% of our time on the data and feature engineering, because creating a model is not too hard (even a fresher can do it); the main problem is the quality of the data we feed into the model. If we feed in scrap data, we get a scrap model, so I hope you understand the importance of domain knowledge.
It is always advisable to have a domain expert who can guide us during the feature engineering part if we are not strong in that domain. So if you lack domain knowledge, you can get help from a domain expert or a senior member of your team.
@terryterry3733 2 years ago
In similar-case imputation you took (10 + 15) / 2 = 12.5. Where does this 2 come from? Is it because you have only two values, 10 and 15?
@AtulPatelds 2 years ago
Correct.
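The arithmetic in this exchange can be reproduced with a small pandas sketch. The data and the column names (group, value) are hypothetical and chosen only so that one group holds the non-missing values 10 and 15; this is not the notebook from the video.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group "A" has two known values (10 and 15) and one gap.
df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B"],
    "value": [10.0, 15.0, np.nan, 20.0, 40.0],
})

# Mean of the non-missing values within each group, aligned row by row.
# NaN is skipped, so group "A" gives (10 + 15) / 2 = 12.5.
group_mean = df.groupby("group")["value"].transform("mean")

# Similar-case imputation: fill each gap with the mean of its own group.
df["value"] = df["value"].fillna(group_mean)

print(df)
```

The divisor is 2 because the mean counts only the non-missing values in the group, not the number of rows.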
@harshavardhan6368 1 year ago
Why are you doing this before the train-test split?
@terryterry3733 3 years ago
Hi, in section 7, why did you use 0 after mode, i.e. mode()[0]?
@AtulPatelds 3 years ago ⁺¹
Here we want the mode of a DataFrame column. For a column with several rows, mode() can return more than one value, because several values can tie for the highest frequency. That's why I used mode()[0] to select the first one; we put [0] at the end by default so that we always get a single value with the highest frequency.
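A short sketch of the behaviour described in this reply, using a made-up Series in which two values tie for the highest frequency:

```python
import pandas as pd

# Hypothetical column in which "red" and "blue" both appear twice.
s = pd.Series(["red", "blue", "red", "blue", "green"])

# mode() returns a Series containing every tied value, in sorted order.
modes = s.mode()

# [0] takes the first entry, giving a single value usable with fillna().
first_mode = s.mode()[0]

print(list(modes))
print(first_mode)
```

When there is a tie, [0] simply returns the first value in sorted order, so the choice among tied values is arbitrary rather than meaningful.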
@shashankathawale7002 3 years ago
In line no. 7, why did you write 0 before mode? Could you please tell us about it?
@AtulPatelds 3 years ago ⁺¹
Here we want the mode of a DataFrame column. For a column with several rows, mode() can return more than one value, because several values can tie for the highest frequency. That's why I used mode()[0] to select the first one; we put [0] at the end by default so that we always get a single value with the highest frequency.
