Advanced missing values imputation technique to supercharge your training data.
- Published Sep 3, 2023
- Get the most out of your data for machine learning by adopting this advanced data preprocessing trick.
verstack package documentation - verstack.readthedocs.io/en/la...
Absolutely love this library!
Thank you!
Welcome!
Nice Work man
Thanks 🔥
Hi there, this is an awesome approach to imputation. How would you go about validating it, though? It would be helpful to demonstrate that it's more accurate than methods like SimpleImputer or IterativeImputer.
I have benchmarked this approach against IterativeImputer along with all the statistical methods. Every time verstack.NaNImputer gave better results, especially compared to the statistical methods. And there's really no magic: a sophisticated model like lightgbm is the gold standard when it comes to tabular data.
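The core idea behind model-based imputation is simple: treat the column with NaNs as a prediction target, fit a model on the rows where it is observed, and predict the missing entries. A minimal sketch of that mechanic, using a plain least-squares linear model in place of lightgbm purely to keep it dependency-free (the `ml_impute_column` helper is my own illustration, not verstack's API, and it assumes the other columns have no NaNs):

```python
import numpy as np

def ml_impute_column(X, col):
    """Fill NaNs in X[:, col] by regressing it on the other columns.

    A least-squares linear model stands in for a gradient-boosted
    model here just to keep the sketch dependency-free. Assumes the
    remaining columns are fully observed.
    """
    X = X.copy()
    mask = np.isnan(X[:, col])
    features = np.delete(X, col, axis=1)
    # Fit on the rows where the target column is observed.
    A = np.column_stack([features[~mask], np.ones((~mask).sum())])
    coef, *_ = np.linalg.lstsq(A, X[~mask, col], rcond=None)
    # Predict the missing entries from the same features.
    B = np.column_stack([features[mask], np.ones(mask.sum())])
    X[mask, col] = B @ coef
    return X

# Toy data: column 1 is a linear function of column 0.
rng = np.random.default_rng(0)
x0 = rng.normal(size=200)
X = np.column_stack([x0, 2 * x0 + 1])
X[:10, 1] = np.nan
filled = ml_impute_column(X, col=1)
```

Because the toy relationship is exactly linear, the filled values recover `2 * x0 + 1`; on real tabular data a tree-based model captures nonlinear relationships the same way.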
Danil, thank you for sharing, interesting library. One idea: it would be best if next time we could compare, say:
1) mean imputation
2) dropping
3) ML
and then fit and predict any model on the data; at the end we can compare which imputation gives the minimum RMSE.
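The comparison proposed above can also be run directly on the imputed values: hide entries whose true values are known, impute them back, and score each strategy by RMSE against the hidden ground truth. A minimal sketch under those assumptions, again with a least-squares linear model standing in for a full ML imputer:

```python
import numpy as np

# Synthetic data where y is predictable from x.
rng = np.random.default_rng(42)
x = rng.normal(size=500)
y = 3 * x + rng.normal(scale=0.1, size=500)

hide = rng.random(500) < 0.2          # hide ~20% of y as "missing"
truth = y[hide]

# 1) mean imputation: fill with the observed mean.
mean_pred = np.full(hide.sum(), y[~hide].mean())

# 2) model-based imputation: fit y ~ x on the observed rows.
A = np.column_stack([x[~hide], np.ones((~hide).sum())])
coef, *_ = np.linalg.lstsq(A, y[~hide], rcond=None)
ml_pred = np.column_stack([x[hide], np.ones(hide.sum())]) @ coef

def rmse(pred):
    return np.sqrt(np.mean((pred - truth) ** 2))

print(f"mean RMSE: {rmse(mean_pred):.3f}  model RMSE: {rmse(ml_pred):.3f}")
```

On data like this, where the missing column correlates with the others, the model-based RMSE comes out far below the mean-imputation RMSE; when columns are independent the two converge, which is why the result is data-dependent.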
I've done such comparisons many times. It is very much dependent on the data, but on average ML-based missing-value imputation yields better results.
@@lifecrunch Yes, agree. That's why I'm writing: to show your viewers that your idea works better than simple imputation. You're giving them gold; it would be better if you gave the comparison at the end.
Agree, this would be a great illustration of the concept.
Is it possible to get a copy of the code to study, sir? Thanks in advance 👌👍
Unfortunately I didn't save the code from this video... You can code along; the script is not very complicated.
@@lifecrunch 👍
I'm learning Data Science, and most tutorials just use the mean value. This didn't make any sense to me. I was wondering how on earth their model works in the real world with all these wrong values that have been used during training. Now I see what pros do.
Yeah, the naive (mean) approach only works in a technical sense: it fills in the blanks so that models which can't handle NaN can train. But the volume of incorrectly filled missing values directly affects the model's generalization.
Great, but I am not the right audience. Too fast.
You’ll get there…