Feature selection in machine learning | Full course

Data Science with Marco

Views: 30,948

Published on 27 Dec 2024

Comments • 54

@tanyaalexander1460 8 months ago
I am a noob to data science and feature selection. Yours is the most succinct and clear lesson I have found... Thank you!
@ax5344 6 months ago ⁺⁴
I like the logic of this video. You showed the baseline, then three additional methods, then compared them at the end. Thanks a lot for sharing the technique. The feature/target matrix is also very helpful.
My question is about the principle or concept behind the filter method, RFE, and Boruta. Is it possible to do a video on them?
@abhinavreddy6451 8 months ago
Please do more data science-related content. It was very helpful. I searched everywhere for feature selection videos and finally landed on this one, and it was all I needed. The content is awesome, and so is the explanation!
@mauroSfigueira 7 months ago ⁺¹
Hugely informative and educational content. Many feature engineering videos are not that instructive.
@samuelliaw951 1 year ago ⁺²
Really great content! Learnt a lot. Thanks for your hard work!
@ednaldogomes6124 2 months ago
Congratulations and thanks for the excellent tutorial. You gained another subscriber👏👏👏
@pedro_tonom 5 months ago
Amazing video and excellent didactics. Congrats on the great quality, it helped me a lot!
@babakheydari9689 7 months ago
It was great! Thanks for sharing your knowledge. Hope to see more of you.
@NulliusInVerba8 4 months ago
This is extremely helpful and informative. Thanks a LOT!
@mandefrolegesse5748 6 months ago
Very interesting explanation and clear to understand. I was looking for this kind of tutorial. Subscribed👍
1 year ago ⁺¹
I am currently reading your book and it's amazing
@jmagames2766 1 year ago
What is the name of the book, please?
@oluwasegunodunlami7360 1 year ago ⁺¹
Wow, this video is really helpful, a lot of interesting methods were shown. Thanks a lot.
I'd like to ask you to make a future video covering how you perform feature engineering and model fine-tuning. 1:49
@paramvirsaini2806 1 year ago
Great explanation. Easy hands-on as well!!
@datasciencewithmarco 1 year ago
Thank you!
@marco_6145 9 months ago
Sensational video, thank you so much!
@claumynbega1670 1 year ago
Thanks for this valuable work. It helps me learn the subject.
@lecturesfromleeds614 6 months ago
Marco's the man!
@maythamsaeed533 1 year ago
Very helpful video and an easy way to explain the content. Thanks a lot.
@shwetabhat9981 1 year ago
Woah, much awaited 🎉. Thanks for all the effort you put in, sir. Looking forward to more such amazing content 🙂
@chiragsaraogi363 1 year ago ⁺¹
This is an incredibly helpful video. One thing I noticed is that all features are numerical. How do we approach feature selection with a mix of numerical and categorical features? Also, when we have categorical features, do we first convert them to numerical features or do feature selection first? A video on this would be really helpful. Thank you.
@haleematajoke4794 1 year ago
You will need to convert the categorical features into numerical format by using label encoding, which automatically converts them to numerical values, or custom mapping, where you can manually assign your preferred values to the features. I hope it helps.
@haleematajoke4794 1 year ago ⁺¹
You will have to do the conversion before feature selection because machine learning models only learn from numerical data.
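The order described in the two replies above (encode first, select second) can be sketched as a small pipeline. This is only an illustration under assumptions: the toy DataFrame, its column names, and the choice of `f_classif` with `k=2` are hypothetical and not from the video. scikit-learn's `OrdinalEncoder` stands in for "label encoding" here because it is the encoder intended for feature columns.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy data: one categorical and two numerical features.
X = pd.DataFrame({
    "color": ["red", "white", "red", "white", "red", "white", "white", "red"],
    "alcohol": [13.2, 11.5, 13.8, 11.0, 14.1, 10.9, 13.5, 11.2],
    "noise": [0.3, 0.1, 0.4, 0.2, 0.1, 0.3, 0.2, 0.4],
})
y = [1, 0, 1, 0, 1, 0, 1, 0]

# Step 1: encode the categorical column as integers and
# pass the numerical columns through unchanged.
encode = ColumnTransformer(
    [("cat", OrdinalEncoder(), ["color"])],
    remainder="passthrough",
)

# Step 2: run a filter-style selector on the now fully numerical matrix.
pipeline = Pipeline([
    ("encode", encode),
    ("select", SelectKBest(score_func=f_classif, k=2)),
])

X_selected = pipeline.fit_transform(X, y)
print(X_selected.shape)
```

Integer codes imply an order between categories, so for unordered categories a one-hot encoding may be the safer choice before selection.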
@tongji1984 10 months ago
Dear Marco, thank you. 😀
@scott6571 11 months ago
Thank you! It's helpful!
@datasciencewithmarco 11 months ago
Glad it helped!
@mohammadhegazy1285 3 months ago
Thank you very much for the video.
@edzme 4 months ago
This is great! About how long did it take to run Boruta on your dataset? Like, if I have 400 features and 1 million rows... would that be impossible?
@datasciencewithmarco 4 months ago
@edzme Possible, but it will take a while, for sure.
@DharmapuriSangeetha 18 days ago
Thank you for the clear insights. Can you list all your books?
@datasciencewithmarco 18 days ago
I only have one book for now: Time Series Forecasting in Python. I'm writing another one right now: Time Series Forecasting Using Foundation Models (cool stuff!!)
@berlinbrown03 4 months ago
Thanks, good review
@eladiomendez8226 10 months ago
Awesome video
@datasciencewithmarco 10 months ago
Thanks!
@alfathterry7215 1 year ago
In the variance threshold technique, if we use StandardScaler instead of MinMaxScaler, the variance would be the same for all variables... Does that mean we can eliminate this step and just use StandardScaler?
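The premise of this question can be checked directly. The sketch below assumes the wine dataset from scikit-learn (the thread does not name the dataset explicitly) and compares per-feature variances after each scaler. After standardization every non-constant feature has variance 1, so a variance filter can no longer tell features apart; that suggests the filter becomes uninformative under `StandardScaler`, rather than the scaler doing the filter's job.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Assumption: the wine dataset is used purely for illustration.
X, _ = load_wine(return_X_y=True)

# Per-feature variance after each scaling strategy.
var_standard = StandardScaler().fit_transform(X).var(axis=0)
var_minmax = MinMaxScaler().fit_transform(X).var(axis=0)

print("StandardScaler variances:", np.round(var_standard, 3))
print("MinMaxScaler variances:  ", np.round(var_minmax, 3))
```

With min-max scaling the variances still differ between features, which is what gives a variance threshold something to act on.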
@imadsaddik 9 months ago
Thank you for sharing
@TheSerbes 5 months ago
I want to build an LSTM for time series. What should I do for this? I think the situation is different for time series. Would I be wrong to use what you did? There is both trend and seasonality in the series.
@roba_trend 1 year ago
Interesting content, I love it very much 🥰
@cagataydemirbas7259 1 year ago ⁺¹
Hi, when I use RandomForest, DecisionTree, and XGBoost with RFE, they return completely different rankings, even though all of them are tree-based models. My dataset has 13 columns. With XGBoost, one feature has an importance rank of 1; the same feature is ranked 10 with DecisionTree and 7 with RandomForest. How can I trust which feature is better than the others in general? If a feature is more predictive than the others, shouldn't it have the same rank across all tree-based models? I am so confused about this. It's also the same with SequentialFeatureSelection.
@datasciencewithmarco 1 year ago
That's normal! Even though they are tree-based, they are not the same algorithm, so ranking will change. To decide on which is the best feature set, you simply have to predict on a test set and measure the performance to make a decision.
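The advice in this reply (let held-out performance decide between feature sets) could look roughly like the sketch below. It is a minimal illustration under assumptions: the wine dataset, the two estimators, `n_features_to_select=5`, the split, and the weighted F1 metric are all choices made here, not taken from the video.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: wine dataset and a 70/30 stratified split, for illustration only.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

estimators = {
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

results = {}
for name, estimator in estimators.items():
    # Each estimator produces its own ranking; rank 1 marks a selected feature.
    selector = RFE(estimator, n_features_to_select=5)
    selector.fit(X_train, y_train)

    # RFE.predict reduces X to the selected features and uses the fitted estimator.
    preds = selector.predict(X_test)
    results[name] = {
        "ranking": selector.ranking_,
        "test_f1": f1_score(y_test, preds, average="weighted"),
    }

for name, result in results.items():
    print(name, result["ranking"], round(result["test_f1"], 3))
```

The rankings will generally differ between the two models; the test score, not the ranking itself, is what the comparison rests on.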
@mrthwibble 11 months ago
Excellent video, however I'm preoccupied trying to figure out if having wine as a gas would make dinner parties better or worse. 🤔
@roba_trend 1 year ago ⁺¹
I tried searching your GitHub but couldn't find the data. Where is the data you worked on?
@datasciencewithmarco 1 year ago
The dataset comes from the scikit-learn library! We are not reading a CSV file. As long as you have scikit-learn installed, you can get the same dataset! That's what we do in cell 3 of the notebook and it's also on GitHub!
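Loading a bundled scikit-learn dataset needs no file download. The reply does not name the dataset, so the sketch below assumes it is the wine dataset exposed by `load_wine`; the notebook on GitHub remains the authoritative source for what cell 3 actually does.

```python
from sklearn.datasets import load_wine

# Assumption: the bundled wine dataset; as_frame=True returns pandas objects.
wine = load_wine(as_frame=True)
X = wine.data      # DataFrame of features
y = wine.target    # Series of class labels

print(X.shape)
print(y.value_counts())
```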
@deniz-gunay 25 days ago
Hi, thanks for the content 🎉 I have a dataset of 62 features and 40k rows and I'm using RFE, but RFE takes so much time. 90 minutes have passed and it is still running. Is there a problem? Is this normal? What do you think?
@datasciencewithmarco 25 days ago ⁺¹
It's normal, as by default, it will use all features and remove them one at a time. You can increase the value of "step" or set it to a float between 0 and 1 to express a percentage. That controls how many features are removed at each iteration and should speed things up. So, if you set "step" to 5, then it removes 5 features at every iteration. If you set it to 0.05, then it removes 5% of all features at each iteration.
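The two ways of setting `step` described in this reply can be sketched as follows. The synthetic data from `make_classification`, the logistic regression estimator, and the target of 10 features are assumptions chosen to keep the example fast; only the 62-feature count mirrors the question.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Assumption: synthetic data with 62 features, standing in for the real dataset.
X, y = make_classification(
    n_samples=300, n_features=62, n_informative=10, random_state=0
)

estimator = LogisticRegression(max_iter=1000)

# Integer step: remove 5 features at every iteration.
rfe_int = RFE(estimator, n_features_to_select=10, step=5).fit(X, y)

# Float step in (0, 1): remove that fraction of the total feature count
# (rounded down, at least 1) at every iteration; here 5% of 62 gives 3.
rfe_frac = RFE(estimator, n_features_to_select=10, step=0.05).fit(X, y)

print(rfe_int.support_.sum(), rfe_int.ranking_.max())
print(rfe_frac.support_.sum(), rfe_frac.ranking_.max())
```

A larger step means fewer model refits and a faster run, at the cost of a coarser ranking among the eliminated features.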
@deniz-gunay 25 days ago ⁺¹
@datasciencewithmarco Thanks, man! I have fixed it. I used mutual information to decide the number of features and RFE to select the best features.
@pooraniayswariya997 1 year ago
Can you teach how to do MRMR feature selection in ML?
@dorukucar7105 1 year ago
pretty helpful!
@nikhildoye9671 9 months ago
I thought feature selection is done before model training. Am I wrong?
@keerthana7353 9 months ago
Yes correct
@therevolution8611 1 year ago
Can you explain how to perform feature selection for a multilabel problem?
@BehindClosedDoorsBCD 10 months ago
You can convert the labels to numerical features by replacing them with numbers. If you have 3 labels in a feature, you could represent them with 0, 1, 2. There are different methods to use; a simple one is .replace({}).
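The dictionary-based mapping mentioned in this reply might look like the sketch below. The column name and label values are hypothetical. `Series.map` is used here in place of `.replace`; both accept a dictionary, and `map` is swapped in only as a matter of preference for a full one-to-one mapping.

```python
import pandas as pd

# Hypothetical feature with three text labels.
df = pd.DataFrame({"quality": ["low", "medium", "high", "medium", "low"]})

# Manually chosen integer code for each label.
mapping = {"low": 0, "medium": 1, "high": 2}
df["quality_encoded"] = df["quality"].map(mapping)

print(df)
```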
@nabeel_kaleel 6 months ago
Subscribed