Always Estimate Accuracy on Independent Test Set
- Published on Sep 7, 2024
- In this video, we discuss the problem of using the same data for training and testing in classification models. We demonstrate this issue and emphasize the importance of testing on fresh data.
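The problem described above can be reproduced in a few lines of plain Python. This is a minimal sketch, not the workflow shown in the video: the data are synthetic, the labels are coin flips (so there is nothing to learn), and the classifier is a hand-written 1-nearest-neighbor model chosen because it memorizes its training data. The names `nearest_label` and `accuracy` are hypothetical helpers introduced only for this illustration.

```python
import random


def nearest_label(train, point):
    """Return the label of the training point closest to `point` (1-NN)."""
    closest = min(
        train,
        key=lambda row: (row[0][0] - point[0]) ** 2 + (row[0][1] - point[1]) ** 2,
    )
    return closest[1]


def accuracy(train, data):
    """Fraction of rows in `data` whose label the 1-NN model predicts correctly."""
    hits = sum(nearest_label(train, x) == y for x, y in data)
    return hits / len(data)


rng = random.Random(0)

# 200 random 2-D points; each label is a coin flip, so features carry no signal.
data = [((rng.random(), rng.random()), rng.randint(0, 1)) for _ in range(200)]
train, test = data[:100], data[100:]

# Scoring on the training data: every point is its own nearest neighbor,
# so the model looks perfect even though it has learned nothing.
train_acc = accuracy(train, train)

# Scoring on fresh data: accuracy falls back to roughly chance level.
test_acc = accuracy(train, test)

print(f"accuracy on training data: {train_acc:.2f}")
print(f"accuracy on independent test data: {test_acc:.2f}")
```

The accuracy on the training data is perfect by construction, while the accuracy on the held-out half stays near 0.5, which is the honest estimate for labels assigned at random.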
This video is part of the Introduction to Data Science video series, which dives into machine learning, visual analytics, and the joys of interactive data analysis using Orange Data Mining software (orangedatamini...).
SUBSCRIBE to our channel: / orangedatamining
The development of this video series was supported by grants from the Slovenian Research Agency (including P2-0209, V2-2274, and L2-3170), Slovenia Ministry of Digital Transformation, European Union (including xAIM and ARISA) and Google.org/Tides foundation.
#machinelearning #orange #visualanalytics #datamining
__
Written by: Blaž Zupan (biolab.si/blaz)
Presented by: Noah Novšak
Production and edit: Lara Zupan
Intro/outro: Agnieszka Rovšnik
Music by: Damjan Jović - Dravlje Rec
Orange is developed by Biolab at University of Ljubljana (www.biolab.si)
AWESOME
Should I also use Data Sampler together with Test and Score? Or does Test and Score automatically split my dataset when I use the random sampling option?
That's a great question. Test and Score would automatically split only for cross validation. For testing on an independent data set, use Data Sampler.
@OrangeDataMining many thanks. Love your program. It's a great and easy start for learning data science.
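For readers following the thread, the two splitting strategies mentioned in the reply can be sketched in plain Python. This is an illustration only and makes no claim about how Orange implements Data Sampler or Test and Score: `holdout_split` and `k_fold_splits` are hypothetical helper names, and both operate on row indices rather than on a data table.

```python
import random


def holdout_split(n, test_fraction, seed=0):
    """Split row indices 0..n-1 once into a training part and an independent test part."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(round(n * (1 - test_fraction)))
    return idx[:cut], idx[cut:]


def k_fold_splits(n, k, seed=0):
    """Yield k (train, test) index pairs; every row lands in a test fold exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


# One fixed split: the model never sees the test rows during training.
train_idx, test_idx = holdout_split(10, test_fraction=0.3)
print("holdout:", sorted(train_idx), sorted(test_idx))

# Cross-validation: the split is repeated so that each row is tested once.
for fold_train, fold_test in k_fold_splits(10, k=5):
    print("fold:", sorted(fold_train), sorted(fold_test))
```

In both cases the training and test indices never overlap, which is the property the video asks for.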