- 37
- 131 444
Data Science with Marco
Canada
Joined Jan 28, 2014
A channel dedicated to teaching real-world data science skills! Learn the theory and apply it in real projects to build your portfolio and become a better data scientist!
Follow me on Medium for more hands-on data science articles: medium.com/@marcopeixeiro
Podcast - TimeGPT, predicting the future, and more
Links 🔗
Full episode available here: th-cam.com/video/TbMBXKuU8hU/w-d-xo.html
Master time series forecasting with my online course: www.datasciencewithmarco.com/offers/zTAs2hi6/checkout
I had a great talk with @JackRoycroftSherry about time series: how to predict them, what works, and what does not. Of course, we talked about TimeGPT and what it means for the field of forecasting.
We also digress into NLP, as a lot of parallels can be drawn between time series and natural language, but the models for one don't always work well for the other!
Views: 392
Videos
Anomaly detection in time series with Python | Data Science with Marco
35K views, 1 year ago
A hands-on lesson on detecting outliers in time series data using Python. Full source code: github.com/marcopeix/youtube_tutorials/blob/main/YT_02_anomaly_detection_time_series.ipynb Dataset can be found here: github.com/numenta/NAB/blob/master/data/realAWSCloudwatch/ec2_cpu_utilization_24ae8d.csv Labels can be found here: github.com/numenta/NAB/blob/master/labels/combined_labels.json Chapters:...
Feature selection in machine learning | Full course
27K views, 1 year ago
Full source code on GitHub: github.com/marcopeix/youtube_tutorials/blob/main/YT_01_feature_selection.ipynb Introduction - 0:00 Initial code setup - 2:19 Variance threshold - 11:04 Variance threshold (code) - 13:02 Filter method - 19:39 Filter method (code) - 21:27 RFE - 29:08 RFE (code) - 30:42 Boruta - 37:12 Boruta (code) - 41:21 Thank you - 46:35 A full course on feature selection in machine ...
Should you aim for data science or data engineering? | Data Science Q&A #1
249 views, 2 years ago
Weekly recap of the questions I answered about data science! Question 1: Why is SQL important when Python and R exist? (0:00) Question 2: How common is R in the industry compared to Python (0:38) Question 3: Should I aim for data science or data engineering? (1:35) Question 4: I have crossed the beginner stage of data science. How do I go deeper? (2:30) Question 5: Should I add certificates to ...
ARMA Model - Time Series Analysis in Python and TensorFlow
9K views, 3 years ago
Autoregressive Process - Applied Time Series Analysis in Python and TensorFlow
2.1K views, 3 years ago
Moving Average Process - Applied Time Series Analysis in Python and TensorFlow
1.9K views, 3 years ago
Random Walk Model - Applied Time Series Analysis in Python and TensorFlow
4.2K views, 3 years ago
Stationarity and Differencing - Applied Time Series Analysis in Python and TensorFlow
1.6K views, 3 years ago
Autocorrelation and White Noise - Applied Time Series Analysis in Python and TensorFlow
5K views, 3 years ago
Basic Statistics - Applied Time Series Analysis in Python and TensorFlow
1.4K views, 3 years ago
What Are Time Series - Applied Time Series Analysis in Python and TensorFlow
2.3K views, 3 years ago
Data Science Portfolio Project: Regression #2 | Data Science with Marco
2.1K views, 4 years ago
Data Science Portfolio Project: Regression #1 | Data Science with Marco
4.9K views, 4 years ago
Unsupervised Learning | PCA and Clustering | Data Science with Marco
11K views, 4 years ago
Support Vector Machine (SVM) in Python | Data Science with Marco
845 views, 4 years ago
Decision Trees | Data Science with Marco
653 views, 4 years ago
Resampling and Regularization | Data Science with Marco
1.2K views, 4 years ago
Classification in Python | logistic regression, LDA, QDA | Data Science With Marco
9K views, 4 years ago
Linear Regression in Python | Data Science with Marco
2.6K views, 4 years ago
Thank you very much for the video
This is great! About how long did it take to run Boruta on your dataset? If I have 400 features and 1 million rows, would that be impossible?
@edzme Possible, but it will take a while, for sure.
Thanks, good review
This is great stuff. It really builds up the perspective. Can we continue with this kind of competitive problem, so as to build a solid foundation for solving Kaggle projects?
This is extremely helpful and informative. Thanks a LOT!
I will be making an hourly passenger count forecast using LSTM time series model with 6-7 parameters. Can I choose the parameters as you did here?
I want to make LSTM time series, what should I do for this? I think the situation is different for time series. Would I be wrong if I use what you did? There is both trend and seasonality in the series.
Amazing video and excellent teaching. Congrats on the great quality, it helped me a lot!
Seriously this channel is amazing, you deserve so many more subscribers man!
@purecheese9012 Thanks for the kind words! Appreciate it!
Wait, just realized you are such a small YouTuber. Thought you would have at least 200,000 subscribers with videos of this quality. You explain everything in depth and in a very understandable way, with very helpful and educational videos!
@exstream_play9144 I wish haha!
Marco's the man!
Very interesting explanation and clear to understand. I was looking for this kind of tutorial. Subscribed👍
I like the logic of this video. You showed the baseline, then three additional methods, then compared them at the end. Thanks a lot for sharing the technique. The feature/target matrix is also very helpful. My question is about the principle or concept behind the filter method, RFE, and Boruta. Is it possible to do a video on them?
subscribed
Hugely informative and educational content. Many feature engineering videos are not that instructive.
Hello! Quick question: why is the threshold 3.5? Is there a reason for that value?
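On the 3.5 question: the comment does not say which score the threshold applies to, so this is only an assumption. If it refers to the robust (modified) z-score built on the median absolute deviation (MAD), then 3.5 is the cutoff commonly recommended by Iglewicz and Hoaglin for that score. A minimal sketch in plain Python, where the function name and the sample readings are made up for illustration:

```python
from statistics import median


def modified_z_scores(values):
    """Return 0.6745 * (x - median) / MAD for each value.

    The constant 0.6745 rescales the MAD so the score is comparable
    to an ordinary z-score when the data are roughly normal.
    """
    values = list(values)
    med = median(values)
    mad = median(abs(x - med) for x in values)
    if mad == 0:
        raise ValueError("MAD is zero; the modified z-score is undefined")
    return [0.6745 * (x - med) / mad for x in values]


# Hypothetical readings with one obvious spike (not data from the video)
readings = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 50.0]
scores = modified_z_scores(readings)

# Flag points whose absolute modified z-score exceeds the 3.5 cutoff
anomalies = [x for x, z in zip(readings, scores) if abs(z) > 3.5]
print(anomalies)
```

Because both the median and the MAD are barely affected by a single extreme value, the spike does not inflate the scale it is measured against, which is the usual argument for this score over a mean-and-standard-deviation z-score.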
It was great! Thanks for sharing your knowledge. Hope to see more of you.
Do you have LinkedIn? Could I follow you? : )
Please do more Data science-related content, It was very helpful I searched everywhere for feature selection videos and finally landed on this video and this was all I needed, the content is awesome and the explanation is as well!
I am a noob to data science and feature selection. Yours is the most succinct and clear lesson I have found... Thank you!
I thought feature selection is done before model training. Am I wrong?
Yes correct
Thank you so much, you are my life saver !!
how about random cut forest ?
Very interesting content, thank you!
Thank you for sharing
Sensational video, thank you so much!
nice and clear
Dear Marco Thank you.😀
Hi! Do you recommend any video for pattern-wise anomaly detection?
I don't know any, but you can look at the library TOAD for anomaly detection in time series. They do pattern-wise detection, if I remember correctly.
Hello Marco, thank you so much for such a great video. Can you please make a video on anomaly detection for time series data using PyCaret?
Awesome video
Thanks!
Excellent video, however I'm preoccupied trying to figure out if having wine as a gas would make dinner parties better or worse. 🤔
Thank you! It's helpful!
Glad it helped!
Thanks for sharing, Marco!
🎉 thank you a lot
Hi Marco!! Thank you so much for making great videos on "Anomaly detection". Great Great work! Please keep sharing! 🙏🙏🙏🙏
In the variance threshold technique, if we use a standard scaler instead of a min-max scaler, the variance would be the same for all variables. Does that mean we can eliminate this step and just use the standard scaler?
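The premise of this question can be checked numerically. The sketch below uses hand-written scalers in plain Python rather than any particular library, and the two feature columns are invented for illustration. It shows that standardization forces every non-constant feature to unit variance, while min-max scaling leaves the variances different:

```python
from statistics import mean, pstdev, pvariance


def min_max_scale(col):
    """Rescale a column to the [0, 1] range."""
    lo, hi = min(col), max(col)
    return [(x - lo) / (hi - lo) for x in col]


def standard_scale(col):
    """Center a column to mean 0 and scale it to population std 1."""
    mu, sigma = mean(col), pstdev(col)
    return [(x - mu) / sigma for x in col]


# Hypothetical features: one varies a lot, one is almost constant
features = {
    "spread_out": [0.0, 10.0] * 5,
    "nearly_constant": [1.0] * 9 + [2.0],
}

variances = {}
for name, col in features.items():
    variances[name] = {
        "min_max": pvariance(min_max_scale(col)),
        "standard": pvariance(standard_scale(col)),
    }
    print(name, variances[name])
```

Under these assumptions, a variance threshold applied after standardization cannot tell the two features apart, because both end up with variance 1. That suggests standard scaling makes the variance filter uninformative rather than replacing it: the nearly constant feature is still in the data.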
Wow, this video is really helpful, a lot of interesting methods were shown. Thanks a lot. I'd like to ask you to make a future video covering how you perform feature engineering and model fine-tuning 1:49
pretty helpful!
Thank you very much for your work!
Very helpful video and an easy way to explain the content. Thanks a lot
Anomaly detection is unsupervised, so how did you know whether a point is an anomaly or not, even before training the model?
The dataset is labeled. That way, we can measure the performance of each anomaly detection method.
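As a rough illustration of how labels make that measurement possible, here is a sketch that compares a detector's flags against ground-truth labels using precision and recall. The choice of metrics, the function name, and the label and prediction lists are assumptions for illustration, not values from the video:

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall, treating 1 as 'anomaly'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed anomalies
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical ground-truth labels and detector output (1 = anomaly)
labels = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
predictions = [0, 0, 1, 0, 1, 0, 0, 0, 0, 0]

precision, recall = precision_recall(labels, predictions)
print(precision, recall)
```

Accuracy alone tends to be misleading here, since anomalies are rare and a detector that never flags anything would still score highly on it.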
We got a few positive labels in cross validation
Very clear, in a very short video!
Thanks for this valuable work. It helps me learn the subject.
Really great content! Learnt a lot. Thanks for your hard work!
This is an incredibly helpful video. One thing I noticed is that all features are numerical. How do we approach feature selection with a mix of numerical and categorical features? Also, when we have categorical features, do we first convert them to numerical features or first do feature selection. A video on this would be really helpful. Thank you
You will need to convert the categorical features into numerical format, either with label encoding, which automatically converts them to numerical values, or with a custom mapping, where you manually assign your preferred values to the categories. I hope it helps
You will have to do the conversion before feature selection because machine learning models only learn from numerical data
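The two options mentioned in these replies can be sketched in plain Python. The helper name and the example columns are hypothetical. Sorting the categories before numbering them is a choice made here, similar in spirit to how scikit-learn's `LabelEncoder` orders its classes:

```python
def label_encode(values):
    """Map each distinct category to an integer, in sorted order."""
    categories = sorted(set(values))
    mapping = {cat: i for i, cat in enumerate(categories)}
    return [mapping[v] for v in values], mapping


# Option 1: automatic label encoding for a feature with no natural order
colors = ["red", "green", "blue", "green"]
encoded_colors, color_mapping = label_encode(colors)
print(encoded_colors, color_mapping)

# Option 2: custom mapping, useful when the categories are ordered
size_order = {"small": 0, "medium": 1, "large": 2}
sizes = ["medium", "small", "large", "small"]
encoded_sizes = [size_order[s] for s in sizes]
print(encoded_sizes)
```

Note that integer codes imply an ordering. That suits a feature like size, but for unordered categories such as colors, one-hot encoding is a common alternative before running a selection method.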
Hi Marco! I'm working on a project and this has a lot of components I need. I noticed the specification of the data said that it was being recorded every 5 minutes, could you create a tutorial on how to retrieve a stream of live data and pass it to the algorithm in a somewhat real-time fashion? I hope this is similar to what I understood from your data collection in the video
Hi, I wanted to work on the same thing. Did you find anything?
Great explanation. Easy hands-on as well!!
Thank you!
Hey! Is it possible to identify and flag anomalies within a continuous numerical attribute?
If by continuous, you mean at a very high frequency, then yes, I don't see why not!
@datasciencewithmarco Thanks! If possible, can you make a video on that? It would be really helpful!