Great job Ritvik. Seriously, you explain data science concepts EXTREMELY well.
Thanks a ton Gary!
@ritvikmath you helped me understand a journal paper that I was reading. The Python coding is so helpful for me as a beginner in this domain. I can now analyse my data quickly. You are a superb teacher. I do appreciate your work.
You are the only Indian lecturer and the only econometrician I can understand with my bad English and my bad math background.
Do you know how many papers I have read to understand those concepts and I couldn't, and here you come and make me not only understand them but be able to apply them in my thesis. Thanks a lot.
After watching your videos it makes more sense that data science is largely about understanding the math, and only after that do we need to focus on the coding part... awesome introduction. Thanks a lot
Bro, you are not only a great data scientist, but also a great teacher.
Thank you!! I don't know why professors at highly ranked universities cannot teach us like this. Hats off!!
In India, we respect our teachers and elders by touching their feet and asking for their blessings. I feel like giving the same respect to you. Love from India ❤️
bro, your explanations are really smooth and easy to understand......
Man, I'm learning time series analysis and forecasting and you're helping me a lot !!! Thanks !!!!
Great to hear!
Hey King, you dropped this 👑
A simple explanation to understand AIC and BIC indeed. Thanks for that, Ritvik!
Can you please make a similar video that gives a feel for:
1. the log-likelihood.
2. the significance of each evaluation parameter in different time series models.
thanks! please check out my max likelihood video here:
th-cam.com/video/VOIhswqFWVc/w-d-xo.html
I love you for teaching it so simply
Thank you man, I was waiting for this video. If you can make another video which explains AIC & BIC in detail, that would be extremely helpful.
Noted! Thanks :)
underrated af! thanks man
this video couldn't have been posted at a better moment. Currently writing my thesis on Uber travel times modelling and can't figure out which model to select. pls marry me no homo.
glad to help :)
Nice job getting to the point!
Glad it was helpful!
My professor at NYU co-wrote a paper on developing the AICc (corrected AIC). Of course Bayes has better name recognition in stats. Great video as always!
What a nice explanation! Was looking for that for so long.
Great explanation
Thank you, great stuff!
Fantastic!
Great video! Very helpful and thorough explanation.
For BIC, why do we want a lower number of samples? Conceptually, I thought more data points make a better model.
Can’t wait to catch up on rest of your videos I have not seen yet
Cheers!!
Yes, we want more samples in general, but they have to improve the fit (increase the log-likelihood) by more than the log of the number of samples adds to the penalty in order to be better by BIC. As mentioned in the video, if two models reach the same log-likelihood, the one trained on 1k samples is better than the one trained on 1M samples.
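To put numbers on the reply above, here is a minimal sketch using the usual definition BIC = k ln(n) - 2 ln(L). The log-likelihood and parameter count are made-up values for illustration only, and comparing across different sample sizes is a thought experiment; in practice BIC compares models fit to the same data.

```python
import math


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


# Hypothetical values: identical log-likelihood and parameter count,
# only the number of samples differs.
same_log_likelihood = -500.0
bic_small = bic(same_log_likelihood, n_params=4, n_samples=1_000)
bic_large = bic(same_log_likelihood, n_params=4, n_samples=1_000_000)

print(f"n = 1k: BIC = {bic_small:.2f}")
print(f"n = 1M: BIC = {bic_large:.2f}")
```

The gap between the two values is exactly k times the difference in ln(n), so the extra samples only pay off if they raise the log-likelihood by more than half of that.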
Thank you so much Ritvik. Out of curiosity: as it pertains to time series, will you be covering Brownian motion and jump diffusions in future videos? Regardless - love your content!
Thanks! And I will definitely look into those topics
Again! Amazing!
Great explanation.
Glad it was helpful!
Thank you very much!
You're welcome!
Thank you!
Great video. When should we use adjusted R-squared, AIC and BIC?
Hey, Ritvik! Thank you for the great content! Could you, please, make a video on State Space Models?
Great!
One question, though: what should I interpret from the AIC if I apply it to just my stationary time series (i.e. the 1st difference of the original time series)? That is, with no AR, MA or other models yet applied to it? Would it make any sense?
good question. AIC/BIC are metrics we use on *models* not on raw data itself. So use these metrics if you are trying to decide between many models on the same set of data.
Ritvik, since all models used the exact same amount of data points in the sample, the model with the lowest AIC would also be the model with the lowest BIC, correct? Does that mean that only the AIC is relevant when all models have the same amount of sample data?
I have the same question.
Can there be a case where AIC and BIC pick different models? For instance, AR(6) gives the lowest AIC and AR(10) gives the lowest BIC. In this case, which model should be chosen and why?
I'd be so glad if ritvikmath answers this question. This is exactly what I was thinking about.
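On the two questions above: even with the same number of samples, AIC and BIC can disagree, because AIC charges 2 per parameter while BIC charges ln(n) per parameter. A small sketch with made-up log-likelihoods and parameter counts (all numbers here are hypothetical, not taken from the video):

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_samples) - 2 * log_likelihood


n_samples = 500  # same data set for both candidates

# Hypothetical fits: name -> (log-likelihood, number of parameters)
candidates = {
    "AR(6)": (-700.0, 6),
    "AR(10)": (-694.0, 10),
}

best_by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_by_bic = min(candidates, key=lambda m: bic(*candidates[m], n_samples))

for name, (ll, k) in candidates.items():
    print(f"{name}: AIC = {aic(ll, k):.2f}, BIC = {bic(ll, k, n_samples):.2f}")
print("lowest AIC:", best_by_aic)
print("lowest BIC:", best_by_bic)
```

Whenever ln(n) > 2 (roughly n of 8 or more), BIC penalizes each parameter more heavily, so if the two criteria disagree it is BIC that picks the smaller model. Which one to follow depends on the goal: BIC leans toward parsimony, AIC toward fit.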
Hi! Very useful video.
How about making a video about this log-likelihood?
It does not seem very intuitive and there is not much material about this topic out there.
Thanks!
Great suggestion! I've added it to my list :)
Thanks a lot. So I have repeated measures and my models are nested. Would you then recommend BIC? I appreciate your help :)
Why is AIC 2K-2L and not just K-L?
epic
Hi there @ritvikmath, I want to confirm one thing about the AIC formula: isn't "AIC = 2K - 2ln(L)" the correct one instead of AIC = 2K - 2L?
THANK YOU.
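For reference on the two formula questions above, these are the standard definitions, assuming the L written in the video stands for the maximized log-likelihood (which is how the replies in this thread use it). Here $k$ is the number of estimated parameters, $n$ the number of samples and $\hat{L}$ the maximized likelihood:

```latex
\begin{align*}
\mathrm{AIC} &= 2k - 2\ln\hat{L} \\
\mathrm{BIC} &= k\ln n - 2\ln\hat{L}
\end{align*}
% With \ell = \ln\hat{L}, AIC reads 2k - 2\ell, i.e. "2K - 2L" when L denotes the log-likelihood.
% Dividing by 2 gives k - \ell, which ranks models identically;
% the factor 2 is a convention that matches the -2 \ln\hat{L} (deviance) scale.
```

So "2K - 2L" and "2K - 2ln(L)" agree once L is read as the log-likelihood in the first and as the likelihood in the second.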
Sir, please explain the assumptions and limitations of ARCH and GARCH models.
If the purpose of the modelling is to perform predictions, is it not better to evaluate the model on its ability to make predictions (i.e. with sliding-window k-fold cross-validation), rather than appraising the model on its fit to data that it has already seen?
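A rough sketch of what the question above describes: walk-forward (expanding-window) evaluation, where each forecast only sees the past. The two forecasters and the synthetic series are hypothetical stand-ins, not the models from the video; ordinary shuffled k-fold is avoided because it would let future points leak into training.

```python
import random


def walk_forward_mse(series, forecaster, min_train=20):
    """Mean squared one-step-ahead error over an expanding training window."""
    squared_errors = []
    for t in range(min_train, len(series)):
        prediction = forecaster(series[:t])  # the forecaster sees only the past
        squared_errors.append((series[t] - prediction) ** 2)
    return sum(squared_errors) / len(squared_errors)


def last_value(history):
    """Naive forecast: repeat the most recent observation."""
    return history[-1]


def historical_mean(history):
    """Naive forecast: the mean of everything seen so far."""
    return sum(history) / len(history)


# Synthetic persistent series, used only so the example runs on its own.
random.seed(0)
series, level = [], 0.0
for _ in range(200):
    level = 0.9 * level + random.gauss(0, 1)
    series.append(level)

scores = {
    "last value": walk_forward_mse(series, last_value),
    "historical mean": walk_forward_mse(series, historical_mean),
}
for name, mse in scores.items():
    print(f"{name}: out-of-sample MSE = {mse:.3f}")
```

This kind of check complements AIC/BIC rather than replacing them: the information criteria are cheap to compute from a single fit, while walk-forward evaluation requires refitting but measures forecasting error directly.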
Please do a more theoretical video about log likelihood
Naice!!
could you do some of these in Rstudio?
Why is AR(10) chosen when AR(7) and AR(8) had more significant lags?
Please share the book you used to make these videos.
What do AIC and BIC stand for?
If the goal is to obtain the lowest AIC, why not write the formula as k/l? It would give a similar relationship.
How can you say the data is stationary when the p-value is zero? When the p-value is less than the 5% level (95% confidence), we reject the null hypothesis, so the data is not stationary.
Your null hypothesis is that the data is non-stationary, so if the p-value is less than 0.05 (5%), the null hypothesis is rejected and the data is treated as stationary. Here the p-value is (effectively) zero, which is below 5%, so the data is stationary.
NotImplementedError:
statsmodels.tsa.arima_model.ARMA and statsmodels.tsa.arima_model.ARIMA have
been removed in favor of statsmodels.tsa.arima.model.ARIMA (note the .
between arima and model) and statsmodels.tsa.SARIMAX.
statsmodels.tsa.arima.model.ARIMA makes use of the statespace framework and
is both well tested and maintained. It also offers alternative specialized
parameter estimators. it says
Fantastic!