I cannot even express how grateful I am for these videos.. they're so clear! Amazing job
Better than my professor's 2-hour lecture lol
Why is this channel so underrated?
Your videos have concept clarity far better than many prominent online study websites.
So far the best TS tutorials I have watched. Please keep it coming :)
Best LR vids on YouTube, and that's saying a lot because there are many! Thank You :)
Wow, thanks! Glad you liked them!
The way you explain things is amazing! Thank you for these videos!
I wish my lectures would look like that! Thanks a lot
What a wonderful video. Much better explanation than I could have managed! Will definitely be recommending this series.
Your videos break down concepts in such a meaningful way! I hope you keep posting more!
You might be the only nerd with a good sense of humour. also thank you for the explanation
THAT WAS A GREAT EXPLANATION OF THE THEORY IN FEW MINUTES. THANKS ❤️❤️❤️❤️
again i am very very grateful to you , delivering so great content in such a short time , hats off
Best statistics video ever
Really thank you for the video 🙏 simple and best explanation about stationary so far. It really helped me getting started for forecasting.
Glad it was helpful!
The Best Tutorial!!!
You are a brilliant teacher.
Great video, helped me understand the stationarity concept!
Awesome explanation
At last, understood. Thanks Sir
Quick & clear ! thank you for the explanation.
Thank you!
Great stuff, keep it up!
Thank you! Glad you liked it!
Thanks for the video, it's just the way I like!
Thank you! Glad you liked it!
Great explanation, thank you!
loved it!!
This video is amazing
Excellent Video!
Thank you! Glad you liked it!
Bro God bless you!
Really amazing
Nice video, helped clear my concept.
Glad it helped!
very easily explained by you sir...thanks ... but the variance stationarity part is not explained...
Awesome :)
Love it, thanks
Amazing explanation!
Thank you!
Wow awesome
Same means exactly equal or similar as we put in hypothesis testing i.e. statistically significant?
Statistically!
A severe misconception: ARCH / GARCH models are not used to model a change in the unconditional variance and are therefore not used for non-stationary series. Short volatility clusters as shown in your example series do not violate the idea of (weak) stationarity. Such volatility clusters are caused by a changing conditional variance, which can be modelled using ARCH (autoregressive conditional heteroskedasticity) and GARCH (generalized ARCH) models. Look it up. There are even commonly known conditions for the stationarity of ARCH / GARCH processes.
Think about it like this: when we consider stationarity in the mean, we do not expect a time series to follow a straight line at a constant value. No, it is fine that it diverges from the overall mean before it returns to it within a short amount of time. Similarly, for stationarity in the variance, it is fine if the variation in the series diverges for a couple of observations, if the level of variation then returns to the overall level of variation.
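To make the conditional versus unconditional distinction concrete, here is a minimal sketch that simulates a GARCH(1,1) process using only the Python standard library. The parameter values (`omega`, `alpha`, `beta`) and the function name `simulate_garch11` are made up for illustration; the only thing that matters is that `alpha + beta < 1`, which is the usual stationarity condition for this model.

```python
import random
import statistics


def simulate_garch11(n, omega=0.1, alpha=0.1, beta=0.8, seed=0):
    """Simulate a GARCH(1,1) series (illustrative parameters, alpha + beta < 1)."""
    rng = random.Random(seed)
    # Long-run (unconditional) variance implied by the parameters.
    uncond_var = omega / (1.0 - alpha - beta)
    sigma2 = uncond_var  # start the conditional variance at its long-run level
    returns, cond_vars = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0) * sigma2 ** 0.5
        returns.append(eps)
        cond_vars.append(sigma2)
        # The conditional variance reacts to the latest shock, then decays back.
        sigma2 = omega + alpha * eps ** 2 + beta * sigma2
    return returns, cond_vars, uncond_var


returns, cond_vars, uncond_var = simulate_garch11(50_000)
sample_var = statistics.pvariance(returns)

print(f"unconditional variance (theory): {uncond_var:.3f}")
print(f"sample variance of the series:   {sample_var:.3f}")
print(f"conditional variance range:      {min(cond_vars):.3f} to {max(cond_vars):.3f}")
```

The conditional variance moves around a lot (the volatility clusters), while the overall sample variance stays close to the single long-run value, which is what weak stationarity asks for.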
haha love it!
just super !
Amen
What determines the window size you use? Is it a standard number of timesteps? For example, you could choose a window for the seasonal data that would make it stationary.
In theory, stationarity should hold for a window of any size.
I would push back on the idea that you could select a window to make seasonal data stationary. Even if you picked a window exactly one season long, you would lose stationarity the moment you moved that window one time period into the future, because it no longer lines up with the season. The window is not every 12th time period; it is a window of 12 time periods starting from every time point.
@AricLaBarr That makes sense. Thank you for this excellent series.
I was thinking of the seasonal data like a sine function with the window as one period. Shifting the window in time would be like a phase shift, which would maintain the same "distribution".
What do u mean by location in time??
Literally where you are on the x-axis, which is time itself!
Just some clarifications. When you say "model the lack of consistency in variance", do you mean model the variance in a consistent way?
When you say they are Lazy, do you mean they are using a method that has statistically incorrect properties for the sake of simplicity?
Happy to help! I mean that there are models that actually model the variance, especially when it changes over time. The simpler methods aren't statistically incorrect for the mean; they still do everything they need to predict the means (averages) well.
you can pat your back sir
I would like you to give some real-life examples of stationarity to clarify the topic. I'm still confused about what stationarity is.
Stationary data is (mostly) data that doesn't trend or have seasonality. Think of something like the year over year percentage change in population for a country. Hope this helps!
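As a rough illustration of the population example, the sketch below turns a trending series of levels into year-over-year percentage changes. The population figures are hypothetical numbers invented for this example, not real data.

```python
# Hypothetical population levels (in millions), invented for illustration.
population = [100.0, 101.1, 102.0, 103.1, 104.1, 105.2, 106.2]

# Year-over-year percentage change between consecutive years.
pct_change = [100.0 * (b - a) / a for a, b in zip(population, population[1:])]

print("levels:     ", population)
print("yoy change: ", [round(p, 2) for p in pct_change])
```

The levels keep climbing, so they have no fixed mean to hover around, while the percentage changes stay in a narrow band around roughly one percent.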
:)
Hello, thank you very much for the great video! Could you please help me get the datasets used in this presentation? Thanks🙂
Most of the datasets are ones I created myself to get the right pattern for the slides!
so clear
Amazing
Thank you!
Dear Prof. Aric; won't differencing result in totally unrelated new values?
That is actually the fun part - it depends! Yes, the original correlations you saw in your data will most likely be different. However, those correlations were probably distorted by the trend and seasonality in a way that makes ARIMA models not work well, since in the long run those models always revert to a constant mean. So in a way, differencing reveals more of the correlations in your data that are actually modellable (by ARIMA standards)!
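A small sketch of what differencing does to a trending series, assuming a simple linear trend plus white noise (the slope, noise level, and series length are arbitrary choices for illustration):

```python
import random
import statistics

rng = random.Random(1)
slope = 0.5

# Linear trend plus white noise: the mean of the level depends on time.
series = [slope * t + rng.gauss(0.0, 1.0) for t in range(2000)]

# First difference: the change from one time point to the next.
diffed = [b - a for a, b in zip(series, series[1:])]

half = len(series) // 2
level_means = (statistics.fmean(series[:half]), statistics.fmean(series[half:]))
diff_means = (statistics.fmean(diffed[:half]), statistics.fmean(diffed[half:]))

print(f"mean of levels, first vs second half:      {level_means[0]:.2f} vs {level_means[1]:.2f}")
print(f"mean of differences, first vs second half: {diff_means[0]:.3f} vs {diff_means[1]:.3f}")
```

The level means differ wildly between the two halves, while the differenced series hovers around the slope in both halves, which is the constant mean an ARIMA model can work with.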
what is the distribution in time series analysis ?
What distribution are you looking for? Distribution of residuals from a model? Distribution of the statistical tests? There are many distributions :-)
How can I do the analysis??
There are a lot of great options in open source software like Python or R!
I double differenced and got constant variance. Please explain.
That is probably a result of over-differencing! If you take too many differences you could introduce even more problems into your data. You should only difference if you have a trend, seasonality, or a unit root.
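One way to see the kind of problem over-differencing can introduce is to difference a series that needed no differencing at all. The sketch below assumes plain white noise with unit variance; the helper names `diff` and `lag1_autocorr` are made up for this example.

```python
import random
import statistics


def diff(xs):
    """First difference of a sequence."""
    return [b - a for a, b in zip(xs, xs[1:])]


def lag1_autocorr(xs):
    """Sample autocorrelation at lag 1."""
    m = statistics.fmean(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((x - m) ** 2 for x in xs)
    return num / den


rng = random.Random(2)
noise = [rng.gauss(0.0, 1.0) for _ in range(20_000)]  # already stationary
once = diff(noise)
twice = diff(once)

for name, xs in [("original", noise), ("differenced once", once), ("differenced twice", twice)]:
    print(f"{name:18s} variance={statistics.pvariance(xs):.2f}  lag-1 autocorr={lag1_autocorr(xs):.2f}")
```

For white noise the theory says each unnecessary difference inflates the variance (from 1 to 2 to 6 here) and creates negative lag-1 autocorrelation (about -0.5, then about -0.67) that was never in the original data.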
I'm still having a problem understanding stationary and non-stationary 😢
The main difference between them is whether you think the process hovers around a specific value. That is mean stationarity. It never gets too far away above or below a specific value.
Are you sure that seasonality makes a variable non-stationary? It doesn't feel right to me.
A lot of people have trouble seeing how seasonal data is non-stationary so you are not alone!
Think about it this way. A stationary average means that at any point in time, the series is expected to sit at (and actually reverts to over the long run) the same average. This is never the case for seasonal data. Seasonal data only crosses the overall mean at specific points in the season, not at ANY point in the season. Because of the seasonal wave, the expected value depends on where you are in the season, so it cannot be the same at every point in the series.
Hope this helps!
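Here is a minimal sketch of that idea, assuming a made-up seasonal series: a fixed sine wave with a period of 12 plus white noise. The level, amplitude, and noise values are arbitrary.

```python
import math
import random
import statistics

rng = random.Random(3)
period = 12
n_seasons = 500

# Seasonal series: a fixed wave around 10 plus noise (all values illustrative).
series = [
    10.0 + 5.0 * math.sin(2 * math.pi * (t % period) / period) + rng.gauss(0.0, 1.0)
    for t in range(period * n_seasons)
]

overall_mean = statistics.fmean(series)
# Average of the series at each position within the season (0..11).
by_position = [statistics.fmean(series[k::period]) for k in range(period)]

print(f"overall mean: {overall_mean:.2f}")
for k, m in enumerate(by_position):
    print(f"position {k:2d} in season: mean {m:.2f}")
```

The average at each position in the season is different, so there is no single mean that applies at every time point, even though the overall average looks perfectly stable.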
@AricLaBarr Thanks! That was an excellent explanation! 🙌
Why can't people in these or any other lectures explain why stationary data is needed in the first place? They all talk about having stationary data, but why should we have it?
It is because of the structure of the models we are using. ARIMA models rely on stationarity because they rely on means reverting. Without stationarity, ARIMA models will have horrible forecasts because they mathematically revert to the mean whether your data does or not.
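The mean reversion is easy to see in the simplest stationary case, an AR(1) model, where the h-step-ahead forecast is mean + phi^h * (last value - mean). The sketch below uses that formula directly; the function name `ar1_forecast` and the numbers passed to it are made up for illustration.

```python
def ar1_forecast(last_value, mean, phi, horizon):
    """Forecasts from a stationary AR(1): mean + phi**h * (last_value - mean)."""
    return [mean + phi ** h * (last_value - mean) for h in range(1, horizon + 1)]


# Illustrative values: the last observation sits well above the long-run mean.
forecasts = ar1_forecast(last_value=20.0, mean=10.0, phi=0.8, horizon=30)

for h in (1, 5, 10, 30):
    print(f"h={h:2d}: forecast {forecasts[h - 1]:.2f}")
```

Whatever the last observation is, the forecasts decay geometrically toward the mean. If the real data keeps trending instead, those forecasts drift further and further from what actually happens.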
😂😂😂😂😂😂u talk funny
Shameless pug?😢
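Sorry, in my book strong stationarity implies weak stationarity, and weak stationarity doesn't imply strong stationarity.
Slight variations in definitions from book to book. You need the first two moments to exist for weak stationarity. But most definitions of strong stationarity don't delve into that since they demand equivalence at a deeper level (the whole distribution). The end result is that neither implies the other in general: for example, an i.i.d. Cauchy series is strongly stationary but has no finite mean or variance, so it is not weakly stationary.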
Why not just deal with variance using log returns? We use them all the time with random walk models. Also, why not give everyone the applied intuition behind these statistical models? For example, one purpose comes down to isolating the seasonal indices from the overall smoothed trendline so you can extrapolate with a confidence band. I really hope you've made a short video on that, because there's not a single textbook or scholarly article I know of that explains it in a way even kids would understand.
Differencing and transformation are different. A log transformation stabilizes the variance; you can still have trend and seasonality in the transformed data. Differencing eliminates trend and seasonality to make the mean stationary.
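A rough sketch of the two steps doing different jobs, assuming a made-up series with exponential growth and multiplicative noise (growth rate, noise level, and length are arbitrary):

```python
import math
import random
import statistics


def diff(xs):
    """First difference of a sequence."""
    return [b - a for a, b in zip(xs, xs[1:])]


rng = random.Random(4)
# Exponential growth with multiplicative noise: the spread grows with the level.
series = [math.exp(0.01 * t + rng.gauss(0.0, 0.1)) for t in range(1000)]
logged = [math.log(x) for x in series]

half = len(series) // 2
raw_steps, log_steps = diff(series), diff(logged)

# Size of period-to-period movements, first half vs second half.
raw_spread = (statistics.pstdev(raw_steps[:half]), statistics.pstdev(raw_steps[half:]))
log_spread = (statistics.pstdev(log_steps[:half]), statistics.pstdev(log_steps[half:]))
# Mean level of the logged series, and mean of its differences, by half.
log_level = (statistics.fmean(logged[:half]), statistics.fmean(logged[half:]))
log_diff_mean = (statistics.fmean(log_steps[:half]), statistics.fmean(log_steps[half:]))

print(f"raw movement size by half:       {raw_spread[0]:.2f} vs {raw_spread[1]:.2f}")
print(f"log movement size by half:       {log_spread[0]:.3f} vs {log_spread[1]:.3f}")
print(f"log level mean by half:          {log_level[0]:.2f} vs {log_level[1]:.2f}")
print(f"differenced-log mean by half:    {log_diff_mean[0]:.4f} vs {log_diff_mean[1]:.4f}")
```

Taking logs evens out the size of the movements but the logged series still trends upward; only after differencing the logs does the mean settle at a constant value.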