i wish my professor had explained it exactly like u just did
Thank you very much for making a vague concept so clear.
This was extremely helpful!! Between my 3 econometrics textbooks (Griffiths, Greene, and Wooldridge), the information on MA models was sparse. This really cleared up the mindset behind this model!
Never seen a better explanation of MA models. Immediate subscription!
Same here! I knew I would subscribe after one minute into the video. Very clear and very useful video. Thank you very much.
Oh damn!! This is wonderful, simplified and explained pretty nicely. Keep spreading your knowledge!!
Thank you! Will do!
Thank you so much for explaining this so well! My professor and textbook explain this concept very mathematically which is hard to understand for beginners, they should really give a simple example and then dive into the details as you did.
Glad it helped!
I was stuck on where the "error" term comes from. Now I know... it is the error from the past. You explained it! I wish you were my professor.
This man's explanation is way better than those of the profs at university.
God Bless You! I needed a fast way to get some concepts on time series forecasting and you saved me.
Easy, Fast, Complete.
I really don't know how to thank you for that great demonstration! I've been trying to understand MA process for years!
This was the best video on MA. The crazy prof made our life easier 😂😂😂
Wow! Great explanation. The professor´s example was very intuitive. Thanks for the content!
Thank you Sir. You have a great way of explaining things, something I sadly rarely find from my coding/statistics teachers.
You are spectacularly GOOD in the explanation of the ARIMA! Cheers
I appreciate that!
Thank you so much for making this fun video! Makes so much more sense now (after struggling through my not-so-crazy professor's stats class)
Couldn't be expressed so handsomely! Thanks!
So simple yet easy to understand. Thank you!
Thank you so much for your very intelligent explanation of this model!!! I felt so confused about this model before.
This explanation gives a better understanding of why we need to avoid a unit root in time series predictions.
Thank you so much, I have been reading about this concept in an econometrics book... but this is easy to comprehend.
Glad it was helpful!
Great explanation! I've learned everything that I looked for. Thank you.
a year trying to understand this, and I've just needed 15 minutes thx!!
I was terrified of the mathematical symbols, but you made it so easy to understand! thank you!
ALWAYS GRATEFUL, THANK YOU FOR THE WONDERFUL CONTENT
Fantastic, got too caught up in the math in my macroeconometrics course and had no idea what these things actually were. Super helpful conceptually
Finally ❤️ a video with an applicable and relevant example ❤️🙏
Simple Explanation is a Talent - Thanks for this
Simple and clear explanation, thank you !
Thanks for existing in this world bro.
So nice of you
Finally understood this, thank you so much. Highly recommend!
Gemini 1.5 Pro: This video is about the moving average model in time series analysis. The speaker uses a cupcake example to explain the concept.
The moving average model is a statistical method used to forecast future values of a series. It is commonly used in time series analysis.
The basic idea of the moving average model is to start from a constant mean and adjust it using the forecast errors from previous periods. The adjusted value is then used as the forecast for the next period. There are different variations of moving average models, and the speaker introduces the concept with the moving average one (MA1) model.
In the video, a grad student is used as an example. The grad student needs to bring cupcakes to a professor's dinner party every month. The number of cupcakes the grad student should bring is the forecast. The professor is known to be crazy and will tell the grad student each month by how many cupcakes the count was off. This is the error term.
The moving average model is used to adjust the number of cupcakes the grad student brings based on the error term from the previous month. The coefficient is a weight given to the error term. In the example, the coefficient is 0.5, meaning the grad student will adjust the number of cupcakes he brings by half of the error term from the previous month.
For example, if the grad student brings 10 cupcakes in the first month, and the professor says the grad student brought 2 too many, then the grad student will bring 9 cupcakes in the second month (10 cupcakes - 0.5*2 error term).
The video shows how the moving average model works through a table and graph. The speaker also mentions that there are other variations of moving average models, such as moving average two (MA2) model, which would take into account the error terms from two previous months.
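The update rule in the summary above can be sketched in a few lines of Python. This is only an illustration: the function name `ma1_forecasts` is made up, the first two errors (-2 and +1) follow the numbers quoted in this thread, and the third error of 0 is an assumption added to complete the example.

```python
# Minimal sketch of the MA(1) cupcake rule: forecast = mean + 0.5 * last error.
# The helper name and the third error value are illustrative assumptions.
MU = 10.0     # constant mean number of cupcakes
THETA = 0.5   # weight given to last month's error


def ma1_forecasts(errors, mu=MU, theta=THETA):
    """Return the one-step forecasts for each month.

    errors[i] is the error reported by the professor in month i + 1
    (negative means too many cupcakes were brought).
    """
    forecasts = []
    prev_error = 0.0  # no error is known before the first month
    for err in errors:
        forecasts.append(mu + theta * prev_error)
        prev_error = err
    return forecasts


errors = [-2.0, 1.0, 0.0]
forecasts = ma1_forecasts(errors)
# What the professor actually wanted each month: forecast plus error.
actuals = [f + e for f, e in zip(forecasts, errors)]
print(forecasts)  # [10.0, 9.0, 10.5]
print(actuals)    # [8.0, 10.0, 10.5]
```

Note that the forecast always restarts from the constant 10 and never from the previous forecast, which is why the third value is 10.5 and not 9.5.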
OMG, this is brilliant, amazing, wonderful, thank you
Many thanks for your clear explanation of the mathematical moving average formula
of course!
Explained with the Cup Cakes it makes perfect sense, thumbs up!
Exceptionally useful videos for actuarial exams. Thanks for helping me pass🙂(hopefully)
Nice example super easy to understand the concept!
perfect explanation. Thank you!
I love this video, so simple but effective
I still don't think this makes sense to me: why does incorporating past error somehow give us a better prediction of the future in this case? Since this crazy professor will randomly choose an acceptable # of cupcakes, your past error shouldn't help in predicting the future any better.
I think the student naively believes the crazy professor will stick to his prior t-1 position (the student is unaware of the professor's craziness)
Everything in time series assumes that you can use past info to predict future info
Even though the professor selects a different number every time, in the end the average is stable. Assume you have a time series of images. Because of the unstable environment they're taken in, or any other factors that alter an image, the images are not always the same, although they are taken from the same scene. So, what is the goal here? To find the mutual information in the images and ignore the noise. This noise is how crazy the professor is, and we can handle the importance of the error through its coefficient. By handling these factors, we can get close to recognising the mutual information. Remember, these are unsupervised models. There are no labels to rely on.
Had I watched your series earlier would have saved me $3000 :(
Great explanation. Keep up the good work!
Let's use an example that is slightly more natural to us -- so here's this crazy professor. :D
Great video! Thanks for sharing!
Thank you very much! Such a clear explanation!
Great video. I think the calculation of the 3rd row is wrong. It should've been 9+0.5 = 9.5
No.. Constant term is 10 not 9
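The reply above can be written out as a short worked equation. This is a sketch that assumes the notation \hat{f}_t for the forecast, \mu = 10 for the constant, \theta = 0.5 for the coefficient, and a month-2 error of +1 (inferred from the 10.5 result rather than stated in this thread).

```latex
\begin{align*}
\hat{f}_t &= \mu + \theta\,\varepsilon_{t-1}, \qquad \mu = 10,\ \theta = 0.5 \\
\hat{f}_2 &= 10 + 0.5\,(-2) = 9 \\
\hat{f}_3 &= 10 + 0.5\,(1) = 10.5 \quad \text{(not } \hat{f}_2 + 0.5 = 9.5\text{)}
\end{align*}
```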
Thanks man. You're doing a superb job.
Does the MA model assume e_t (lagged residuals) are pure white noise? Mean = 0, constant variance, and no autocorrelation of residuals?
How do we know what the "error" is if there is no "true value" given a random realization of data?
the idea is that you're trying to predict the next value. you get told what the next value is by the professor. if it's random then there is no signal in there & the results are still meaningless
How is the average moving though? It was fixed for each prediction! Wouldn't it have to be recalculated each time for it to be moving?
Also we didn't seem to use anything related to the error being normally distributed... is there a reason for that? why was it mentioned in the first place?
Exactly right, I am also having same query, Average not moving
Did you get any other source where this explained clearly
LOVE IT. Thank you.
Of course!
Brilliant explanation, thank you!
this is really helpful and so easy to understand!!!
Thanks!!! Perfect explanation :)
Extremely well explained
Thank you so much.
Greatly explained!!! Thanks
So not natural.. that is why you are so good at teaching
Great Presentation...
Glad you liked it!
Amazing explanation man
Excellent explanation
God Bless you.
how do we find the coefficient for the moving average model?
Algorithms use the entire time series to get as close as possible to the true value of the coefficient (often with a maximum likelihood estimator).
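To make the reply above concrete, here is a small self-contained sketch of how a coefficient can be recovered from a whole series. It deliberately swaps in a simpler technique than full maximum likelihood: a grid search that minimises the conditional sum of squared errors. The function names, the simulated data, and the grid are all assumptions for illustration, and real statistical packages use more refined optimisers.

```python
import random


def simulate_ma1(n, mu, theta, seed=0):
    """Simulate y_t = mu + e_t + theta * e_{t-1} with standard normal shocks."""
    rng = random.Random(seed)
    prev_shock = 0.0
    series = []
    for _ in range(n):
        shock = rng.gauss(0.0, 1.0)
        series.append(mu + shock + theta * prev_shock)
        prev_shock = shock
    return series


def conditional_sse(series, mu, theta):
    """Rebuild the errors recursively (starting from e_0 = 0) and sum their squares."""
    prev_err = 0.0
    total = 0.0
    for y in series:
        err = y - mu - theta * prev_err
        total += err * err
        prev_err = err
    return total


def fit_ma1(series):
    """Estimate mu by the sample mean and theta by a grid search over (-1, 1)."""
    mu_hat = sum(series) / len(series)
    grid = [k / 100 for k in range(-99, 100)]
    theta_hat = min(grid, key=lambda th: conditional_sse(series, mu_hat, th))
    return mu_hat, theta_hat


series = simulate_ma1(3000, mu=10.0, theta=0.5)
mu_hat, theta_hat = fit_ma1(series)
print(round(mu_hat, 2), theta_hat)
```

With a few thousand simulated points the estimates should land close to the true values of 10 and 0.5, though the exact numbers depend on the random seed.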
Amazing explanation
Thank you for the video, how should we choose the 0.5 coefficient in front of the error term from last period in the regression model?
Fantastic!
you are just amazing
thanks! Really helpful
Great explanation! Third row shouldn't it be 9.5 rather than 10.5?
No, 10+1/2=10.5
@wenzhang5879 Yeah, got it. Thanks
Hey amazing Content Bravo !
Can you add to that a video talking about random walk ?
That would be great .
How come some MA(1) formulas have x_t = mu + (phi1) error_t + (phi2) error_t-1..... If you are predicting at time t then how would you know the error at time t (error_t)? Why are some formulas like this?
Hi, great explanation! One question, how do you guess the mu value (the average cupcake you bring) for the first time?
how is it possible you can explain this stuff so easily!
Wonderful example.
thanks!
THANK YOU SO MUCH
Perfect!
Great videos, thank you! I have a question. Period 1 value is our mean value but we don't know what is mean since we just started from point 0. How to calculate residual then? We know the true observation and we don't know the mean. Is it just a guess? But when we use any statistical package it does not ask us to input guess mean value.
Thank you so much for making this video. I was so frustrated trying to understand this concept :(
sameee
Thank you❤❤❤
Hi. The mean of et is not 0. For time interval 5, you need to write -1.
You can see how the crazy professor gets hungrier month by month
damn u a real one for this
Why in some models the prediction (f hat) is the average of the previous f values. But in some models, it is the error of the previous models that predict f hat.
I have the same doubt, sometimes he added half of the error to f, and sometimes to f-hat
you are too good
Really good explanation!
Maybe I'm stupid for asking this...
If one was to write an MA filter, how do you determine M?
If a physics student is reading this, just wanna share my intuition that this is exactly like a control system. Whatever error our model is getting, it is moving to cover it, a little bit like a PI controller in electrical engineering :) not sure if it clicks for anyone
Or a thermostat.
Where does the noise in the equation come from? In our data we only have time on the x axis and Y as the target variable. There is no error term. What I mean to ask is does the MA model first regress y on y lag terms like the AR model and then calculate error between the actual and predicted y terms? Then regress y against the calculated error terms(residuals)?
The error is white noise coming from random shocks that are iid with mean 0 and constant variance. Fitting the MA estimates is more complicated than it is in autoregressive (AR) models, because the lagged error terms are not observable. This means that iterative non-linear fitting procedures need to be used in place of linear least squares. Hope this helps :).
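A short derivation may help show why the reply above rules out linear least squares. It is a sketch under assumed notation: an MA(1) model with mean \mu and coefficient \theta, and the common convention of starting the recursion at \varepsilon_0 = 0.

```latex
\begin{align*}
y_t &= \mu + \varepsilon_t + \theta\,\varepsilon_{t-1} \\
\varepsilon_t &= y_t - \mu - \theta\,\varepsilon_{t-1}, \qquad \varepsilon_0 = 0 \\
\varepsilon_t &= \sum_{j=0}^{t-1} (-\theta)^{j}\,\bigl(y_{t-j} - \mu\bigr)
\end{align*}
```

The errors are never observed directly. They have to be rebuilt from the data for a given \theta, and the powers of \theta in the last line make the sum of squared errors a non-linear function of \theta, so the fit is done iteratively.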
Well explained ❤
Thank you 🙂
what is the difference between taking the average of the first 3 values and calculating the centered average at time period 2, and this method (average + error at t + error at previous time period)?
What you are describing is MA smoothing, which is used to describe the trend-cycle of past data
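For contrast with the MA model, the smoothing described in the reply above can be sketched as a centered moving average. The function name and the sample numbers are hypothetical, and the window of 3 simply matches the question.

```python
def centered_moving_average(values, window=3):
    """Average each point with its neighbours; edges without a full window stay None."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = [None] * len(values)
    for i in range(half, len(values) - half):
        chunk = values[i - half : i + half + 1]
        smoothed[i] = sum(chunk) / window
    return smoothed


# Hypothetical monthly cupcake counts, used only to show the mechanics.
cupcakes = [8.0, 10.0, 12.0, 10.0, 8.0]
smoothed = centered_moving_average(cupcakes)
print(smoothed)
```

Here the average really is recomputed at every time step from observed values, whereas the MA model keeps a fixed mean and adds weighted past errors.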
Hello, thanks for this video, but I wonder about \theta_0. Could it be something different than 1?
THANK YOU!!
You're welcome!
This is a great explanation, but in many equations they also add the current error (epsilon_t). I just don't get how we are supposed to know our current error if we are trying to forecast a value. Do we simply neglect that current term when forecasting?
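One common way to read those formulas, sketched here under the standard assumption that \varepsilon_t has mean zero and is independent of the past: the equation with \varepsilon_t describes how the data are generated, while the forecast replaces the unknown current error with its expected value of zero. The symbols x_t, \mu, and \theta are assumed notation.

```latex
\begin{align*}
x_t &= \mu + \varepsilon_t + \theta\,\varepsilon_{t-1} \\
\hat{x}_t &= E[x_t \mid \text{past}] = \mu + E[\varepsilon_t] + \theta\,\varepsilon_{t-1} = \mu + \theta\,\varepsilon_{t-1} \\
x_t - \hat{x}_t &= \varepsilon_t
\end{align*}
```

So the current error is not needed to make the forecast. It is only revealed afterwards, as the gap between the observed value and the forecast.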
thank you so much
does mu have to be a constant? can we use a rolling window to calculate the average? will this yield better predictions?
THANK you
You're welcome!
Sir please make videos on restricted Boltzmann machine
Perfect.
Great video. Do you always start with the mean as your first guess for f hat? Also, how do you fit an MA(q) model?
God-like!
Thanks 🙏