Simple and elegantly presented. Was working on a forecast and other descriptions online were abhorrent to say least. The error range was HUGE, but thanks to you my standard error reduced to 2%. Asante Sana!!!🤗
Hi David, thanks for the appreciation! We do not use December because the 12-th category is captured by the intercept, and is specified when all the 11 dummy variables are all set to zero. This is the general rule: one variable less then the number of categories. If you have just 2 categories (yes and no) you put just one dummy (yes).
Hello, we lock the Intercept because in the equation the intercept is only one, while the coefficients are multiple. Please read here about the p value: www.investopedia.com/terms/p/p-value.asp
Thank you for your great video! Do you think this work correctly with 12 months data historical and some month value = 0 as well ? I look forward to hearing your advices. Then how can we apply for multi sku products.
Hi Nab, thanks for the question. That is because the 12-th category is captured by the intercept, and is specified when all the 11 dummy variables are all set to zero. This is the general rule: one variable less then the number of categories. If you have just 2 categories (yes and no) you put just one dummy (yes).
Thank you for great content! What if we would like to add another variable - i.e., a change in product prices (let's assume cyclical price increases, as well as occasional promotions, for example, for a month of time)? How would your model then need to be modified?
Hi Michal, thank you for the question! If you have a one-off event, such as a price promotion, that you want to include in your Holt-Winters exponential smoothing model, there are a few different approaches you can take: 1. Include the event as a predictor in the model: If the event has a clear effect on the time series being forecasted, you can add it as an additional predictor in the model. For example, if you are forecasting sales and you have data on the price of the product, you can add a binary predictor to the model to indicate whether the promotion is occurring in a given month. 2. Adjust the trend component to account for the event: If the event has a more complex effect on the time series, you may need to adjust the trend component of the model to take it into account. For example, if the promotion is expected to have a significant impact on sales, you could adjust the trend component to reflect this. 3. Incorporate the event into the seasonality component: If the event has a seasonal effect on the time series, you can incorporate it into the seasonality component of the model. For example, if the promotion is expected to have a particularly strong impact on sales in a particular month, you could adjust the seasonality component for that month to reflect this. I hope this helps!
Hello! Don't know what I did wrong but when I tried to use the Regression tool in Excel's pack, it said "The number of rows and columns in X range cannot be the same."
Hello, these can be the causes: Case 2: The number of rows is less than the number of x-columns It is not statistically valid for the number of rows to be less than the number of x (variable) columns. The number of rows of data must be larger than the number of columns of data (x-columns plus y-columns). Case 3: You specify a zero constant Do not specify a zero constant (b=0) in the function.
Hello, please check these: support.microsoft.com/en-us/office/how-to-correct-a-num-error-f5193bfc-4400-43f4-88c4-8e1dcca0428b and these: The "NUM!" error in regression analysis usually occurs when there is a problem with the data in the cells that are being used for the analysis. This error message indicates that Excel is unable to perform the calculation because it is encountering a value that is not a number. There are a few reasons why you might be seeing this error message in your regression analysis. One possibility is that there are blank cells or cells with text in the range of data that you are using for the analysis. Another possibility is that there are cells with errors, such as #DIV/0! or #VALUE!, in the range of data. To fix this issue, you can try the following steps: Check the range of data that you are using for the analysis and make sure that there are no blank cells or cells with text in the range. Check for any cells with errors in the range of data and correct them if necessary. If you have deleted some fields of data, make sure that you have not accidentally deleted any cells that are being used in the analysis. Check the formatting of the cells in the range of data and make sure that they are all formatted as numbers. If none of the above steps work, try copying the data to a new worksheet and performing the analysis again.
Hello. It maybe from the relative/fixed cells references in the formula Mmult(). Please check what column/rows must be fixed and what relative. I hope this helps.
Yes, I spent 15 minutes confused about why I got it as well, and it's because you didn't fill out the binary numbers on the forecast sections, but only on the input before forecast that is
Sir what If we have Jan to March instead of Jan to Dev Because of I have value from Jan to March and when I am doing the method the out is not showing right could you please give me a hint
@@Data.Analytics.Central Hello, i want to ask if i'm going to forecast annual prices for 6 years (2023-2029) so the row for the dummy variable should be (2023-2028), right? pls kindly enlighten me, thank you very much!
@@nurul.alifiaa Hi, Nurul. The number of dummy variables are related to the number of seasons that you have in your data. So, if you have 4 seasons, you use 3 dummies. And then you can extrapolate on how many years you want. In my model I have forecasted only one extra year vs. the actual data, but we can drag down the formula for another 6 years to forecast.
@@Data.Analytics.Central aaaa so if i have datas from jan 2013-juli 2023, i can use jan-nov (11) dummies? i mean i can use the same formulas like yours (?) anyway thank you so much for your answer!!
@@nurul.alifiaa First you have to understand how many seasons you have. If it is 12 months, you will have 11 dummies, but but you need histroric data for at least 12 months, so you can train the model (it is not enough just jan23-july23
I have a data which follows the quadratic graph. 2nd order linear equation fit the data. If I apply this trend+seasonality method starting forecast (from period 1 to 6) getting negative.
@@Phenomenal1697 To clarify, the dummy variable method is typically used for linear regression models to represent categorical variables as binary variables (0 or 1). However, it is not directly applicable to quadratic models. In a quadratic regression model, you can still include categorical variables by converting them into numeric form (e.g., using integer encoding) and incorporating them as independent variables with quadratic terms if necessary. For example, if you have a categorical variable with three levels (A, B, C), you can use two numeric variables (e.g., 0 and 1) to represent this variable and then include both linear and quadratic terms for these numeric variables in the model. So, while the dummy variable method is not directly used in quadratic regression models, you can still include categorical variables in a modified numeric format and consider their linear or quadratic effects in the model.
Hi Sushma, You must create separate sets of dummy variables for different seasonality: - To capture day of the week seasonality, create 6 dummy variables. - To capture day of the month seasonality, create 30 dummy variables - To capture month of the year, create 11 dummy variables.
P value error #num and some very low coefficients for significance f 0.8465 f 0.56 p greater than 0.05. please do similar videos on different. Stat methods
Dear Catalin, Thank you for the appreciation! We can arrive at the same results as the ones I did in the video by simply using the Excel function FORECAST.ETS(). For example the forecasted value of 01-Jan-18 (380.432$ in cell D38) can be calculated like this: =FORECAST.ETS(A38,$C$2:$C$37,$A$2:$A$37,1,1,1) and the result would be 378.675$, very close to the manual method. The related function FORECAST.ETS.SEASONALITY() is just telling you what is the seasonality that the previous function FORECAST.ETS() was based on. If we aply it to our case: =FORECAST.ETS.SEASONALITY(C2:C37,A2:A37) we get the result 12, just what we have expected (12 months). I hope it helps!
Hello, thanks for the question. That is because the 12-th category is captured by the intercept, and is specified when all the 11 dummy variables are all set to zero. This is the general rule: one variable less then the number of categories. If you have just 2 categories (yes and no) you put just one dummy (yes).
thanks@@Data.Analytics.Central . All my p values are bigger than 0.05, even my f and r2 are good. Then I cannot use this way, right? finally im using =forecast.ets()
Dummy variables are useful because they allow us to include categorical variables in our analysis, which would otherwise be difficult to include due to their non-numeric nature.
Simple and elegantly presented. Was working on a forecast and other descriptions online were abhorrent to say least. The error range was HUGE, but thanks to you my standard error reduced to 2%. Asante Sana!!!🤗
Dear John, thank you very much for the appreciation!
Thanks a lot . This means a lot . God bless you !
Many thanks for your appreciation! 🙏
The best lecture, Thank you so much!
Thanks a lot the kind feedback!
Extremely helpful video. Thank you very much!
@@nairajbudnarine4728 Thanks a lot for the appreciation!
Great work! Solved a problem I have been working on for days.
Thank you Hussain for the appreciation!
Great content explained in detail! Amazing!
Dear Erick, thank you very much for your feedback!
u are the best bro thank you for the help🙏
Thanks a lot the kind feedback!
Best video ever! Thank you very much!
Thanks a lot for the appreciation! 🤝
I think residual should be zero or close to zero! right? but in your case it is too much; so, can we say your forecast is good?
Great video, is very usefull, thanks. i have a question, why dont use december when you transpose the months?
Hi David, thanks for the appreciation!
We do not use December because the 12-th category is captured by the intercept, and is specified when all the 11 dummy variables are all set to zero.
This is the general rule: one variable less then the number of categories. If you have just 2 categories (yes and no) you put just one dummy (yes).
thank you very much, this really helped me a lot!
Glad it helped!
Sir, could you please explain us why we have to lock 🔒 intercept, please explain and please explain me about p value.
Hello, we lock the Intercept because in the equation the intercept is only one, while the coefficients are multiple. Please read here about the p value:
www.investopedia.com/terms/p/p-value.asp
Thank you for your great video! Do you think this work correctly with 12 months data historical and some month value = 0 as well ? I look forward to hearing your advices.
Then how can we apply for multi sku products.
Hello, I look forward to hearing from you :)
Hello Sir
Where can I get the Excel worksheet to follow your presentation. Thanks
May I get this excel file to practice
Could you share the data file for Practice please ?
Thank you for the video, it is really helpful!
Thank you Merjen for the appreciation. I am glad you have found it useful!
Thanks for the video was very useful.
Thanks a lot for the appreciation! 🤝
would you mind explain, why to exclude the last month on your dummy variable?
Hi Nab, thanks for the question.
That is because the 12-th category is captured by the intercept, and is specified when all the 11 dummy variables are all set to zero.
This is the general rule: one variable less then the number of categories. If you have just 2 categories (yes and no) you put just one dummy (yes).
@@Data.Analytics.Central I see. It makes sense. thanks for the explanation sir. could we use this model for stock price as well?
@@sayednab It is available for any pattern that has seasonality.
what if the x variable is more than 1?
If I have to do the same thing but on a day level
How can I achieved that?
Thank you for great content!
What if we would like to add another variable - i.e., a change in product prices (let's assume cyclical price increases, as well as occasional promotions, for example, for a month of time)? How would your model then need to be modified?
Hi Michal, thank you for the question!
If you have a one-off event, such as a price promotion, that you want to include in your Holt-Winters exponential smoothing model, there are a few different approaches you can take:
1. Include the event as a predictor in the model: If the event has a clear effect on the time series being forecasted, you can add it as an additional predictor in the model. For example, if you are forecasting sales and you have data on the price of the product, you can add a binary predictor to the model to indicate whether the promotion is occurring in a given month.
2. Adjust the trend component to account for the event: If the event has a more complex effect on the time series, you may need to adjust the trend component of the model to take it into account. For example, if the promotion is expected to have a significant impact on sales, you could adjust the trend component to reflect this.
3. Incorporate the event into the seasonality component: If the event has a seasonal effect on the time series, you can incorporate it into the seasonality component of the model. For example, if the promotion is expected to have a particularly strong impact on sales in a particular month, you could adjust the seasonality component for that month to reflect this.
I hope this helps!
This’s really useful thank you! Could you please do a tutorial on that?
But how can we do daily. I tried but it show only 16 variable maximum, but we have 30 days
Excellent sir. Can we use the same process for 5 year forecast. Please do reply sir
Yes, for sure. If you have monthly seasonality
and also, why you used three years data? what will be wrong if I use two- or four-years data?
Hello! Don't know what I did wrong but when I tried to use the Regression tool in Excel's pack, it said "The number of rows and columns in X range cannot be the same."
Hello, these can be the causes:
Case 2: The number of rows is less than the number of x-columns
It is not statistically valid for the number of rows to be less than the number of x (variable) columns. The number of rows of data must be larger than the number of columns of data (x-columns plus y-columns).
Case 3: You specify a zero constant
Do not specify a zero constant (b=0) in the function.
When I used the Regression function, the numbers in the P-Value is #NUM! Any idea why?
Hello, please check these:
support.microsoft.com/en-us/office/how-to-correct-a-num-error-f5193bfc-4400-43f4-88c4-8e1dcca0428b
and these:
The "NUM!" error in regression analysis usually occurs when there is a problem with the data in the cells that are being used for the analysis. This error message indicates that Excel is unable to perform the calculation because it is encountering a value that is not a number.
There are a few reasons why you might be seeing this error message in your regression analysis. One possibility is that there are blank cells or cells with text in the range of data that you are using for the analysis. Another possibility is that there are cells with errors, such as #DIV/0! or #VALUE!, in the range of data.
To fix this issue, you can try the following steps:
Check the range of data that you are using for the analysis and make sure that there are no blank cells or cells with text in the range.
Check for any cells with errors in the range of data and correct them if necessary.
If you have deleted some fields of data, make sure that you have not accidentally deleted any cells that are being used in the analysis.
Check the formatting of the cells in the range of data and make sure that they are all formatted as numbers.
If none of the above steps work, try copying the data to a new worksheet and performing the analysis again.
@@Data.Analytics.Central I still get #NUM! on my P-values.
Is is usable for other figures like ebt, ooe, etc?
It applies to any metric that has seasonality and trend.
Hello. When using this method, Excel is showing the value function but I don't know what I did wrong. Any idea?
Hello. It maybe from the relative/fixed cells references in the formula Mmult(). Please check what column/rows must be fixed and what relative. I hope this helps.
SAME , DID U FIND THE ANSWER?
Yes, I spent 15 minutes confused about why I got it as well, and it's because you didn't fill out the binary numbers on the forecast sections, but only on the input before forecast that is
Sir what If we have Jan to March instead of Jan to Dev
Because of I have value from Jan to March and when I am doing the method the out is not showing right could you please give me a hint
Hello, I hope you follow the rule: one variable less then the number of categories, so in your case of 3 months you use 2 variables
Hi, Kindly share the csv file. Thank you
Hey, what if we have to forecast yearly sales. How many years would we need to enter as the dummy variable ?
Hi Rahul, the number of dummy variables is equal with the number of seasons minus 1. So if your seasonality is 5 years, you use 4 dummy variables.
@@Data.Analytics.Central Hello, i want to ask if i'm going to forecast annual prices for 6 years (2023-2029) so the row for the dummy variable should be (2023-2028), right? pls kindly enlighten me, thank you very much!
@@nurul.alifiaa Hi, Nurul. The number of dummy variables are related to the number of seasons that you have in your data. So, if you have 4 seasons, you use 3 dummies. And then you can extrapolate on how many years you want. In my model I have forecasted only one extra year vs. the actual data, but we can drag down the formula for another 6 years to forecast.
@@Data.Analytics.Central aaaa so if i have datas from jan 2013-juli 2023, i can use jan-nov (11) dummies? i mean i can use the same formulas like yours (?) anyway thank you so much for your answer!!
@@nurul.alifiaa First you have to understand how many seasons you have. If it is 12 months, you will have 11 dummies, but but you need histroric data for at least 12 months, so you can train the model (it is not enough just jan23-july23
so "t" is for trend and "jan" thru "nov" is for seasonality. am i correct
Exactly!
How we can analyse the forecast with second order linear function with seasonality.
Possible?
Hello, can you give more context to your question?
I have a data which follows the quadratic graph. 2nd order linear equation fit the data. If I apply this trend+seasonality method starting forecast (from period 1 to 6) getting negative.
Above method follow single order trend. is there same method which consider the period square term also.(2nd order)
@@Phenomenal1697 To clarify, the dummy variable method is typically used for linear regression models to represent categorical variables as binary variables (0 or 1). However, it is not directly applicable to quadratic models.
In a quadratic regression model, you can still include categorical variables by converting them into numeric form (e.g., using integer encoding) and incorporating them as independent variables with quadratic terms if necessary. For example, if you have a categorical variable with three levels (A, B, C), you can use two numeric variables (e.g., 0 and 1) to represent this variable and then include both linear and quadratic terms for these numeric variables in the model.
So, while the dummy variable method is not directly used in quadratic regression models, you can still include categorical variables in a modified numeric format and consider their linear or quadratic effects in the model.
What about daily forecast, how do we create matrix?
Hi Sushma,
You must create separate sets of dummy variables for different seasonality:
- To capture day of the week seasonality, create 6 dummy variables.
- To capture day of the month seasonality, create 30 dummy variables
- To capture month of the year, create 11 dummy variables.
I would like to thank you so much
Hi, Hafiz. Thanks a lot for the feedback! 🤝
Amazing
Thanks a lot for the appreciation! ❤️
P value error #num and some very low coefficients for significance f 0.8465 f 0.56 p greater than 0.05. please do similar videos on different. Stat methods
Hi Tina, thanks for watching the video!
interesting example. Science based :) What do you think about latest forecast functions included in the "pack" =FORECAST.ETS.SEASONALITY()?
Dear Catalin, Thank you for the appreciation!
We can arrive at the same results as the ones I did in the video by simply using the Excel function FORECAST.ETS().
For example the forecasted value of 01-Jan-18 (380.432$ in cell D38) can be calculated like this: =FORECAST.ETS(A38,$C$2:$C$37,$A$2:$A$37,1,1,1) and the result would be 378.675$, very close to the manual method.
The related function FORECAST.ETS.SEASONALITY() is just telling you what is the seasonality that the previous function FORECAST.ETS() was based on. If we aply it to our case:
=FORECAST.ETS.SEASONALITY(C2:C37,A2:A37) we get the result 12, just what we have expected (12 months).
I hope it helps!
why time period t is used ?.
t is the trend. The rest are the seasons.
why don't you use December?
Hello, thanks for the question.
That is because the 12-th category is captured by the intercept, and is specified when all the 11 dummy variables are all set to zero.
This is the general rule: one variable less then the number of categories. If you have just 2 categories (yes and no) you put just one dummy (yes).
thanks@@Data.Analytics.Central .
All my p values are bigger than 0.05, even my f and r2 are good. Then I cannot use this way, right?
finally im using =forecast.ets()
why dummy variable was used
Dummy variables are useful because they allow us to include categorical variables in our analysis, which would otherwise be difficult to include due to their non-numeric nature.
Disaponted. You didn't tell how to get trend (series) out from the time series.
Intercept + time trend * period number (1 to n)
Its giving a biased forecast line
Hi, I hope your data has a linear relationship between the independent variables and the dependent variable.