Thanks a lot. Please make more videos on panel data analysis covering simultaneous equation or structural equation modelling, dynamic panel data modelling, and more. Your explanation is really understandable.
Hello. To get two-way fixed effects, specify effect="twoways" in the function call; otherwise the default is effect="individual". Hence the different coefficient you get compared with lm(). 'index' specifies the two indices of the panel, not the FE. If you want to see the fixed effects, use fixef(test.fixed).
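A minimal sketch of the point above, assuming made-up data: the column names mirror the ones used elsewhere in this thread (TestGrade, Studytime, Student, Test), but every value is invented. The dummy-variable regression runs in base R; the plm part only runs if the package is installed, and its Studytime slope should match the lm() one.

```r
# Invented data: 4 students x 3 tests. Values are NOT from the video.
set.seed(1)
testdat <- expand.grid(Student = c("Antonio", "Joe", "Mark", "Sue"),
                       Test = 1:3, stringsAsFactors = FALSE)
student_fx <- c(Antonio = 60, Joe = 66, Mark = 72, Sue = 64)  # assumed student effects
test_fx <- c(0, -5, 3)                                        # assumed test effects
testdat$Studytime <- round(runif(nrow(testdat), 1, 10), 1)
testdat$TestGrade <- unname(student_fx[testdat$Student]) + test_fx[testdat$Test] +
  2 * testdat$Studytime + rnorm(nrow(testdat))

# Dummy-variable version: student AND test dummies, as one would do with lm().
lsdv <- lm(TestGrade ~ Studytime + factor(Student) + factor(Test), data = testdat)
b_lsdv <- unname(coef(lsdv)["Studytime"])

# plm version with both sets of fixed effects (skipped if plm is not installed).
if (requireNamespace("plm", quietly = TRUE)) {
  test.fixed <- plm::plm(TestGrade ~ Studytime, data = testdat,
                         index = c("Student", "Test"),
                         model = "within", effect = "twoways")
  print(c(lm = b_lsdv, plm = unname(coef(test.fixed)["Studytime"])))
  print(plm::fixef(test.fixed, effect = "individual"))  # the student effects
}
```

Dropping effect = "twoways" from the plm call leaves only the student effects in the model, which is why the slope then differs from the lm() version with both sets of dummies.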
Thanks for that reference-- I'll add it to the pile of "to-reads". As you say, if you include Fixed Effects when they aren't needed, you are losing degrees of freedom for no reason. However, in the kind of work I do it seems to me to be a good idea to try them to see what happens- if you are constrained in degrees of freedom, then trying to get rid of them makes sense (if appropriate). But experts certainly know more than I, I am only good at explaining the basics!
Yes, absolutely! A basic fixed effects model is nothing mysterious- use all of your normal OLS techniques! Interactions should be used when appropriate, especially if there might be an interaction with a key explanatory variable. A common use would be to interact education and race when explaining income, since 1 additional year of schooling may (and DOES) have a different impact on the salaries of whites, blacks, Hispanics, Asians, etc. and probably also differs for males and females.
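A small sketch of what such an interaction looks like in R, on invented data (the groups, slopes, and variable names here are assumptions for illustration only). Interacting a group dummy with x gives each group its own slope as well as its own intercept.

```r
# Invented data: two groups with different true slopes on x.
set.seed(5)
group <- rep(c("g1", "g2"), each = 20)
x <- runif(40, 8, 20)                       # e.g. years of schooling (made up)
true_slope <- ifelse(group == "g1", 1.0, 2.0)
y <- 3 + true_slope * x + rnorm(40, sd = 0.5)

# x * factor(group) expands to x + factor(group) + x:factor(group)
fit <- lm(y ~ x * factor(group))
print(coef(fit))

slope_g1 <- coef(fit)[["x"]]                                   # baseline group slope
slope_g2 <- slope_g1 + coef(fit)[["x:factor(group)g2"]]        # baseline + interaction
print(c(g1 = slope_g1, g2 = slope_g2))
```

The interaction coefficient is the difference in slopes between the groups, so testing it against zero is a test of whether the slopes really differ.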
Thanks a lot. If possible, please upload more videos on panel data covering simultaneous equation or structural equation modelling, dynamic panel data modelling, and more. Your explanation is very understandable.
I actually run a "mixed" model here, but it is nothing special. This means that you run fixed effects for one repeated characteristic, but leave the other one uncontrolled. So, if you include dummies for the students (fixed) but don't control for the test (leaving it as a random effect), you have a "Mixed" model. As I said in the video, I would caution against running such a model unless there is a VERY good reason for not controlling for a category.
I was unsure about your question, so I Googled... This might be a case where statisticians and econometricians are talking about two different things. I skimmed through an article in "Research Synthesis Methods" DOI: 10.1002/jrsm.12 discussing this, and they discussed this "common mean" idea, but I'll be honest: I have no idea what they are getting at. I'll look at this more and see if I can translate it into either English or econometrics.
Yes, I don't use Stata, and so can't comment on specifics. However, the Hausman test should never give a negative value because it is a chi-square test. The test statistic is "SORT OF" the sum of squared differences between the two sets of slope coefficients (details in a future video!). Since it is a sum of squares, it must be positive. Email me your output and I can try to make a quick guess about what is going on.
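For reference, the usual textbook form of the Hausman statistic, which is what the "sort of a sum of squares" description above points to. The notation is an assumption: the beta-hats are the FE and RE slope estimates and the Var-hats are their estimated covariance matrices.

```latex
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})^{\top}
    \left[ \widehat{\operatorname{Var}}(\hat{\beta}_{FE})
         - \widehat{\operatorname{Var}}(\hat{\beta}_{RE}) \right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE})
```

So it is a weighted sum of squared differences, and it is nonnegative whenever the matrix in brackets is positive definite. In a finite sample the estimated difference can fail to be positive definite, which is one known way for software to report a negative value.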
Sorry, I figured out a solution and I should have updated my question. I found that by adding effect="twoways" to the plm function, it uses both individual and time fixed effects. Thanks for your videos!
Hi! I'm having trouble understanding the difference (in the plm package) between using the option model="within" and using effect="individual". Maybe you have already been through that.
Thanks for the helpful video. Could you please elaborate on the differences between fixed effects and random effects when it comes to extrapolation outside of the sample? Does FE necessarily imply that no estimates can be made for groups outside the training sample? For some reason Stata does seem to compute predictions for observations that are in groups that have not been trained upon, which puzzles me.
Thank you so much, Sir. The reference is quite helpful. I wanted to estimate a Markov switching model on panel data, but there was no way to account for the panel structure. So it was suggested that I run a simple panel model and test for FE or RE. If the Hausman test suggested FE, then the underlying assumption of a common mean holds and I could proceed with estimating a Markov switching model on pooled data, with a switching intercept.
More explanatory variables (even hundreds) make absolutely no difference. Of course, it would be rare to have data as simple as the made-up data I used. There are lots of papers that do exactly what you are talking about, see for example Patrick Bayer's recent paper (NBER # 18069). What makes it a panel are repeated observations of the same house. RE vs. FE is the question of whether to include dummies to control for the house in question. The answer is almost always YES.
OK, this is a good video. However, I was wondering how you work with a model that has more than one variable, say six variables for panel data? E.g. the variables that affect the price of housing (parks, roads, lot size, bathrooms, size of yard, etc.)?
After reading: I only find people mentioning a "common mean" doing meta-analysis in epidemiology/education, not econometrics. This language seems backwards, but their focus is different. They run an FE model to test if they need to correct the variance for "serial correlation" as I described using the RE correction. If their dummy variables are =0, they call that an "FE model for estimating a mean"- very odd. Using the term "common mean" in a normal econometric setting IS confusing, forget it.
Hi Sir, if the Hausman test indicates that the fixed effects model is more appropriate than the random effects model, and the number of time periods (T) is greater than the number of cross-section units (N), which FEM should be chosen: time (T) FEM or cross-section (N) FEM?
Thank you for the clear instruction on fixed and random effects models. It is really helpful. Could you please explain why, when we set the indexes in the R code for both Test.random and Testfixed, R only worked with the first index? For example, with index =c( 'Student','Test'), it worked with Student only, as in your video. Could we use other code in order to deal with more indexes? Best regards!
Your video was helpful to me and I have a suggestion (I am sorry if suggestion is a strong word, maybe it is just my guess). At 18:00, my guess is that there is a difference in results because the code for fixed effects does not say effect = c("twoways"). My suggested code would be: test.fixed
You mentioned the reason for including fixed effects for students in the model being the fixed effects are likely correlated with the other explanatory variables. But I don't see that as the prime reason. Even if the student fixed effects are completely uncorrelated & independent of the other expl. variables, the FE variables may still be correlated with the response variable. Including the FEs in the regression would then reduce the model error.
It all depends on the point of why you are running a regression. If your objective is only to form a predictive model, then yes, your perspective might be correct. However, from the perspective of a researcher who is trying to test a theory, e.g. how does study time affect scores, the primary reason is to eliminate bias, not improve model accuracy. The "model error" is not relevant for testing a hypothesis, but bias will destroy any possibility of getting a fix on the "true" impact of a variable. On the other hand, I am right now creating a solely predictive model, and so am throwing in every sort of garbage variable I can find (and the squares of them, etc.) since I don't care about testing anything. But, this is a rare thing for an economist like me to do.
Thanks for the helpful response. For predictive models with a lot of "garbage" variables, I've found penalized regression methods that offer feature selection or coefficient shrinkage to be helpful (like ridge regression, lasso, and elastic net). Is there any use for a random effect term in a penalized regression model if serial correlation is present (or say member level correlation between obs)? Ignoring the correlation doesn't bias the variable coefficients. However the coeff. estimates will no longer be efficient & show high variability from sample to sample.
Thank you so much for your video. I am trying to use a panel model, but a pop-up saying "near singular matrix" came up. What should I do to fix this? Thank you so much.
Well, this probably means that you have either 1) Way too many fixed effects compared to observations. For example, if you observed 2 time periods for 100 people that live in 50 different states, and you are trying to control for person effects and state effects. or, 2) Some of your other variables are collinear or almost so. Sometimes people do this by having both a "year" or "age" variable and an "experience" variable-- both of these increase by 1 year per year! So they are collinear (or almost so). Sometimes this kind of thing can also occur when you have lots of other dummy variables for education, race, occupation... with enough of them the combination will become collinear in some cases.
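The second case above is easy to reproduce. A minimal sketch on invented data, assuming everyone started work at 22 so that experience is an exact linear function of age. Base R's lm() does not stop with a "near singular matrix" message; it silently drops the redundant variable and reports NA for its coefficient, which is the same problem showing up in a different form.

```r
# Invented data: experience = age - 22 exactly, so the two move one-for-one.
set.seed(2)
n <- 50
age <- sample(25:60, n, replace = TRUE)
experience <- age - 22                      # assumption made purely for illustration
wage <- 10 + 0.5 * experience + rnorm(n)

fit <- lm(wage ~ age + experience)
print(coef(fit))                            # the aliased coefficient is reported as NA
print(cor(age, experience))                 # a quick check for the offending pair
```

Dropping one of the two variables (or the redundant dummy, in the many-dummies case) removes the problem.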
Very well explained. Thank you for the resource. I have a question: I could not understand how a fixed effects model implies a common mean for the sample, while random effects does not.
Hello, I am using panel data. The Hausman test indicates that the fixed effects model is appropriate for my data. But the prior research I want to replicate applied a random effects model. That research used much older data, while my data is very recent. Moreover, my sample size is much larger than the prior one. Is it possible for the appropriate panel data model to differ for similar data over a different time period and with a greater number of observations? I hope I have made my problem clear.
Did you ever figure out why the plm() function was not including the time fixed effect? I think I am having the same problem, but can't figure out how to fix it. ?plm didn't say much.
I don't know your context. You can make out of sample predictions, but be careful. If I want to predict "Joe's" future test grade it depends on how hard the test is: an unknown. But, we can predict Joe will do about 6 points higher than Antonio, but 5-6 less than Mark, IF they all study the same amount. Trying to predict for "Gabriel" (an unknown), we don't know his fixed effect, REGARDLESS of whether you are running an FE or RE. This doesn't mean an RE is "Better" somehow-wrong is wrong.
Personally, I don't know anyone who uses Eviews, so I don't know how to do it. I used it once 10 years ago, and it didn't impress me very much. Most econometricians either use R, Stata, or Matlab these days, though some still use LIMDEP or Gauss for some things. No package I know of will give you all of the necessary diagnostics automatically, however.
Thank you very much for your very detailed explanation. I must say that I haven't seen a better one on YouTube. I just have one question: the way you explain the reason behind possible autocorrelation (adding 10 to the errors), isn't it captured by the intercept? I read somewhere that the intercept ensures that the error is centered around 0. Isn't that so?
+Pehlwan You are right, the OVERALL error for all observations is centered around zero. However, for each individual, it may not be, unless you include the fixed effect. For example, observations for ME might ALL be above the regression line (so yi-yhat >0). Observations for someone else will ALL be below the regression line. Perhaps the following video will give a better visual interpretation. Leaving out the dummies for females and males and being single creates some groups with all positive residuals, and others with all negative. th-cam.com/video/d87fuWEBbbE/w-d-xo.html
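A quick way to see this for yourself, sketched on invented data (three people with assumed person effects of +10, 0, and -10). The pooled regression has residuals that average zero overall but not within each person; adding the person dummies fixes that.

```r
# Invented data: person "A" sits 10 above the common line, "C" sits 10 below.
set.seed(3)
person <- rep(c("A", "B", "C"), each = 10)
shift <- c(A = 10, B = 0, C = -10)          # assumed person-specific effects
x <- runif(30, 0, 10)
y <- 5 + 2 * x + unname(shift[person]) + rnorm(30)

pooled  <- lm(y ~ x)                        # intercept only, no person dummies
with_fe <- lm(y ~ x + factor(person))       # person fixed effects included

pooled_means <- tapply(resid(pooled), person, mean)
fe_means     <- tapply(resid(with_fe), person, mean)

print(mean(resid(pooled)))                  # overall mean residual: zero
print(round(pooled_means, 2))               # per-person means: far from zero
print(round(fe_means, 2))                   # per-person means with FE: zero
```

The intercept can only center the residuals for the sample as a whole; it takes one dummy per person to center them person by person.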
One question I would like to ask is how to use the within estimator in Eviews. If I generate demeaned series and estimate with OLS, the estimates are more or less right, but the standard errors are wrong because of the wrong degrees of freedom. Meanwhile, manually computing the standard errors is fussy.
Why are you demeaning? Why do you think the df and se are wrong? Because it is still estimating an intercept? I don't use Eviews (and am not sure why anyone would these days-- Eviews even now has an R interface so you can do the many things that Eviews cannot-- and R is FREE. The R hexview package can read in your eviews files.), so I can't help. Just use R-- here is a link to a good summary of R and Panel Data, with good use of plots: www.princeton.edu/~otorres/Panel101R.pdf.
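For anyone following the demeaning question, here is a sketch in base R on invented data (5 units, 6 periods, one regressor; all names are assumptions). It compares the dummy-variable regression with OLS on demeaned data: the slope is identical, but the demeaned regression does not know that the unit means were estimated, so its standard error uses too many degrees of freedom. Rescaling by the ratio of degrees of freedom recovers the dummy-variable standard error.

```r
# Invented balanced panel: n_id units observed n_t times, k regressors.
set.seed(4)
n_id <- 5; n_t <- 6; k <- 1
id <- rep(seq_len(n_id), each = n_t)
x <- rnorm(n_id * n_t) + id
y <- 1 + 0.5 * x + 2 * id + rnorm(n_id * n_t)

# Dummy-variable (LSDV) regression: residual df = n_id*n_t - n_id - k
lsdv <- lm(y ~ x + factor(id))
se_lsdv <- summary(lsdv)$coefficients["x", "Std. Error"]

# Demeaned ("within") regression without intercept: residual df = n_id*n_t - k
x_dm <- x - ave(x, id)                      # subtract each unit's mean
y_dm <- y - ave(y, id)
dm <- lm(y_dm ~ 0 + x_dm)
se_dm <- summary(dm)$coefficients["x_dm", "Std. Error"]

# Degrees-of-freedom correction for the demeaned standard error
se_corrected <- se_dm * sqrt((n_id * n_t - k) / (n_id * n_t - n_id - k))
print(c(lsdv = se_lsdv, demeaned = se_dm, corrected = se_corrected))
```

Packages that implement the within estimator directly apply this kind of adjustment for you, which is one reason to prefer them over demeaning by hand.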
Masud- I think that a random effects model is almost NEVER the right model to use. People used them a lot in the past, but now people are beginning to understand that RE models are improper (most of the time). If your test says that FE is the right model, then it IS the right model. You will need to explain why you are using the FE instead of the RE, but don't worry that earlier studies did it incorrectly. Cheers! -Dr. B
Thank you for the video. I have one question. I have a set of panel data and I ran OLS, fixed effects, and random effects models. Then I ran a test to check OLS vs. fixed effects, and the Hausman test to check fixed effects vs. random effects. In both results, OLS and random effects each came out as more appropriate than the fixed effects model. However, I am still in a dilemma about how to compare OLS vs. random effects. I hope you have advice on what test I have missed in order to check OLS vs. random effects. Thanks again.
1) The Hausman test just says that you CAN leave out your F.E. without biasing your slope coefficients, i.e., the FE are uncorrelated with the other explanatory variables.
2) However, the Hausman test is NOT telling you that you MUST leave out your F.E.-- as I mentioned, I often want to leave in the F.E. because I want to learn from them. However, this comes at a cost (eating up degrees of freedom).
3) If you really don't care about the estimates for the F.E. or they are eating up too many degrees of freedom, then leave them out, but leaving them out causes biased standard errors if you use OLS, so use random effects to correct the s.e.
BurkeyAcademy Thank you, I am analysing 51 companies, could we conclude then that I should always aim at using the Random Effects as superior to OLS unless the dummy variables are proven to be all insignificant ? If you agree what test should we take to compare OLS vs Random effects, is it Wald test ?
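On the OLS vs. random effects question, one commonly used option is the Breusch-Pagan Lagrange multiplier test, which the plm package provides as plmtest(). This is a suggestion added here, not something from the video, and the data below are simulated with invented names (id, time, x, y). The sketch also shows phtest(), the Hausman test mentioned elsewhere in this thread. The plm part only runs if the package is installed.

```r
# Simulated balanced panel: 30 units x 5 periods (all values invented).
set.seed(6)
panel <- data.frame(id = rep(1:30, each = 5), time = rep(1:5, times = 30))
unit_fx <- rnorm(30, sd = 2)                # unit effect, unrelated to x by construction
panel$x <- rnorm(nrow(panel))
panel$y <- 1 + 0.5 * panel$x + unit_fx[panel$id] + rnorm(nrow(panel))

if (requireNamespace("plm", quietly = TRUE)) {
  idx <- c("id", "time")
  pooled <- plm::plm(y ~ x, data = panel, index = idx, model = "pooling")
  fixed  <- plm::plm(y ~ x, data = panel, index = idx, model = "within")
  random <- plm::plm(y ~ x, data = panel, index = idx, model = "random")

  print(plm::plmtest(pooled, type = "bp"))  # pooled OLS vs. random effects
  print(plm::phtest(fixed, random))         # fixed vs. random effects (Hausman)
}
```

A small p-value from plmtest() points to unit-level effects that pooled OLS ignores; a small p-value from phtest() points to keeping the fixed effects.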
1) I never work with unbalanced panels, so you are on your own there. 2) The null hypothesis is "Omitting fixed effects does not significantly bias coefficients". If your p value is small, then removing fixed effects is biasing your coefficients. So, don't remove them. 3) However, in my opinion, you should not remove the fixed effects unless there is a very good reason to; e.g. the fixed effects are removing variability you need to estimate an important relationship (e.g., you are studying how state laws regarding sales taxes relate to State GDP- you might not be able to estimate this if you include both state fixed effects and a state fixed effect).
I personally do not deal with datasets with missing data- I know some work has been done on this question, but am not sure where to point you for more information.
Hi, I am a Stata user, but I found the procedure used for R similar to it, so I hope my question can be answered. When you're performing phtest, what if you swap around fe and re? Does it give you a negative statistic? I encountered a negative value from the Hausman test, but there seems to be no definitive answer about what went wrong and how to correct it. Thanks in advance.
Thank you for your clear video tutorial, but I face difficulty in modeling my data. I have one dependent (time series) and three independent (time series) variables, and I want to know the effect of the independent variables on the dependent variable. What model should I use, sir? Oh, sorry! The data are collected from a single nation.
Time series is a very complicated subject, and one that I do not know much about. There is no way for anyone to tell you the right model without knowing the data and analyzing the data carefully for a LONG time. There are dozens of time series models to choose from, and a lot of tests one needs to do. "Econometric Models and Economic Forecasts" by Pindyck and Rubinfeld is probably the best introductory book that gives a good introduction to Time Series (at the end of the book)- but it will take months of serious study.
Thank you for the elaboration, but I have a question. You say that when using a random effects model we assume it omits dummies. But in my dataset, which is 5 years long, I have categorical variables such as marital status coded with dummies, for example 1 = single, 2 = married, 3 = divorced, and so on. Similarly, job tenure: 1 = 1 month, 2 = 4 months, and so on. So my question is: won't these dummies act as individual-specific effects when I am using a random effects model, because each individual has a different marital status and job tenure?
Umer Saqib No, these won't act as an individual fixed effect, because these variables will measure a general relationship that is-- for everyone -- about how job tenure and marital status affects say, income. An individual effect measures how each person is different, unrelated to anything that is known about that person.
BurkeyAcademy Thanks a lot for your reply. One more question I wanted to ask: if I am using Stata 12 to run the random effects estimator, do I have to adjust for serial correlation, or does the software do that itself? It would be great if you could also make a video on how to interpret random and fixed effects results on panel data.
Great video, as always! One question, can you do a fixed effect model where the dummies interact with X1? So that you get different slopes for each dummy and not only different y intercepts?
At the very beginning of the video there is a green box with a link. Or, Go to my website burkeyacademy com, click on "Statistics/Econometrics", and these two videos are at the bottom of the right hand column.
As you have very few second- and third-level units, and as you don't have any second- and third-level predictors, I see no reason for using a random effects model. Aren't random effects models for the cases when we are interested in different second- and third-level predictors? For example, when we are also modelling individual-level (second-level) characteristics, like gender, race, SES, or school characteristics (third level), like contextual effects, peer effects, etc. Also, according to the latest research I have seen, random effects require > 30 higher-level units to be unbiased.
This is definitely the clearest explanation I have ever come across!!
Glad to be of service ☺
I love the way how you put complex phenomena in easily understandable words. Thanks a lot for this work!
Thanks for giving such a simple and nice explanation about fixed and random effects models. have been waiting for this lecture...
Thank you very much. Your explanation for fixed and random effect models are very clear and easy to understand.
Your two videos on panel data are excellent. Many thanks.
Thanks, this is helping for my Master Thesis!
This is a great tutorial to understand FE vs RE in plain English
Thank you so much for all your videos and especially for these two videos on panel data. Very clear explanation!
+theSculler Thank you so much for the info. I think +BurkeyAcademy should definitely add this to the video.
please upload more videos on Panel data analysis. You are really awesome
wonderful job explaining the difference and the concepts!
This video made my life.
Your vids are purely awesome and very good to understand! Thank you very much!
I never had a chance... but I just added it to my "to do list". I'll try to work on it soon.
surprisingly well done
Thanks for the lecture. But please how may I use the 'lm' package instead of 'plm' to perform the Hausman test?
Great video. Thanks for taking the time.
Thanks for your reply.
Excellent video, thanks a lot
try to use:
plm(TestGrade~Studytime + Student + Test, index = c("Student", "Test"), model = "within", data = testdat)
I like it. Can you please help me with how to run the Bartlett test in Stata using panel data in a multivariate model?
KRGS
Personally, I don't know anyone who uses Eviews, so I don't know how to do it. I used it once 10 years ago, and it didn't impress me very much. Most econometricians either use R, Stata, or Matlab these days, though some still use LIMDEP or Gauss for some things. No package I know of will automatically give you all of the necessary diagnostics automatically, however.
Thank you very much for your very detailed explanation. I must say that I havent seen a better one on the youtube. I just have one question: the way you explain the reason behind possible autocorrelation (adding 10 to the errors) isnt it captured by the intercept? I read somewhere that intercept ensures that error is centered around 0. Isnt it?
+Pehlwan You are right, the OVERALL error for all observations is centered around zero. However, for each individual, it may not be, unless you include the fixed effect. For example, observations for ME might ALL be above the regression line (so yi-yhat >0). Observations for someone else will ALL be below the regression line. Perhaps the following video will give a better visual interpretation. Leaving out the dummies for females and males and being single create some groups with all positive residuals, and others with all negative. th-cam.com/video/d87fuWEBbbE/w-d-xo.html
One question I would like to inquire you is that how to use within estimator in Eviews? If I generate demeaned series and estimate with OLS, even the estimation is somewhat right but the standard error becomes wrong for the wrong degree of freedom. In the meanwhile, manually computing standard error is fussy.
Why are you demeaning? Why do you think the df and se are wrong? Because it is still estimating an intercept? I don't use Eviews (and am not sure why anyone would these days-- Eviews even now has an R interface so you can do the many things that Eviews cannot-- and R is FREE. The R hexview package can read in your eviews files.), so I can't help. Just use R-- here is a link to a good summary of R and Panel Data, with good use of plots: www.princeton.edu/~otorres/Panel101R.pdf.
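On the degrees-of-freedom question, here is a rough sketch of the usual adjustment, assuming a single regressor and an invented three-person panel (all names and numbers are hypothetical). OLS on demeaned data divides the residual sum of squares by NT - K, while the within estimator should divide by NT - N - K because N individual means were also estimated:

```python
from math import sqrt
from statistics import mean

# Hypothetical toy panel: person -> list of (x, y); numbers invented for illustration.
panel = {
    "A": [(1, 62), (2, 65), (3, 66)],
    "B": [(1, 52), (2, 53), (3, 57)],
    "C": [(2, 70), (3, 71), (4, 75)],
}

# Subtract each person's own means from x and y (the "within" transformation).
xd, yd = [], []
for obs in panel.values():
    mx = mean(x for x, _ in obs)
    my = mean(y for _, y in obs)
    xd += [x - mx for x, _ in obs]
    yd += [y - my for _, y in obs]

# OLS without an intercept on the demeaned data gives the within (FE) slope.
sxx = sum(v * v for v in xd)
beta = sum(a * b for a, b in zip(xd, yd)) / sxx
ssr = sum((b - beta * a) ** 2 for a, b in zip(xd, yd))

n_obs, n_groups, k = len(xd), len(panel), 1
se_naive = sqrt(ssr / (n_obs - k) / sxx)              # what plain OLS on demeaned data reports
se_within = sqrt(ssr / (n_obs - n_groups - k) / sxx)  # also counts the N estimated means

print(beta, se_naive, se_within)
```

The two standard errors differ only by the factor sqrt((NT - K) / (NT - N - K)), so a naive standard error can be rescaled by hand rather than recomputed from scratch.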
Adam L Thank you.
Thank you for these useful videos,
I guess the plm package didn't take into account the test fixed effect since the test variable is quantitative!
Masud-
I think that a random effects model is almost NEVER the right model to use. People used them a lot in the past, but now people are beginning to understand that RE models are improper (most of the time). If your test says that FE is the right model, then it IS the right model. You will need to explain why you are using the FE instead of the RE, but don't worry that earlier studies did it incorrectly. Cheers! -Dr. B
Thank you. Would you talk about mixed effects too?
Thank you for the video. I have one question. I have a panel data set and ran OLS, fixed effects, and random effects models. I then ran a test of OLS vs. fixed effects and the Hausman test of fixed effects vs. random effects. Both results favored the alternative to fixed effects (OLS in the first test, random effects in the second), but I am still unsure how to decide between OLS and random effects. Do you have advice on which test I have missed for comparing OLS vs. random effects? Thanks again.
1) The Hausman test just says that you CAN leave out your F.E. without biasing your slope coefficients. i.e., FE uncorrelated with other explanatory variables.
2) However, the Hausman test is NOT telling you that you MUST leave out your F.E.-- as I mentioned, I often want to leave in the F.E. because I want to learn from them. However, this comes at a cost (eating up degrees of freedom)
3) If you really don't care about the estimates for the F.E. or they are eating up too many degrees of freedom, then leave them out, but leaving them out causes biased standard errors if you use OLS, so use random effects to correct the s.e.
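For a single slope coefficient, the Hausman comparison described in points 1) to 3) can be sketched as below. The function name and the input estimates are hypothetical; with one coefficient the statistic is the squared difference between the FE and RE estimates divided by the difference in their variances, compared against a chi-square distribution with 1 degree of freedom:

```python
from math import erfc, sqrt

def hausman_one_coef(b_fe, var_fe, b_re, var_re):
    """Hausman statistic and p-value for a single slope coefficient (chi-square, 1 df)."""
    var_diff = var_fe - var_re
    if var_diff <= 0:
        # Can happen in finite samples; the statistic is then not usable as-is.
        raise ValueError("Var(FE) - Var(RE) is not positive")
    h = (b_fe - b_re) ** 2 / var_diff
    p_value = erfc(sqrt(h / 2))  # upper tail of a chi-square with 1 df
    return h, p_value

# Hypothetical estimates, invented for illustration only.
h, p = hausman_one_coef(b_fe=2.0, var_fe=0.09, b_re=1.4, var_re=0.05)
print(h, p)
```

A small p-value says the FE and RE slopes differ by more than sampling noise would explain, so dropping the fixed effects would bias the coefficient. The guard on the variance difference also shows how a negative statistic can arise: in a finite sample the estimated FE variance can come out smaller than the RE variance.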
BurkeyAcademy Thank you. I am analysing 51 companies. Could we conclude, then, that I should always aim to use random effects as superior to OLS unless the dummy variables are proven to be all insignificant? If you agree, what test should we use to compare OLS vs. random effects? Is it the Wald test?
If the data is an unbalanced panel and the p-value for the Hausman test is less than 5%, what is the outcome? Does it indicate fixed effects or random effects?
1) I never work with unbalanced panels, so you are on your own there. 2) The null hypothesis is "Omitting fixed effects does not significantly bias coefficients". If your p value is small, then removing fixed effects is biasing your coefficients. So, don't remove them. 3) However, in my opinion, you should not remove the fixed effects unless there is a very good reason to; e.g. the fixed effects are removing variability you need to estimate an important relationship (e.g., you are studying how state laws regarding sales taxes relate to State GDP- you might not be able to estimate this if you include state fixed effects alongside the state-level tax variable).
@BurkeyAcademy Thank you sir for explaining this concept.
This is very well done but how do you handle missing data?
I personally do not deal with datasets with missing data- I know some work has been done on this question, but am not sure where to point you for more information.
Hi, I am a Stata user, but I found the procedure used for R similar, so I hope my question can be answered. When you're performing phtest, what if you swap fe and re? Does it give you a negative statistic? I encountered a negative value from the Hausman test, but there seems to be no definitive answer about what went wrong and how to correct it. Thanks in advance.
Thank you for your clear video tutorial, but I am having difficulty modeling my data. I have one dependent variable (time series) and three independent variables (time series), and I want to know the effect of the independent variables on the dependent variable. What model should I use, sir? Oh, sorry! The data are collected from a single nation.
Time series is a very complicated subject, and one that I do not know much about. There is no way for anyone to tell you the right model without knowing the data and analyzing the data carefully for a LONG time. There are dozens of time series models to choose from, and a lot of tests one needs to do. "Econometric Models and Economic Forecasts" by Pindyck and Rubinfeld is probably the best introductory book, with a good introduction to Time Series at the end of the book, but it will take months of serious study.
Thank you very much!
Thank you for the video. By the way, the URL of your Web site (printed on each slide) seems to have a typo.
Yes, thank you for the note... I am horrible with my typos! I added a note mentioning the correct web address now.
Thank you for the elaboration, but I have a question. You say that when using a random effects model we assume it omits dummies. But in my dataset, which is 5 years long, I have categorical variables such as marital status with coded categories, for example 1 = single, 2 = married, 3 = divorced, and so on. Similarly job tenure: 1 = 1 month, 2 = 4 months, and so on. So my question is: won't these dummies act as individual-specific effects when I am using a random effects model, because each individual has a different marital status and job tenure?
Umer Saqib No, these won't act as an individual fixed effect, because these variables will measure a general relationship that is-- for everyone -- about how job tenure and marital status affect, say, income. An individual effect measures how each person is different, unrelated to anything that is known about that person.
BurkeyAcademy Thanks a lot for your reply. One more question I wanted to ask: if I am using Stata 12 to run the random effects estimator, do I have to adjust for serial correlation, or does the software do that itself? It would be great if you could also make a video on how to interpret random and fixed effects results on panel data.
Great video, as always!
One question, can you do a fixed effect model where the dummies interact with X1? So that you get different slopes for each dummy and not only different y intercepts?
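One way to picture the question above, as a sketch on invented data (the helper `ols_line` and all numbers are hypothetical): including a dummy for every individual plus a dummy-times-X1 interaction for every individual yields the same intercepts and slopes as fitting a separate line to each individual's observations.

```python
from statistics import mean

# Hypothetical toy panel: person -> list of (x1, y); numbers invented for illustration.
panel = {
    "A": [(1, 10), (2, 12), (3, 14)],
    "B": [(1, 20), (2, 25), (3, 30)],
}

def ols_line(obs):
    """Return (intercept, slope) of a simple regression of y on x1."""
    xs = [x for x, _ in obs]
    ys = [y for _, y in obs]
    xbar, ybar = mean(xs), mean(ys)
    slope = sum((x - xbar) * (y - ybar) for x, y in obs) / sum((x - xbar) ** 2 for x in xs)
    return ybar - slope * xbar, slope

# A dummy for every person plus a dummy*x1 interaction for every person gives the
# same coefficients as fitting one line per person, which is what this loop does.
lines = {person: ols_line(obs) for person, obs in panel.items()}
print(lines)
```

The cost is one extra slope parameter per individual, so this uses up degrees of freedom quickly when each individual has only a few observations.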
What is the video that preceded this?
1) I don't use STATA.
2) I will never do your homework for you. First, I don't have time, and second, that is cheating.
Thanks a lot Sir...
At the very beginning of the video there is a green box with a link. Or go to my website burkeyacademy com, click on "Statistics/Econometrics", and these two videos are at the bottom of the right hand column.
As you have very few second- and third-level units, and as you don't have any second- and third-level predictors, I see no reason for using a random effects model. Aren't random effects models for cases where we are interested in different second- and third-level predictors? For example, when we are also modelling individual-level (second-level) characteristics, like gender, race, SES, or school characteristics (third level), like contextual effects, peer effects, etc. Also, according to the latest research I have seen, random effects require > 30 higher-level units to be unbiased.
I mean, if the only predictor that we are interested in is a first level predictor (time-individual), we should go for fixed effects, no?
Thanks! Tom Hanks, eh? I guess I can see that!
True, though if there's no unobserved heterogeneity
moral= model
I don't use STATA, sorry!
799 like masha Allah