Statistics with R (3) - Generalized, linear, and generalized least squares models (LM, GLM, GLS)

Christoph Scherber

มุมมอง 128 802

เพิ่มลงใน
- เพลย์ลิสต์ของฉัน
- ดูภายหลัง
แชร์

แชร์

ฝัง

ขนาดวิดีโอ:

แสดงแผงควบคุมโปรแกรมเล่น

เล่นอัตโนมัติ

เล่นใหม่

เผยแพร่เมื่อ 7 ธ.ค. 2024

ความคิดเห็น • 95

@farhadwaseel9981 5 ปีที่แล้ว ⁺²
Thank You Dr, You have explained all fluently and smoothly...
@latinhero1818 5 ปีที่แล้ว ⁺¹
I wish I'd seen this video last month. Better late than never. Subscribed!
@LuisGeorge 10 ปีที่แล้ว ⁺²
I think that this one of the best tutorial ever. Thanks a lot Herr Professor
@getinettafesetucho7999 6 ปีที่แล้ว ⁺²
Honestly speaking, you are a wonderful teacher! You saved me a lot of effort, to be frank. Thank you!!
@ChristophScherber 5 ปีที่แล้ว
thank you so much!
@binbinzhou6938 6 ปีที่แล้ว ⁺¹
Vielen Dank Christoph, Sie haben wirklich top erklärt! Mache weiter so!!!!!! ich bin big Fan von Ihnen!!!!
@angelamartin3884 4 ปีที่แล้ว ⁺¹
thanks a lot for your time and clear explanation is was priceless, I hope you can share more videos :) I'm looking forward to GLM part 2
@X_explorer 7 ปีที่แล้ว ⁺¹
Cool video! I like observing nature, discover and decompress...
@ditke71 10 ปีที่แล้ว ⁺³
You are an excellent teacher! thanks!
@HG1921 10 ปีที่แล้ว ⁺²
Thank you so much. A nice, clear and excellent lecture on gls and lm. I have used and applied the lesson learned on my analysis. It came out wonderful!!
I hope you may add some example videos on linear mixed models with lmer function from the lme4 package. I am sure you would give us a wonderful understanding how the Random effect structure would remove/minimize or handle spatial/temporal variations or differences as the gls does in the case of spatially/temporally correlated residual errors.
@ChristophScherber 10 ปีที่แล้ว ⁺²
Hi John,
Thanks very much for your suggestion! I just added a first video on lme models. More complex models will be covered later.
Best regards,
Christoph
@jeffatennis 6 ปีที่แล้ว ⁺¹
terrific video...this is exactly what i needed!!
@kirangadhave 11 ปีที่แล้ว ⁺²
Great video! Thanks for posting this.
@Ztube111 11 ปีที่แล้ว ⁺¹
Wasn't aware of the As.Date object...great!
@gustavotenorio4201 8 ปีที่แล้ว ⁺¹
Thank you very much! Not only for this, but also for the other videos! They helped me a lot! I think that would be a nice video if you use those really simple examples in simulations with Montecarlo...... that would be epic!
@angeld5093 8 ปีที่แล้ว ⁺¹
Thank you so much for your tutorials! Very helpful
@kvafsu225 2 ปีที่แล้ว
Very good presentation. Thanks
@melissad3390 8 ปีที่แล้ว ⁺¹
you went on my favs, very good vid
@werzenable 9 ปีที่แล้ว ⁺²
You are excellent! Keep up the good work!
Greetings from Sweden!
@ChristophScherber 9 ปีที่แล้ว
+Erik Wersäll thanks a lot!
@weltallaaf 10 ปีที่แล้ว ⁺²
Thanks for your interesting and helpful viedo.
Maybe you could use typical R conventions for your code, to keep it a little more readable?
@学锋王 10 ปีที่แล้ว ⁺¹
very helpful, thank you so much for your nice tutorial.
@rasmusstokkeland 10 ปีที่แล้ว ⁺⁶
If you could add chapters (in the video) so you could jump strait to e.g: GLM that would be great. Saves time when you just want to see GLM.
Great video by the way!!!
@ChristophScherber 10 ปีที่แล้ว
Thanks! I´ll try to implement this in the next videos.CheersChristoph
@AdrianaHernandez-bt7ys 7 ปีที่แล้ว ⁺¹
Super didactic, thank you very much!
@ChristophScherber 5 ปีที่แล้ว
thanks!
@getinettafesetucho7999 5 ปีที่แล้ว ⁺¹
Thank you for such a great job!
@terrencevance6799 11 ปีที่แล้ว
Excellent video, thank you
@petraschubert8220 6 ปีที่แล้ว ⁺¹
Which package I need to install on R-Studio for Mac to run the GLS-models? I just cannot find the right one! These tutorials are great thank you a lot. You got some didactical talent!
@fazlihaleem6603 6 ปีที่แล้ว ⁺¹
you need to install the "nmle" package
@TheAmibusy 10 ปีที่แล้ว ⁺¹
I code following in video but I can not get ACF plot . How to get it?
@mikapylvanainen8169 10 ปีที่แล้ว
Thanks Christoph for great and understandable tutorial.
Do you have a tutorial that gives guidance how to handle multicollinearity between explanatory variables in regression analysis?
If I have understood right, VIF (Variance Inflation Factor) is used as an indicator of multicollinearity and it is somehow utilized
also in R when handling milticollinearity in regression analysis.
Thanks in advance for your advice.
br. Mika
@eceviteyduran5628 5 ปีที่แล้ว
Many thanks for your invaluable information and explanation about the regression modeling. Could you please provide a bit more information for us.
Best Regards
@reynoldokudzeto5129 8 ปีที่แล้ว
Christoph, can please add more lectures on identifying non linearity when building a model and how to improve the models? Also do you have videos on DOEs and latent variable analysis (PCA etc)
@hectortamez64 8 ปีที่แล้ว
Dude, i love you. I really do. Thank you very much.
@hamzanamouchi6380 6 ปีที่แล้ว ⁺¹
Thank you very much :)
@kendramillard564 9 ปีที่แล้ว ⁺¹
Many thanks for the super useful tutorial. What did not like so much, that it could be misunderstood that the simple use of glm() instead of lm() already results in the better model, because you spent plenty of time explaining the issue of how the glm uses exponential functions. However, the superiority of the glm-fit is because of the family selection, as the output of lm() is equivalent to glm(,family="gaussian"). Which is were i stumbled. Because i have mathematically no real idea of what the link functions are actually doing. I just know how and when to apply those families. A tutorial on link functions would be really great :)
@ChristophScherber 9 ปีที่แล้ว
+Kendra Millard Thanks for these comments, I´ll provide more details on GLMs in new videos. Best regards, Christoph
@kamrantaherkhani2066 5 ปีที่แล้ว ⁺¹
Hi, why the errors for the glm model is not e^error? ok I saw later but it was not there when the model was written, thanks
@Ztube111 11 ปีที่แล้ว ⁺¹
Any plans to model MLE / likelihood ratio models? Nice work!
@Mazoke8989 8 ปีที่แล้ว ⁺¹
THANK YOU VERY MUCH!
@Cupania 11 ปีที่แล้ว ⁺¹
Good, video. How would you then graph the GLM? Can you follow-up with a video showing this? Thank you!!!
@duckwantbreads 8 ปีที่แล้ว
One easy way to add a Poisson glm would be to make a vector
y=1:20
(all the integers from 1:20) and then write
predict1=exp(coef(model2)[1]+y*coef(model2)[2]
and then write
lines(predict1)
to add the glm to your plot (you should get a curve if you've done it right). There's probably a better way but this is how I do it.
@Freidepfer 11 ปีที่แล้ว ⁺³
Do you provide any tutorial where you take more predictors into consideration?
E.g. working with 6-7 predictors which are influencing one "predicted" variable?
Besides that:
You really make great videos. Very explanatory and understandable (Like it that you also take paper and pen!)
@anettehlleland2269 7 ปีที่แล้ว
This is very helpful :)
I have one question about the video, I've recently started using R, and dose not understand when you put "as.Date(paste(1973,airquality$Month,airquality$Day,sep="-"))" and said "depending on your system", how do i know what system I use?
Thank you!
@RyanBenkeser 9 ปีที่แล้ว ⁺¹
Thank you very much
@25lobo 5 ปีที่แล้ว
A quite helpfull video.
@kosterix123 10 ปีที่แล้ว
yt viewers have the pause button available, so i don't need to see your typing errors. The video can be compressed. Good tutorial anyway, keep it going!
@kendramillard564 5 ปีที่แล้ว
This means, that once i introduce the wind*time interaction and correct for the autocorrelation, gls does not find a significant dependance between ozone and neither predictor, judged from the p-values given in summary(model6). Is this correct?
@mbanefomadukaife5958 9 ปีที่แล้ว
Thank you so much, please how can i generate several samples from a multivariate normal population and compute the sample Mahalanobis squared distance for each of them using R?
@ChristophScherber 9 ปีที่แล้ว
+Mbanefo Madukaife you use the mvrnorm function for this
@mbanefomadukaife5958 9 ปีที่แล้ว
+Christoph Scherber Thanks for much. Any existing code for that which can be run in the mvrnorm?
@LordFrega 5 ปีที่แล้ว
Just one question: why did you choose the poisson family when using the glm model?
@AirdropsManiac 6 ปีที่แล้ว
Nice videos. I find a better approximation formula for lnv! which is used in universities.
The formula they use is lnv! ≈ vlnv - v(If i remeber well). My semi-empiric formula is lnv! ≈ (v+1/2)lnv - v +1,08
@ChristophScherber 5 ปีที่แล้ว
thanks!
@tawilk 9 ปีที่แล้ว
at 16:40 roughly, wouldn't it just be e^b where b happens to be negative in this case since we see a decreasing/negative relation b/w wind and ozone.
@ChristophScherber 9 ปีที่แล้ว
+tawilk In a generalized linear model, all coefficients are on the scale of the linear predictor. Thus, you transform:
I: ln y= a + bx
II: y = exp (a+bx) = exp(a) * exp (bx)
a one-unit change in x will thus lead to a (exp b)-fold change in y.
@prasunbhattacharjee8415 9 ปีที่แล้ว
is the model given by GLM linear?
In lm model it was shown with plot in earlier video. Can we visualize something like that?
Also your presentation is very nice.
@liuyang2493 8 ปีที่แล้ว ⁺¹
Hi professor Scherber, thank you! I really enjoyed your video. Can you further show us how to implement negative binomial model if poisson model shows overdispersion?
@robertmarra883 8 ปีที่แล้ว
Dear Professor Scherber: Your videos are excellent! However, I'm wondering if you can point me in the right direction. I'm looking for a statistical method to compare modeled data against actual measured data, with the goal of calibrating the model. This isn't the place to go into the experimental details, but in brief, I'm trying to calibrate and validate a method my colleagues and I are developing for nondestructively estimating the amount of C stored in living trees. We used this method on a large sample of trees in a research forest, and want to compare estimates from our model to actual measurements obtained by felling the trees. Regressing the model estimates on the actual measurements produced a high correlation coefficient (~96%), a non-zero intercept less than 10% of the smallest model value, and a slope of nearly 1, and while this is nice to see, it doesn't really address the question of agreement between model estimate and measurement. Do you have any suggestions on how we might do this? One idea I have is to regress y-x against x, and then compare (how?) this against a y=x line.... but how to do this in R? I'm asking a lot...hope you can help! Thanks, Bob
@ChristophScherber 8 ปีที่แล้ว
Dear Rob, I think a regression of the predicted y (from linear models) vs. the observed y (from felled trees) sounds reasonable; you could then test if the slope differs significantly from 1.
@robertmarra883 8 ปีที่แล้ว
Thanks! Since I wrote you, this is exactly what I've tried, and it's quite interesting. I also calculated prediction intervals on the regression of the observed values on the predicted values, to see how well the latter can predict the former. I also used the Bland-Altman Limits of Agreement analysis.
@arnabroy7801 9 ปีที่แล้ว
Hi Christoph Scherber, Your videos are really helpful. Thanks a ton for the same. However I am facing a problem.I am unable to install the package MuMin for using AICc function. I am using R studio and R 3.1.2. it will be of great help if you can guide me to resolve the problem. Thanks in advance......
@ChristophScherber 9 ปีที่แล้ว
+Arnab Roy You may wish to write to the package maintainer if the problem is reproducible. MuMIn does interfer sometimes with other attached packages. Be sure to have the latest versions of R, RStudio and MuMIn. Best wishes, Christoph
@Jjzein 9 ปีที่แล้ว
is there any problem if we save R console in work space?
@jeffatennis 6 ปีที่แล้ว
should the windspeed and ozone measurements be normalized prior to evaluating models?
@ChristophScherber 5 ปีที่แล้ว
indeed it can be a good idea to scale variables to remove collinearity. But of course often the original scale of variables is interesting
@songbyungsub 9 ปีที่แล้ว ⁺¹
thank you!
@mdabdulhalim4422 9 ปีที่แล้ว ⁺¹
Great!
@Whowsworld 7 ปีที่แล้ว
Do you know how to do a GLM on statistica?
@askanalytics7097 8 ปีที่แล้ว
I had a question regarding Poisson Model that you performed on Ozone~Wind; While you used the model to control the negative predicted values of Y variable, the assumptions of Poisson model are not considered. The ozone content is not a discrete data which follows a poisson Distribution.
@ChristophScherber 8 ปีที่แล้ว
+Ask Analytics You are right that actually the variable is likely negative binomially distributed - as can be seen e.g. from
library(fitdistrplus)
plot(fitdist(as.numeric(na.exclude(airquality$Ozone)),"nbinom"))
While it is true that the ozone concentration in itself is not Poisson distributed, the generating measurement process clearly is a poisson process.
However, for the sake of simplicity, we model it as a Poisson glm here. Further details on negative binomial models (and how they may be used to account for overdispersion) will be added later.
Thank you very much for your comment!
@honey1761990 10 ปีที่แล้ว ⁺¹
Hi, Thanks for such a nice tutorial :)
I wish to ask one question related to Model 2 (Poisson). For example if we have two explanatory variables: one numerical and other categorical, then how we can estimate the value of Ozone?
Ozone ~ Wind+Time # Time can be 'Day' or 'Night'
Let's assume we get coefficients of Intercept as 2.3, Wind as 0.0065 and Time as 0.56.
We wish to calculate for: Wind=200 & Time=Day. so, equation will be:
Ozone_val=exp(coef(mod)[1]+coef(mod)[2]*0.0065+coef(mod)[3]* ...??
Should we multiply by 0 or 1 with coef(mod)[3]? What should we should choose?
Thanks & Best regards,
Chaitanya
@ChristophScherber 10 ปีที่แล้ว ⁺¹
This is usually a bit more complex to understand. If you have Wind+Time, separated by a "plus" sign, then you get the following estimates in the model summary:- The intercept for Wind=0 and Time="Day"- The difference in intercepts between "Day" and "Night"- the common slope for WindIt´s easiest to find out about these things using the "effects" package:require(effects)plot(allEffects(model)))Cheers,Christoph
@Pata_Negra 4 ปีที่แล้ว
@@ChristophScherber Very interesting package Christoph, thanks a lot for your video AND your comments :)
You're an awesome teacher !
@fazlihaleem6603 6 ปีที่แล้ว
I am running the following gls regression g.lm
@jonathanbruner3248 5 ปีที่แล้ว
What data set did you use, and where can I find it?
@ChristophScherber 5 ปีที่แล้ว
usually the datasets are loaded by using the data() command (the datasets are part of the base installation of R)
@getinettafesetucho7999 5 ปีที่แล้ว
Thank you!!!
@MrSimwawa 9 ปีที่แล้ว
I might be wrong but it seem to me that if
log(y) = a + bx + error
than, y = exp(a)*exp(bx)*exp(error)
as opposed to y = exp(a)*exp(bx) + exp(error)
Is there a reason why the error would be additive as you say in the video, and not multiplicative?
@ChristophScherber 9 ปีที่แล้ว ⁺³
simon venne The reason is that in a GLM the errors (e) are modeled separately from the linear predictor (a+bx). Mathematically, you are right; but the trick is that a GLM allows you to specify link function (for linear predictor) and variance functions (for errors) separately. Your equations are correct for a model where the response variable is transformed. In the case of a generalized linear model, things are a bit different, though.
@moth11110 9 ปีที่แล้ว ⁺¹
Thank you :)
@Miodiarioditrading200EMA 10 ปีที่แล้ว
hallo Professor!, Ich verstehe nicht, warum eine Variable ist besser als der andere. Haben Sie jedes Video, wo ich weiter erklären?
Entschuldigen Sie mein schlechtes Deutsch, ich bin Argentinien.
Grüße
@memoonaawan1326 8 ปีที่แล้ว
Hy can you please help me for L-moments in r????
@ChristophScherber 8 ปีที่แล้ว
Thanks, unfortunately I don´t have experience with L-moments. What do you need it for?
@pandumanne2518 7 ปีที่แล้ว
were GAM and GLM are same??
@ChristophScherber 5 ปีที่แล้ว
no, that´s different types of models
@brennanbeal2815 4 ปีที่แล้ว
Sorry, I believe your interpretations are a bit... misleading. At 16:33, you write that "In a glm, an individual slope gives an estimate of the multiplicative change in the response variable.... ". This is not always the case. This is the case for the Poisson distribution because the default link is a log link. But, you could easily run a glm with a Gaussian distribution and identity link, which would have an additive interpretation. Further, it is best practices to indicate your link, even when the default is intended (learned that the hard way :) ).
@melissad3390 8 ปีที่แล้ว
and you remind me of Stewie (the baby in Family Guy) which is awesome. :)
@Exit42A 11 ปีที่แล้ว ⁺¹
I thought this was a very good video, but you kind of hit the afterburners at the 2/3 mark.
@ChristophScherber 11 ปีที่แล้ว
Hi Exit, I´d be glad to hear how this video can be improved. Cheers!
@ayoadeyemo5271 9 ปีที่แล้ว
Christoph Scherber Thank you very much for this video....it was really useful for me
@TheAmibusy 10 ปีที่แล้ว
I code following in video but I can not get ACF plot . How to get it?
@ChristophScherber 10 ปีที่แล้ว
Where exactly do you get an error message?
@ChristophScherber 10 ปีที่แล้ว
ah, you need to replace the minus sign with a tilda sign (~).
Happy new year to you!

ต่อไป

เล่นอัตโนมัติ

Statistics with R (4) - Understanding contrasts and the model summary in R