why is finding information this clean and organized about statistics is so hard? even the textbooks have confusing languages, inconsistent notations etc. thank you so much for all your hard efforts. your videos are invaluable resources.
I absolutely agree with you
Hi Dr Jeremy Balka,
The entire JB Statistics video series is a truly outstanding work.
Many thanks for making your work public so that people like me can benefit from it.
Cheers
OH MY GOD! I FOUND HEAVEN ON YouTube!
I was so scared to fail COMM 215, as I was not understanding anything about this damn course, and then I found your channel!
You are a genius! You saved me, as you explain things so well! I am French and not used to technical English, so I wasn't following well in lectures. Thanks to you, I CAN NOW UNDERSTAND THESE DAMN CHAPTERS FOR THE FINALS! THANK YOU SIR! YOU ARE MY HERO!
God bless you, Sir. You have the most easy-to-follow explanatory statistics channel on YouTube. Wish I could rate more than 5 stars. Thank you so much ❤️
Thank you very much for taking the time to do this, it is very much appreciated. All the concepts are perfectly explained and generally done much better than my 2 hour long university lectures!
You are very welcome. I'm glad you found my video helpful!
Thank you so very much for making your lectures available. It is very helpful getting these excellent explanation at my own pace.
You're very welcome. I'm glad to be of help!
I've decided to stick with this channel. Very closely related to reality, logically explained, and useful. Wish I had found it sooner.
Thank you for saving my Probability and Statistics course grades!
It's wonderful to have your materials in addition to my lectures. It's simple to understand and very helpful indeed. Thank you
You are very welcome! I'm glad you find my videos helpful.
You're an awesome Professor! There are not many people out there who would take the time to do this for their students. Thanks for making statistics easier to understand!
Love your channel. Thank you for the hard work!
+Jrnm Zqr You are very welcome. I'm glad I could be of help!
Thanks man, love the simplicity of your videos! Cheers
Thanks! It's so much easier to learn statistics with your help!
your voice is so cool
Thanks!
I was so confused about regression but now it seems very simple
Thank you!
Thank you for making these videos they were extremely helpful in my learning of the content.
+juji432 You are very welcome! All the best.
Thank you so much!! I didn't know error was assumed to be normally distributed so I was confused for the longest time
I'm glad to be of help!
solidarity for Canadians who call zero 'nought'
Thanks a lot, sir. The most informative video I've seen on all of YouTube. Please also provide a video on the likelihood function.
+Rupesh Wadibhasme Thanks for the compliment Rupesh! And thanks for the suggested topic. I do hope to get videos up on the likelihood function and maximum likelihood estimation, but time is a little short these days. All the best.
7:13 If two points determine a line, then once you know (x̄, ȳ), the mean point, can't you just use the slope to determine the value of y for a given change in x above x̄, so that the y-intercept isn't needed to make a second point? Sometimes x = 0 isn't practical to assume either, as when x is the price of an ounce of gold and y is the price of ten ounces of copper in a scatterplot.
Outstanding, very clear.
Didn't know about the last part, great explanation!
Thanks!
Great help for finals week
Thanks! This project has just about killed me, but it seemed like a good idea at the time :)
Thank you for the very informative video that simplifies a complex subject, making it easier to grasp the basics of regression. I am unsure about the timeline 4:33 to 4:35: while discussing the formula for calculating beta-one hat, should the denominator be the sum of squares of X (SSxx) or the variance of X? Because if it is the variance of X, then SSxx is divided by n-1 (degrees of freedom).
Everything I say there is correct. The covariance is the sum of products divided by n-1, and the variance of x is the sum of squares of x divided by n-1. The n-1 terms cancel.
@@jbstatistics thank you for clarifying.
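The cancellation in the reply above is easy to check numerically. Here is a small editor-added sketch (made-up numbers, not from the video) showing that SPxy/SSxx and the sample covariance divided by the sample variance give the same slope:

```python
import numpy as np

# Made-up illustrative data
x = np.array([1.0, 2.0, 4.0, 5.0, 7.0])
y = np.array([2.1, 3.9, 7.8, 10.2, 14.1])

SPxy = np.sum((x - x.mean()) * (y - y.mean()))  # sum of cross-products
SSxx = np.sum((x - x.mean()) ** 2)              # sum of squares of x

b1_ss = SPxy / SSxx                                      # SPxy / SSxx
b1_cov = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)  # sample cov / var

# The two are identical: both numerator and denominator of the
# second form are just the first form with an extra 1/(n-1), which cancels.
print(b1_ss, b1_cov)
```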
Your videos are very helpful. Big thanks!
You are very welcome!
_You don't yet know how to fit that line but I do_
Thanks for making statistics kinda fun :)
you are a big help! oh my goodness! thank you so much!!! :)
Thanks!
4:36
If we can solve for beta0 and beta1 using the equations beta0 = mean(y) - beta1*mean(x) and beta1 = cov(x,y)/var(x), why should we use OLS instead?
We're not solving for beta_0 and beta_1, as they are parameters whose true values are unknown. We are solving for the least squares estimators of beta_0 and beta_1. At 4:36 I'm referring to the sample covariance of X and Y, and the sample variance of X, and just giving another way of expressing the formula we just derived.
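To illustrate the reply above, here is an editor-added sketch (hypothetical data) computing the estimates from the sample covariance and variance, and checking a consequence of b0 = ȳ - b1·x̄: the fitted line passes exactly through the point (x̄, ȳ):

```python
import numpy as np

# Hypothetical data for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 2.9, 4.2, 4.8, 6.1])

# Least squares estimates written with the sample covariance and variance
b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()

# Because b0 = ybar - b1*xbar, evaluating the fitted line at xbar
# returns ybar exactly.
print(b0 + b1 * x.mean(), y.mean())
```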
You're welcome!
Thanks!
Great video, cheers!
Doesn't everything seem that way at the time. Speaking of which, after all those hours of studying and review, guess who forgets to bring a calculator to the exam last night. This guy.
Excellent explanation, but What is the interpretation of model equation?
Hi, I'm wondering how to approximate the unknown b of a Rayleigh-distributed random variable using least squares, given some values that the random variable takes. Could you give a short explanation of that?
your videos are amazing may ALLAH bless you.
Hi, thank you for this video. One question: at 6:44, the video says the residuals must sum to zero for least squares regression. Why is that? The residuals are just minimized, so couldn't the sum be non-zero? Can you explain?
Suppose that the average of the residuals was 2 (the sum would be 2 times however many points there are). That means you could move the line up vertically by 2 and have a better fit to the data points.

For a simple example, imagine two points: one with a residual of 4, and another with a residual of 0 (it is on the regression line). Then the sum of the residuals is 4, and the mean of the residuals is 2. But we can do better than this by moving the regression line up to go between these points (rather than directly through one of them). In that case, the residuals would become -2 and 2, respectively, and their sum would be 0.

You can see this also by looking at the sum of the squares of the residuals. In this case, the sum of the squares of the residuals is 0^2 + 4^2 = 16. That is large compared to what we get if we move the line up by 2 so that it goes between the two points. Then the sum of the squares of the residuals is (-2)^2 + 2^2 = 8. This is really easier to illustrate by drawing points and lines, so I hope you try that yourself.
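The two-point example in the reply above can be verified directly. This is an editor-added sketch, not part of the original thread:

```python
import numpy as np

# The two-point example from the reply: residuals 4 and 0
resid = np.array([4.0, 0.0])
sse_before = np.sum(resid ** 2)   # 0^2 + 4^2 = 16

# Shift the line up by the mean residual (2): residuals become 2 and -2
shifted = resid - resid.mean()
sse_after = np.sum(shifted ** 2)  # (-2)^2 + 2^2 = 8

print(shifted.sum(), sse_before, sse_after)  # 0.0 16.0 8.0
```

Subtracting the mean residual always drives the residual sum to zero while reducing the sum of squares, which is why the least squares line must have residuals summing to zero.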
Hi, I don't understand why b1 is SPxy/SSxx. Can you please explain?
th-cam.com/video/ewnc1cXJmGA/w-d-xo.html
Could you explain why « (Sx)^2 », « (Sy)^2 » and « Cor(x,y) » are divided by « n-1 » and not just « n »? And by the way, your videos are the best explanation on this subject! Definitely a life saver. Keep up the good work =D
Thanks for the compliment! I have a video that discusses the one sample case:
The sample variance: why divide by n-1. It's available at th-cam.com/video/9ONRMymR2Eg/w-d-xo.html
Thank you very much for your reply. So kind ! I'll watch it =D
Good videos, I learn a lot and they clear up my concepts easily.
I'm glad to be of help!
very nice...thanx.
Hi, I like your videos. I had a question: I know that the values you list for b1 and b0 work when the errors follow N(0, var(x)). My question is, what would the least squares estimators for b0 and b1 be if the errors follow N(0, 2x)?
The least squares estimators are the least squares estimators -- they are the same formulas regardless of the distribution of the errors. The *properties* of the least squares estimators depend on what the distribution of the errors is. Are you asking what would happen if the variance of the epsilons increases with X? If there is increasing variance, and we ignore that, then the resulting least squares estimators (the usual formulas) will still be unbiased, but the reported standard errors will be smaller than they should be.
Thanks
Hi, I like your videos. I had a question: I know that the values you list for b1 and b0 work when the errors follow N(0, var(x)). My question is, what would the least squares estimators for b0 and b1 be if the errors follow N(0, 2x)?
The least squares estimators are still the least squares estimators, regardless of whether the variance of y is constant or has some relationship with x. If we use our regular least squares estimators in a situation where the variance of y is non-constant, then the estimators are still unbiased but the standard errors will be off (and we thus may have misleading conclusions in our statistical inference procedures). If the assumptions of the model are all met, except for the fact that the variance of y is changing with x, then weighted regression will take care of that. In weighted regression, the notion is that points that have a high variance in the random variable y contain less information, and thus should receive less weight in the calculations. We typically weight by the inverse of the variance.
Okay thanks, so under the weighted transformation the estimator for B1 would be (x'wx)^(-1) x'wy where the w matrix has (1/2xi)^2 for its diagonals?
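The weighted calculation described in the reply can be sketched numerically. This is an editor-added example with simulated data (not from the thread); note that weighting by the inverse of the variance means the diagonal of W holds 1/(2x_i) here, not (1/(2x_i))^2:

```python
import numpy as np

# Simulated data where Var(epsilon) = 2x, matching the question above
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 200)
y = 1.0 + 2.0 * x + rng.normal(scale=np.sqrt(2.0 * x))

X = np.column_stack([np.ones_like(x), x])  # design matrix [1, x]
W = np.diag(1.0 / (2.0 * x))               # inverse-variance weights

# Weighted least squares: beta_hat = (X'WX)^{-1} X'Wy
beta_hat = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
print(beta_hat)  # estimates of (intercept, slope)
```

The weighted residuals are orthogonal to the columns of X, which is the defining property of the weighted least squares solution.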
I love my jbstatistics, my superhero
What's the difference between *Random error component* and *Residuals*?
Epsilon represents the theoretical random error component (a random variable). The residuals are the differences between the observed and predicted values of Y.
So epsilon is basically a random variable which gives the disturbance (values) from the mean (the regression line), and a residual is an element of the random error component?
In other words, a residual is a subset of the random error component?
Also, a residual is one of the many disturbances from the regression line for a given X?
Please correct me if I am wrong.
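The distinction in this thread shows up clearly in a simulation. In this editor-added sketch (hypothetical model, not from the video), we know the true epsilons only because we generate them; in practice they are unobservable, and the residuals merely estimate them:

```python
import numpy as np

# Hypothetical true model: y = 1 + 2x + epsilon
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 30)
epsilon = rng.normal(scale=1.0, size=x.size)  # random error component
y = 1.0 + 2.0 * x + epsilon

b1, b0 = np.polyfit(x, y, 1)                  # fitted least squares line
residuals = y - (b0 + b1 * x)                 # observed minus predicted

# The residuals differ from the epsilons, and (unlike the epsilons)
# they sum to zero for a least squares fit with an intercept.
print(residuals.sum())
```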
Can you do that in Excel?
Hi jb, just curious: why don't any of your videos have ads? I love it!!!!
+Lixi W I don't enable ads for a number of reasons. The main one is that I'm simply trying to help people learn statistics, and forcing people to watch 5 seconds of an ad before getting some help just feels wrong. And the amount of revenue would be pretty small (forcing people to watch a video ad 3 million times just so I can get $2k or so taxable dollars just doesn't add up to me).
S / O to @jbstatistics for not being a sell out!
+Luis C Thanks Luis!
so does the computer just guess at random?
I don't know what you're asking. If you clarify, I might be able to answer. Cheers.
@5:00 what formula does the computer use to identify the slope/intercept of y?
The software calculates the sample slope and intercept using the formulas I discuss earlier in the video (at 4:09).
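As the reply says, the software isn't guessing. This editor-added check (hypothetical numbers) computes the slope and intercept from the SPxy/SSxx formulas and confirms a library fit reports the same values:

```python
import numpy as np

# Hypothetical data
x = np.array([2.0, 4.0, 5.0, 7.0, 8.0])
y = np.array([3.1, 5.2, 6.8, 8.9, 10.3])

SPxy = np.sum((x - x.mean()) * (y - y.mean()))
SSxx = np.sum((x - x.mean()) ** 2)
b1 = SPxy / SSxx               # slope from the closed-form formula
b0 = y.mean() - b1 * x.mean()  # intercept

# A statistics package computes the same closed-form solution:
slope, intercept = np.polyfit(x, y, 1)
print(b1, slope)
```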
the sum of PRODUCTS
My teacher uses b0 & b1 as â and b^
we don't know how to fit the line but I DO. LOL
I love your videos, which combine a concise knowledge structure with a sexy voice >.
Beauty
Haha, today I found out we go to the same university, Professor Balka xD
Yes, that happens a lot :)
dude sounds like max kellerman lol
tmkc
This was not your best
Thanks!
You're welcome!
thanks