I'm 95% confident my grade will be between 50 and 90
underrated comment
He did the normal distribution
Well I got a 53 💀
That's a lot of variance in your data!
Hopefully.
my test tomorrow and my brain is empty ur my last hope
same in 2021
@@ren-chon6123 Same
I think it would be important that we understand Standard Deviation of sampling size first.
@@StephtherefAmazing same in 2021 may
SAME GUYS HAHA
Now I know where my students are getting inaccurate information from. The difference between z and t has nothing to do with the sample size. It has everything to do with whether you are using a known population standard deviation or estimating it with the standard deviation calculated from the sample.
IF YOU ARE USING A SAMPLE STANDARD DEVIATION TO CALCULATE THE STANDARD ERROR (estimate of standard deviation of the sampling distribution), THEN YOU USE A T-STATISTIC, REGARDLESS OF SAMPLE SIZE. Period.
The reason for using t is to correct for the extra variability that is added when using an estimate from a sample rather than a known population parameter. Yes, as the sample sizes get large there is very little difference between the values of z and t, so a z table could be used as a good approximation. But we use computers now so the correct test to choose would be a t-test. (I know this has been taught incorrectly for years, but that is no reason to continue to do so.)
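A minimal sketch of what that looks like in practice (assuming Python with SciPy and a made-up data vector, not anything from the video): when all you have is the sample standard deviation, you just run a one-sample t-test and let the software handle the degrees of freedom.

```python
import numpy as np
from scipy import stats

# Hypothetical data: 40 measurements (illustrative only)
rng = np.random.default_rng(0)
sample = rng.normal(loc=102, scale=15, size=40)

# One-sample t-test of H0: mu = 100, using the sample standard deviation s.
# The t-distribution with n-1 degrees of freedom applies regardless of n.
result = stats.ttest_1samp(sample, popmean=100)
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```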
wow this is terrifying, back to my notes
In my "Engineering Statistics by Montgomery" textbook it says the same thing as Sal
Well. Perhaps you could educate your teacher. It is a very common misconception. I was taught incorrectly when I studied it in school. Yes, some textbooks still show that incorrectly. The thing is, with sample sizes over 30, the difference between z and t is not so large and won't matter for most cases. Still not an excuse for teaching it incorrectly. Ask your teacher WHY a t distribution is used. Best wishes on your exam.
Geez take a breath
Biostats for graduate level Epidemiology teaches the same as Sal also.
Excellent! Thank you! You're getting me through Applied Statistics for Psychology. You got me through fundies about 10 years ago too. Lol! I ended up really enjoying stats, and that was before I knew there were programs for this kind of thing. I was solving everything on paper. Phew! You make everything easy to follow.
I would like to clarify a few things:
"sample distribution" = "sampling distribution of the sample means" = "distribution the mean". Check wikipedia. Thanks to the Central Limit Theorem we know it is normal. In the video we have two means. The one is the center is the mean of all means and the other is just a mean of one sample group. So this "sample distribution" has also a standard deviation that is calculated by the formula given by Khan.
By assuming that the H0 is true, having parameters/statistics from you control group and treated group, you usually have everything to calculate the z score and draw a conclusion.
this is a terrible video, he doesn't define the assumptions
You meant that if n is small then we have a t distribution, but you pointed to s at 4:50.
You should put a note to avoid confusing anyone.
Great video overall!
this video explains it better than the 75 pages in my stats book. thank you!
I just watched an hour video from my class and you just explained it super easy.
I dont care. this man makes videos to teach us stuff that we dont even understand from our own teachers and textbooks. THANK YOU
Fantastic video, very well explained. Currently in an MBA stats class and this video cleared some things up. Thanks
me2
FINALLY a clean and short yet precise explanation. Chapeau!
Some parts of the t distribution troubled me a lot and now I finally figure it out. Thanks for sharing!
gonna fail exam tmrw FML
+TheKijib did you fail? im currently sweating bullets over here asgjkhkl
what about you? I got mine tomorrow
Mine is in 6 hours. See you all on the other side.
Mine is in 7hours 23 minutes and I’m so gonna fail. I am pulling an all-nighter .
did anyone fail/pass?
exams in 4 hours weeeeooooooooo
Mackenzie Kyryluk 😂😅 sameeeee
Haha
@TheQuietStormX
No. I teach statistics and have a Ph.D. Many texts have this point incorrect. If you have sigma, then the test statistic (x-bar - mu)/(sigma/sqrt(n)) either follows a standard normal distribution (if the population is normally distributed), or is approximately normal if n is sufficiently large. The t arises because the test statistic has not one, but two random variables (x-bar and s). The added variability from s generates the fatter tails of the t distribution.
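A quick simulation (just a sketch, assuming Python with NumPy/SciPy; the sample size and replication count are arbitrary) illustrates the point: with sigma in the denominator the statistic's tails match the standard normal, while plugging in s produces the fatter tails of the t.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, sigma, n, reps = 0.0, 1.0, 5, 200_000

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)
s = samples.std(axis=1, ddof=1)               # sample standard deviation

z_like = (xbar - mu) / (sigma / np.sqrt(n))   # sigma known: one random quantity (x-bar)
t_like = (xbar - mu) / (s / np.sqrt(n))       # sigma estimated: two random quantities (x-bar and s)

# Tail probability beyond the z critical value 1.96
print("P(|Z| > 1.96) ~", np.mean(np.abs(z_like) > 1.96))   # about 0.05
print("P(|T| > 1.96) ~", np.mean(np.abs(t_like) > 1.96))   # noticeably above 0.05 for n = 5
print("theoretical t tail:", 2 * stats.t.sf(1.96, df=n - 1))
```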
What's the difference between the 'sample standard deviation' and the 'standard deviation of the sampling distribution'?
The standard deviation of any one sample is the 'sample standard deviation'. If sampling is repeated multiple times, we get a sampling distribution (which follows a normal distribution as per the CLT). The standard deviation of that sampling distribution is the 'standard deviation of the sampling distribution'.
STANDARD DEVIATION OF SAMPLING DISTRIBUTION is the standard deviation of the means of random samples taken repeatedly.
@@catalystamlan I knew the answer but your reply is heartwarmingly well-written I had to re-read it again ♥
What would be great would be if you could explain why your comment is true with the same clarity that Khan brings to the subject. I don't doubt that it's possible that Khan is wrong, but the whole reason his videos are popular is because he is able to make the subject understandable where their professors have failed to do so.
@NSJ218 I think when you refer to the "n is large enough rule", you're referring to the n>30 rule(?) which agreed, has nothing to do with the CLT. However, the CLT speaks directly about how the sums (or means) of iid random variables distribute, and thus speaks to how the distribution of the sample mean distributes as n approaches infinity.
If you are using (x-bar - mu)/(sigma/sqrt(n)), then sigma must be known and the test statistic follows a standard normal distribution. If you don't know sigma, and use s, then you have (x-bar - mu)/(s/sqrt(n)), which is distributed as a t, with n-1 degrees of freedom. But, to answer your question, if the population is not normally distributed, but your sample size is large enough, then you are invoking the CLT, whether you are using a standard normal or a t.
NSJ218 is right! To choose z depends on the assumptions of normality (CLT is invoked) and when sigma is known. To choose t or z does not depends on sample size alone.
@NSJ218 I think you are referring to when he misspoke at 5:00. He says "when this is small" and points to "s". It's pretty clear he meant to say "when n is small" since he is already talking about the n>30 rule. Also, it's a little weird to say, use t if you don't know sigma. The CLT is essentially telling us that we don't need to if our sample is large enough (unless you needed the answer to be *extremely* accurate).
At 4:51, Sal draws an arrow towards s, but it should have been towards n.
Thank you for your wonderful videos! I'm taking a Quantitative Analysis & Decision Making class and your videos have been so helpful.
Are you still alive?
It's 12 am and I just got off work, I don't pay attention at all in stats but I have a test tomorrow. Let's see how well you can teach me this lol
Very good basic teaching , you built it from the beginning so people can understand thank you
This is an excellent presentation. For all practical purposes, if N < 30, use the "t" table. Since the "P" value is so easy to use in "Excel", there is no reason to use just the "Z" statistic.
If you know the standard deviation, then you would be overestimating needlessly. It is not about the sample size. (The "30" comes from the fact that as the sample size increases, the z and t distributions are almost the same [this can be seen from t and z tables]. However, if the population is normally distributed, then a sample from that population is more likely to follow the normal distribution shape with a sample size less than 30, so there is no need to use the larger t value.)
Yes, you sir are right!
If sigma is known, use z; if not, use t. It's not related to n.
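On the "30" point above, here is a tiny sketch (assuming Python with SciPy) that prints two-sided 95% critical values; it shows how quickly the t value approaches the z value of about 1.96, and why 30 is only a rough rule of thumb.

```python
from scipy import stats

z_crit = stats.norm.ppf(0.975)                 # about 1.960
print(f"z critical value: {z_crit:.3f}")
for n in (5, 10, 30, 40, 100, 1000):
    t_crit = stats.t.ppf(0.975, df=n - 1)      # t critical value with n-1 degrees of freedom
    print(f"n = {n:5d}  t critical value: {t_crit:.3f}")
```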
I think the Z statistic is used when you know the mean and sigma, and the T statistic when you don't know sigma. However, when in both cases you do know sigma (whether by really knowing it or by estimation), you differentiate the two because of the size of n. (sorry my english sucks)
Well written and spoken. The "small N" test is of great value to verify the "Research Hypothesis". If you have a small group of men, and they must conduct their own preventive study - the results can be very convincing. Thanks!
Thanks for the video. I'm in a basic statistics class and I was having trouble with determining when to use one over the other. My teacher just kept spouting off jibberish over and over hoping I'd eventually get what he was talking about but this summed it up nicely.
@palui
That's not what the CLT is saying. Some texts are incorrect on this point. If you use s, it introduces additional randomness in the test statistic. As n gets large, s targets sigma with less variability and a t approaches a z. But the 'n is large enough' rule is not referring to this point. It is referring to the distribution of the sample mean.
this man has been my tutor since high school now I'm in uni lol
Summary:
Use Z-test if n >= 30
Use 1-sample t-test if n < 30
What about when you just have "σ", don't have "s", and n < 30?
Use the Z statistic
@@HAO-io6pb You don't need s if you have the actual std deviation. You can never have a situation where you can't calculate "s" for sample data, but regardless there is no need to if you have the actual std deviation. Use z in this case.
I have the same question as @HAO. Can we really use the z statistic with a sample size of less than 30?
Yay I wish you were MY stat teacher!!
Thanks for that explanation. I find that books on statistics do not always make the difference between this very clear, or do not explain why things change when the sample size gets to 30.
Im so happy each time i need some math information, google it and come to Khan vids
It's refreshing to see NSJ218's comments. I am a statistics instructor and I am not convinced statistics should be taught to students that don't know calculus WELL.
Thank you❤ you just blessed my soul ….. clarity 😅
am i on crack or is there no difference
The formula is basically the same, but you will use different tables (z-table or t-table) depending on the sample size (sample > 30 => z-table, sample < 30 => t-table).
@@arsenyturin hi, I wanna ask why don't we just use t-statistic? Would it be accurate all the time? Or is there some sort of unknown statistical anomaly for sample size greater than 30 that we need to use z statistics? hahaha
because when your degrees of freedom are large enough, both statistics are very close, lol
@@oneinabillion654 It doesn't depend on the sample size, but whether the variance is known or not.
@@oneinabillion654 t table essentially widens your answer's interval... z table would create a more narrow interval so would be better if you can do it (if sample size is greater than 30)
Very nicely Explained🙏🙏🙏❤
Excellent! I like the detail of the calculations.
Question: sigma/sqrt(n) is the sample standard deviation. If we do not know sigma but we know s (the sample SD), why are we dividing it by sqrt(n), when sigma/sqrt(n) is the sample SD only?
Greatest and easiest explanation thanks sir......
I couldn't understand the difference between the standard deviation of the sampling distribution and the sample standard deviation... I mean "sigma_x" and "s"...
It might be too late for someone to answer this, but are you talking about the standard error of the mean and the standard deviation of samples?
Substantively, this is a very effective video in explaining the concept, but I found the speaker’s tendency to repeat himself (literally repeating phrases unnecessarily) pretty irritating, because it’s something that could have been edited prior to posting.
I think there is some relationship between t and n. From the derivation of the t-distribution by Bayesian statistics, the assumption is that mu is conditionally normal with mean xbar. To make this hold, n>30 is needed to apply the CLT. And when n is large enough (n>100 maybe), t is very similar to z. As t is harder to calculate than z, usually when n>100 z will be used instead of t.
Here, is S the standard deviation of any one sample, or of the mean of the sample population?
Wrong! The derivation of the SE of the mean is not a result of the CLT. It is the result of the sample being i.i.d. random variables, along with some basic math stats. The correct rules are: (1) if the population is not normal, then you need the sample size to be of sufficient size (general guideline is >30) to invoke the CLT to state that the distribution of the sample mean follows a normal distribution. (2) If sigma is known, use z; if not, use t. It's not related to n.
I hope someone can answer this because I'm kinda confused on the vocabulary he used....
What is the difference between the standard deviation of the sampling distribution and the standard deviation of the sample?
I think the SD of the sampling distribution is the SD across all samples in a distribution, whereas the SD of THE sample is just the SD of one of the samples.
The sampling distribution of a parameter is the distribution of that parameter. A sample distribution applies to data; a sampling distribution applies to parameters that you compute using the data. For instance, you can have a distribution of heights of students in your group. If you calculate the mean height of your group, then go on to take more samples, measure the mean height in each sample, and construct a histogram of mean heights, you will get to see a normal distribution of the sample means, or the sampling distribution of mean heights...
It's basically the standard deviation of the sampling distribution of the sample mean, and it's different from the standard deviation of the sample.
The standard deviation of the sampling distribution, aka the standard error, is the 'spread' of the statistic of concern (usually the mean). The sample standard deviation is the spread of the individually observed values from which the statistic is calculated. The former is the latter divided by the square root of the sample size. An example: if you had a sample of billions of people, the standard error of the mean height would be tiny, because if you repeatedly selected a different sample of that size and averaged everyone's heights, you would not get a very different mean. However, the range of individual heights in your sample would probably vary greatly, i.e. the sample standard deviation is large.
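A short simulation (a sketch, assuming Python with NumPy and made-up population parameters) makes the distinction concrete: the standard deviation of many repeated sample means comes out close to the population standard deviation divided by sqrt(n), while the spread of individuals within one sample stays near the population value.

```python
import numpy as np

rng = np.random.default_rng(2)
pop_mean, pop_sd, n, reps = 170.0, 10.0, 50, 100_000   # e.g. heights in cm (hypothetical)

samples = rng.normal(pop_mean, pop_sd, size=(reps, n))
sample_means = samples.mean(axis=1)

print("SD of one sample (individual spread):", samples[0].std(ddof=1))        # near 10
print("SD of the sample means (standard error):", sample_means.std(ddof=1))   # near 10/sqrt(50)
print("sigma / sqrt(n):", pop_sd / np.sqrt(n))
```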
Watch the video on the Central Limit Theorem.
No, you shouldn't use the t if you know sigma. The t is based on the fact that the denominator is random, not fixed. But your question is a good one, because what if your parent population was far from normal, you had a sample of size n=6 but you knew sigma. The sampling distribution for n=6 wouldn't be very normal so to use a z would be fatal. Using a t wouldn't be correct either because you know sigma. In this case a larger sample size would be the best way out.
Great video,thank you!
Thank you!!!!! U explained something I was struggling to get!!
Hmm, I'll try applying what I've learned. That should be warning enough that I'm not necessarily correct.
Sigma isn't given, only Xbar. However, since the sample is large enough to be covered by the CLT (>30), then a z-distribution is still valid.
I think.
Just like palui said, a bigger sample size is usually better. Otherwise, you can approximate a distribution for the sample, which is much harder and requires more skills and experience.
Thank you very much
Now i understood the difference between z statistic and t statistic.
Phenomenal work!!!
Very clear and helpful. Thank you
What's the difference between S and Sigma x? Aren't both the standard deviation of our sample?
Makes so much sense!!! Thank you so much!!!
Very nice, great addition, gives clarity to my study
Thank you
Or, just use a t-table for everything, using the proper degrees of freedom (you'll likely be using a calculator, anyway, rather than an actual table).
The symbols in the numerator of the z-statistic should be the reverse, right? It is the mean of the sample distribution minus the expected mean (x bar).
Sal, you are a hero.
You're really awesome ❤️❤️
You yourself say that you need to invoke the CLT to state that the distribution of the sample mean follows a normal distribution. Then you are invoking the CLT indirectly to show that the test statistic is (x-bar - mu)/(sigma/sqrt(n))?
2:30, "the mean of the sampling distribution of the sample means " , in some other cases is just the "mean of population", right ?
bump
it is the same since they assume the two would be the same
They essentially "not exact" equate to eachother when get many multiple of large samples n>30 that create normal distribution that is not biased or skewed left or right.
I like this guy, i like this guy, his presentations are excellent, excellent. i do wish that he could break the habit, break the habit, of such FREQUENT repetition -- surely he is unaware of how distracting this is, It becomes, becomes annoying.....
Good video. SAS, short and simple
Can you do a video of assumptions and requirements of the t test?
Thanks a lot, very helpful tutorial
nice video! so we use the z statistic when the sample size is more than 30 and we use the t-statistic when sample size is less than 30? Is the sample size the only determining factor?
Great explanation thank you very much!
I thought the use of z statistic or t statistic depended on whether or not sigma is known; z statistic is used when the population parameters (mean and sigma) are known while t statistic is used when the sigma or/and population mean is unknown. (Sample characteristics only matter in determining what type of t statistics you are using)
Not necessarily... if you're estimating the population proportion from an experiment (binomial) you can estimate the Population Standard Deviation based on the results of the experiment -> SD/SQRT(n) -> use Z.
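Here is a sketch of that proportion case (assuming Python with SciPy and made-up counts): the standard error is estimated as sqrt(p(1-p)/n) from the experiment's results, and the test statistic is compared to the standard normal.

```python
import math
from scipy import stats

successes, n = 55, 200        # hypothetical binomial experiment
p_hat = successes / n
p0 = 0.25                     # hypothesized proportion under H0

# SE estimated from the observed proportion; some texts use the null value p0 here instead
se = math.sqrt(p_hat * (1 - p_hat) / n)
z = (p_hat - p0) / se
p_value = 2 * stats.norm.sf(abs(z))      # two-sided p-value from the standard normal
print(f"p_hat = {p_hat:.3f}, z = {z:.3f}, p = {p_value:.3f}")
```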
good video
This video reinforces a common misconception about the use of the t-distribution. The sample size actually has nothing to do with it. It's true that if you're a student taking a basic stat class, a small sample size indicates that your teacher expects you to "pick" the t; however, that reason is flat out wrong, and in real life this will get you in trouble. Likewise, if your n is over 30 but sigma is unknown, it is still the t and not z (although this is a much less severe mistake).
@angelusp777
Use Student's t distribution. If you compare the critical values from a t distribution to those from the z, you will see that they are not identical until n gets relatively large.
As N approaches infinity, t approaches Z
Thank you. Please, I would like a reference for your point that a sample of more than 30 follows a normal distribution.
shouldn't we use t-test when population variance is unknown, no matter the sample size?
Very helpful seriously. Best method to give a quick revision and learn basic concepts.
Thankyou khan academy
Thank you for interpreting the difference between the Z and T tables. The statistics book I read did not make the difference clear.
I've got to give it to you man... it was straightforward and the best explanation... thanks a lot
So the formulas are the exact same, but you use a different table (z table or t table) depending on the test, correct?
If you know the sigma and the sample is bigger than 30 use the z score
A sample of 43 students from the agriculture faculty take a Scholastic Aptitude Test the sample has a mean of 520 and a standard deviation of 8. Construct a 95% confidence interval that contains the true population parameter.
would you use the t-distribution for this question? thanks!
Z
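For what it's worth, here is a sketch of both intervals for this question (assuming Python with SciPy, and treating the 8 as a sample standard deviation); with n = 43 the t-based and z-based intervals differ only slightly, which is probably why answers in this thread differ.

```python
import math
from scipy import stats

n, xbar, s = 43, 520.0, 8.0
se = s / math.sqrt(n)                    # estimated standard error of the mean

t_crit = stats.t.ppf(0.975, df=n - 1)    # about 2.018
z_crit = stats.norm.ppf(0.975)           # about 1.960

print("t interval:", (xbar - t_crit * se, xbar + t_crit * se))
print("z interval:", (xbar - z_crit * se, xbar + z_crit * se))
```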
Can someone say what the difference is between sigma x bar and s?
Ty
You need infinite degrees of freedom for the t-distribution to be normal; one cannot say that if n is greater than 30 then it will be normal, because even at 100 degrees of freedom the t-distribution is not yet normal. For the normal you get the value 1.96, which requires infinite degrees of freedom, not just greater than 30. This is wrong.
Thank you!
Then explain it clearly so that everyone can understand.
You are great
I don't understand what the difference is between the Standard Deviation of the Sampling distribution (1:43) and the Sample standard deviation 's' (3:48) ?? Anyone?
Watch the video on the Central Limit Theorem.
Can anyone please help me in avoiding the confusion from a simple problem in statistics which is haunting me from few days...
Q). A sample survey of tax payers belonging to business class and professional class yielded the following results:
Business class: sample size n1 = 400, defaulters in tax payment x1 = 80
Professional class: sample size n2 = 420, defaulters in tax payment x2 = 65
Test the hypothesis at α = 0.01 level of significance that defaulter rate is the same for the two classes of tax-payers.
I will be thankful to you if you can help me coming out of this in identifying null and alternate hypothesis and standard deviation.
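Not a definitive answer, but here is a sketch of the usual two-proportion z-test for this problem (assuming Python with SciPy): H0 is that the defaulter rates are equal, H1 that they differ, and the standard error uses the pooled proportion.

```python
import math
from scipy import stats

# Business class vs professional class
x1, n1 = 80, 400     # defaulters, sample size
x2, n2 = 65, 420

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)                         # pooled proportion under H0: p1 = p2
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z = (p1 - p2) / se
p_value = 2 * stats.norm.sf(abs(z))                    # two-sided p-value
print(f"z = {z:.3f}, p = {p_value:.3f}")               # compare p to alpha = 0.01
```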
2:34 "divided by population mean" i think u meant "population standard deviation"
Thanks a lot
dam it i fell asleep. Stats is sooooo boring!!
ikrrrr!!
@@jiey5375 das my pp lol
So we don't know the standard deviation of the sample or the population. Then we estimate the population standard deviation using the sample standard deviation which we don't have??
nice, but too fast at times, and the jargon becomes hard to follow, maybe easier to do if explained backwards
thanx man
At n=40, the critical value for t at a 0.05 significance level is 2.023. For the z, it is 1.96. Using 1.96, when it is really 2.023 is voluntarily introducing a 3% error. I would hardly call that a good approximation. If you think it is, can I borrow $2023 dollars from you today, and repay you $1960 tomorrow? By your logic, they are approximately the same.
What if the sample size is less than 30 but it is still normally distributed? What should we use?