too good, very easy to understand
Thanks and glad it helped.
On a side note, YouTube has cancelled my partner program since I don't have enough subscribers (currently around 738, should be 1000 😞), so it would really help if you subscribe (and all your friends) 😉
I did not understand that last step. Why do you use the norm.s.dist function if the Mann-Whitney U test is a non parametric test? Thank you
Well spotted, and a good question. For 'larger' samples, the Mann-Whitney U test is usually done using this normal approximation ( en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test ). An exact test is also possible, but it is a lot of work, especially in Excel. One approach for Stata can be found in this article: journals.sagepub.com/doi/pdf/10.1177/1536867X1301300208. There are also tables for the Mann-Whitney U test (see for example www.real-statistics.com/statistics-tables/mann-whitney-table/). Hope this helps.
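To make the normal approximation concrete, here is a small Python sketch of the whole computation. The data are made up purely for illustration (they are not the video's data), and the tie correction is left out for simplicity:

```python
import math

# Made-up example data for two independent groups (not the video's data).
group1 = [12, 15, 9, 18, 21, 14]
group2 = [8, 7, 16, 10, 11]

combined = sorted(group1 + group2)

def rank_of(value):
    # Midrank: the average position of `value` in the combined sorted sample.
    positions = [i + 1 for i, x in enumerate(combined) if x == value]
    return sum(positions) / len(positions)

n1, n2 = len(group1), len(group2)
R1 = sum(rank_of(v) for v in group1)   # rank sum of group 1
U = R1 - n1 * (n1 + 1) / 2             # Mann-Whitney U for group 1

# Normal approximation: mean and standard error of U under the null
# hypothesis (ignoring the tie correction for simplicity).
mu = n1 * n2 / 2
se = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (U - mu) / se

# Two-sided p-value from the standard normal CDF.
p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
print(U, round(z, 3), round(p, 4))
```

With larger groups and few ties this gets very close to the exact test, which is why most software (and the video) uses it.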
@@stikpet first link is not open
@@dintababy2018 YouTube automatically turned it into a link and included the ")" at the end. That indeed doesn't work. Fixed it. Thanks for letting me know.
Thank you for the video, would it be possible to put a link to the excel file?
Just added it for you. Link in the description.
I am struggling with the Bonferroni correction... I read that "The Bonferroni correction is an adjustment made to P values when several dependent or independent statistical tests are being performed simultaneously on a single data set". Please explain what is meant by "single data set" there. Second, can you tell me the clear assumptions of the Bonferroni correction, please?
For a single Mann-Whitney U test, I don't think you need a Bonferroni correction (BC). The BC is used if you do multiple tests on the same data, which is what "single data set" refers to. For example, suppose you do a one-way ANOVA to determine if there are differences in the mean score between categories. If there are, you might want to know exactly which categories are different, so you do a test to compare category A with category B, another to compare category A with category C, etc. For each test you usually reject the null hypothesis if the p-value is below 0.05 (5%), but this means each test carries a 5% risk of rejecting the null hypothesis even though it is actually true. With all these pairwise tests there is then a pretty good chance of making at least one wrong decision. The Bonferroni correction is one method to compensate for this risk.
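As a small illustration of the correction itself (the p-values below are made up), the two equivalent ways of applying Bonferroni look like this:

```python
# Sketch of a Bonferroni correction for k pairwise tests (hypothetical p-values).
p_values = [0.010, 0.030, 0.040]   # raw p-values from 3 pairwise comparisons
alpha = 0.05
k = len(p_values)

# Option 1: compare each raw p-value against alpha / k.
adjusted_alpha = alpha / k
decisions = [p < adjusted_alpha for p in p_values]

# Option 2 (equivalent): multiply each p-value by k, cap at 1, compare to alpha.
adjusted_p = [min(p * k, 1.0) for p in p_values]

print(round(adjusted_alpha, 4), decisions, adjusted_p)
```

Note that with three tests only the 0.010 result survives the correction, even though all three raw p-values were below 0.05.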
Very nice video. What puzzles me is step 8 - where is the "12" in the fraction coming from? Everything else is somehow connected with the tested data except for that one. What am I missing here?
It is indeed not connected to the tested data; it is always 12. It also appears in the SE formula. The 12 comes from the variance of a uniform distribution: the variance of the ranks 1 to N is (N^2 - 1)/12, and that 12 carries through into the standard error formula sqrt(n1*n2*(n1+n2+1)/12).
how do I get the U value? very helpful thanks
step 5 is determining the U value...
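In case a formula helps, step 5 boils down to the following (the numbers are hypothetical; R1 is the rank sum of the first group):

```python
# Hypothetical values: R1 is the rank sum of group 1, n1 and n2 the group sizes.
R1, n1, n2 = 45, 6, 5

# Step 5: the Mann-Whitney U value from the rank sum of group 1.
U1 = R1 - n1 * (n1 + 1) / 2

# The U for the other group follows directly, since U1 + U2 = n1 * n2.
U2 = n1 * n2 - U1
print(U1, U2)
```

So you never need to re-rank the data to get the second U value.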
@@stikpet thank you so much and I appreciate the fast reply, I am using this to double check the results I got from other software for my thesis!
If the calculated value in step 12 is the p-value, how can we reject or accept the null hypothesis? Please reply.
Usually the criterion to accept or reject is set at 0.05. So if the p-value is below 0.05, the null hypothesis is usually rejected. There is some controversy over why 0.05 is used as the threshold, but it is the one most commonly used. In some cases 0.01 is used, and sometimes 0.10, but by far the most frequently used threshold is 0.05. Hope this was what you were looking for.
Thank you so much this was super helpful
Glad it helped!
thank you! so which cell is the U value?
step 5 is determining the U value...
Very Nice video..Thanks a lot Sir
Why are we using the normal distribution during the p-value calculation? Is this a good approximation? I am using the same method for my data, but if I put FALSE in the formula, the p-value comes out different... Please help.
The 'TRUE/FALSE' in the last formula does not switch the normal approximation on or off; it tells the function to return either the cumulative probability or simply the height (density) of the normal curve. We are interested in the cumulative distribution function, so we have to use TRUE.
As for using the approximation instead of the exact test, various rules of thumb exist. One can be found on docs.juliahub.com/HypothesisTests/MUGEl/0.10.1/nonparametric/
The exact test is a lot trickier in Excel, since Excel does not have the exact distribution built in.
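To see the difference between the two options outside Excel, here is a standard-library Python sketch using the z-value of 2.84 from the example:

```python
import math

z = 2.84  # the z-value from the example

# NORM.S.DIST(z, TRUE) in Excel is the cumulative distribution function (CDF):
# the area under the standard normal curve up to z.
cumulative = 0.5 * (1 + math.erf(z / math.sqrt(2)))

# NORM.S.DIST(z, FALSE) is the probability density function (PDF): just the
# height of the curve at z, which is not itself a probability.
height = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

# The p-value needs the CDF (TRUE), not the PDF (FALSE).
p_two_sided = 2 * (1 - cumulative)
print(round(cumulative, 5), round(height, 5), round(p_two_sided, 5))
```

Using FALSE would hand you the curve height instead of an area, which is why the "p-value" then comes out as something completely different.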
@@stikpet Thanks a lot, Sir. I am reading a paper and I have a doubt about the paragraph below: "For the permutation tests, we used the Asymptotic General Symmetry Test that paired results from the same individual with and without the window condition (i.e., repeated measure) and then analyzed the difference between the conditions for each individual. To assess if the potential moderator variables (Section 3.5.5) influenced the effect of the window vs. windowless experimental conditions, we used the Asymptotic General Independence Test, which treats the moderator variable groups (e.g., male versus female) as independent samples, and tests if there is a difference in the effect of the experimental conditions between the moderator variable groups."
I would be grateful if you could explain the meaning of the "Asymptotic General Independence Test". Is he talking about the Wilcoxon rank-sum test? Paper link if needed: www.sciencedirect.com/science/article/pii/S0360132320301372
@@shortandsweet2767 sorry, I wouldn't know. Perhaps you can email the author of the article. The email address is in the link you shared.
@@stikpet OK Sir, but if possible, please tell me when we should use the exact distribution and when the approximate distribution in the Mann-Whitney test.
@@shortandsweet2767 as mentioned before, different authors have different criteria. One of them can be found in the Mann-Whitney U section on docs.juliahub.com/HypothesisTests/MUGEl/0.10.1/nonparametric/
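If you have Python at hand, SciPy's `mannwhitneyu` can compute both the exact and the asymptotic p-value, so you can see how close the approximation is for your own sample sizes (the data below are made up):

```python
from scipy.stats import mannwhitneyu

# Made-up small samples, just to compare the two methods.
group1 = [12, 15, 9, 18, 21, 14]
group2 = [8, 7, 16, 10, 11]

exact = mannwhitneyu(group1, group2, alternative='two-sided', method='exact')
approx = mannwhitneyu(group1, group2, alternative='two-sided', method='asymptotic')

print(exact.statistic, round(exact.pvalue, 4), round(approx.pvalue, 4))
```

With `method='auto'` (the default) SciPy itself picks the exact test for small tie-free samples and the normal approximation otherwise, which is one concrete version of such a rule of thumb.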
I am getting a negative number in step 5, as my R1 is lower than n1(n1+1)/2. What should I do?
Which step is then not working? It is not clear to me where a negative value would cause a problem...
@@stikpet Sorry, I had made an error while computing the formula. Thanks for your response :)
Sir, your UDF is not working in your downloaded file... It is showing #NAME?... Please help.
If you see #NAME? at a UDF, it usually indicates you have not enabled macros. If you enable macros, it should work.
If we are accepting or rejecting the null hypothesis on the basis of the value found in step 12, then what does the value in step 11, i.e. the z-value, stand for? Why is that value not considered for accepting or rejecting the null hypothesis? Hope you will reply.
Uhm, the z-value is considered. The formula uses cell M60 (although the orange box still shows M57, you can see at 10:05 that I'm actually using M60). The idea in testing is as follows:
1) We make a null hypothesis, which is a claim about the population
2) Determine a result from a sample (in this case a z-value)
3) Determine the chance of such a result as in the sample, or an even more extreme result, if the null hypothesis were true; this is the p-value (significance). In this case, the chance of a z-value of 2.84 or higher (or -2.84 or lower).
4) If this chance is very low (usually below .05 is considered low), then such a sample result would be very unlikely if the null hypothesis were true, which indicates the null hypothesis is probably not true.
This is always a bit tricky to understand, but read it carefully and it should make sense.
@@stikpet The third and fourth steps are not clear. The z-value is calculated from the sample. Then what is the logic behind checking the chance of such a z-value, or a higher one, based on the sample?
@@dintababy2018 Statisticians have proven that if you were to take a very large number of samples from almost any population, the sample means would form a normal distribution centered on the population mean. So if we have one z-value and know this distribution, we can determine how rare it is to get such a sample, or an even rarer one. Perhaps my webpage on significance would be good to go over: peterstatistics.com/CrashCourse/1-Fundamentals/4-Significance.html and my video explaining significance: th-cam.com/video/Ttbid6GoFHc/w-d-xo.html
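If it helps, here is a tiny simulation of that idea: draw many samples from a clearly skewed, made-up population and watch the sample means cluster around the population mean anyway:

```python
import random
import statistics

random.seed(42)

# Made-up, clearly non-normal (right-skewed) population.
population = [1] * 50 + [2] * 30 + [3] * 10 + [10] * 10

# Draw many samples of size 30 and record each sample mean.
sample_means = [
    statistics.mean(random.choices(population, k=30))
    for _ in range(5000)
]

# The sample means pile up around the population mean (2.4), roughly
# bell-shaped, even though the population itself is skewed.
print(round(statistics.mean(population), 3), round(statistics.mean(sample_means), 3))
```

Plotting a histogram of `sample_means` makes the bell shape visible, which is exactly the distribution the z-value is compared against.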
Is step 12 the p-value?
What is the confidence interval in your calculator?
Indeed significance = p-value.
I'm not sure what you mean by 'my calculator', and I don't think I calculate a confidence interval anywhere. I wouldn't really know what to compute a confidence interval for in a Mann-Whitney U test.
Hope this still helps.
Excuse my phrasing. I usually call spreadsheets calculators. The p-value used to accept or reject a null hypothesis is compared against a significance level, alpha.
@@aymanraouf8010 Ah, that makes sense :-)
The significance level (alpha) is usually set at .05 (although there is some debate about this), so any p-value below it is then considered significant.
So a value of more than 0.05 in Step 12 will mean that I will reject the Null Hypothesis?
Almost, but it is the other way around. The null hypothesis is rejected if the significance is below 0.05. If it is above, then we have insufficient evidence to reject it. The null hypothesis is usually that things are equal, or that there is no association. A bit more info on my website at peterstatistics.com/CrashCourse/1-Fundamentals/4-Significance.html
Hope this helps.
So the result is significant, because you got 0.004, which is less than 0.05. Am I right, sir?
There is indeed a statistically significant result: the male students thought significantly differently than the female ones.
Really helpful