That mindset applies to other academic and professional areas too. I also tell this to my students: “it's not about memorizing everything, it's about knowing how to find the answers when you are faced with a problem”.
Stats teacher here, I just found your channel and I love it. Years ago I was at a conference, in a mini-session on teaching p-values that turned into something more like a group therapy commiseration over the near impossibility of getting students to understand p-values and hypothesis testing. One participant admitted that he was to the point where he'd tell his class "if you remember nothing else from this, just remember that p < 0.05 means it's significant". I shared, in response, that I tell my students "if all you remember is p < 0.05 means it's significant, I'd prefer you forget that too". Of course, hardly anyone understands p-values and hypothesis testing because they make little intuitive sense, and most textbooks present a mashup of two competing approaches that should never have been mashed together (a binary decision-making procedure vs. quantifying strength of evidence against a hypothesized model). We're effectively teaching students to recite a magical incantation, written in a language they don't speak. Not a great use of anyone's time. I agree with your cooking analogy. There's never a "correct" way to analyze data, just as there's no "correct" way to cook a meal. There are going to be lots of good options, along with limitless bad options, and getting better at statistics means getting better at finding a good option or at least avoiding a bad one. This is necessarily subjective, despite looking like "math", because we're mathematizing value judgments about what we think is interesting and then asking the real world to provide us a collection of numbers we can use to approximate a solution. It's all very messy, but with any luck we end up properly satiated in the end. Anyway, keep fighting the good fight!
They'll come around soon enough. A couple more seismic failures in statistical analysis predicting political and economic outcomes and people will wake the f up and try to educate themselves.
Thank you for this wonderful video, Dustin! I suffer from imposter syndrome in grad school and I have those lightbulb moments... Their frequency is increasing in the final 3 classes of my program and I always feel like I'm "late to the game." I really needed to hear this.
I'll tell you something one of my mentors (Robert Terry) said to me. After I said to him, "I don't feel like I'll know anything by the time I graduate," he said, "Oh you won't. That's okay. Nobody knows anything when they graduate. But you'll get there afterward."
I think the issue you are describing is the result of students being taught to the test from elementary school. This isn't specific to stats, it's become endemic to education. I've had students ask me to tell them exactly what they need to know and how to do it so they can pass my class and promptly forget the material so they can cram what they need to pass the next class into their head long enough to pass that class's tests. I chimed in on that thread you referenced. Hearing your discussion here makes sense because I was hooked on data and data analysis and once I had an understanding of some of the basics, I needed to take some classes and ones I needed weren't being offered, so I took calculus and linear algebra and a bunch of the stuff I did in those earlier stats classes made so much more sense once I had a better grasp of the mechanics of why they worked. I am definitely in camp calculus and linear algebra, but after hearing this and thinking on how I came at calculus and linear algebra, I was in the perfect position to absorb that information and fit it into my own growing data analysis nomological network.
"Teaching to the test" has problems, but not having assessments that can be applied evenly and relatively inexpensively is a major limitation to the process of educating the general population. Either eliminating tests or making them the sole measure of an ongoing learning success or a predictor of future career suitability will not address the underlying societal disconnects.
I'm the person Dr. Fife calls out for saying 'the path to being a statistician starts with linear algebra and calculus' -- I agree with how Dr. Fife laid out the steps to understanding statistics. I would never tell someone learning to drive a car to learn theoretical physics. What I was trying to convey is that the prerequisites for many (pure) Stats majors are LA and Calculus, and if you don't have that background, you won't get into a (pure) Stats program, let alone survive it. When I first started learning Stats (on my own--school was very cook booky), I actually went through the approach Dr. Fife mentioned here, which was understanding the why and how first before digging into the math. Simulations also helped (and there are even math stats books that use simulations + math to build intuition). And the process isn't even linear. I may start off with the why and how first, go to the math, and then go back into the why and how further, go further back into the math, etc. I somewhat disagree with the hand calculations part--it really depends on the exact calculations one is doing. You probably don't need to hand calculate every single thing, but some hand-calculation exercises do show you how different statistical concepts are related. For anything above that, using a computer to do the math, run simulations, etc., is going to be better. The other thing I'll say is that (even if you don't strictly need the math) learning the math can help you understand the concepts better because it's another connection in your head. The more connections you make, the better your understanding. I don't make new methods on my own, nor do I feel the need to. But when I read a new methods paper, how can I understand what's going on if I don't have that background?
Hand calculations and mental arithmetic on summarised numbers are great for getting a 'feel' for the problem (I'm slide rule old;-). The inability to see the scale and quantification of problems is a major issue (the old news headline about millions, billions and trillions whooshing straight over the heads of folks who haven't or can't make it 'real', e.g. £1 per person per year vs a £1 trillion national cost over a 15-year programme!).
As someone around your age that only had a few courses of stats, economics, calc, linear algebra / modeling, conceptual physics, etc. this video was relatable; however, I always viewed undergrad courses as the "demo" version of games -- which I grew out of after college. You get a sense of what the subject is capable of with some neat tricks, before you decide if it's right for you. The "google point" you speak of is what I refer to as exposure. When you come out of school, you might not ever use a formula for amortization, but you'll know it exists and you did something with it in a finance class. You've reached that "google point" of knowing what to look for, but not knowing how to do it. For me, "google" is a $200 textbook sitting on my shelf and remembering about which chapter we covered it in our course. I enjoyed the various levels of Statistics classes I took. I wish I was more versed and retained as much as my grades said I knew, especially with age-old stats concept of "machine learning" being all the rage these days. The few examples you gave in these videos were great -- love the practical application that extends beyond theory! I want both -- a cliff's notes cookbook and a theory book.
Nice to hear what you said about recipes. I recently came across them and thought I was getting left behind because I wasn't using them, but I did think along the same lines as you.
In his autobiography, Feynman talked about very similar frustrations and insights with undergrad physics students in Brazil. I think it has a lot to do with the emphasis on high-stakes performance evaluations.
Spot on. These are the consequences of having assessments dictate marks and, frankly, job prospects, status, money, etc. Sure, this form of education system, broadly implemented, has done great things, but this is one of its major limitations.
"The mathematics are not there for the joy of the analyst but because they are essential to the solution." Karl Pearson, Notes on the History of Correlation (1920).
Current biostats TA (and resident person on the teaching team who actually likes math and stats and code) here, thank you so much for this. It's extremely validating to hear that people are putting into practice my ideal approach for teaching the subject as someone who loves it and has to be part of it getting taught in a way I think sucks. Most of the quants in my department have been campaigning for YEARS to make biostats a requirement to begin with, and, now that we finally have that, I desperately hope we'll take the next step to require taking R beforehand.
Professor, I can sense the frustration in your words. It feels like you're ahead of the curve, much like many thought leaders throughout history. From my perspective, you might be shaking up the status quo, which can make some within the profession feel insecure and uncomfortable. I hope you find a way to bridge the gap in understanding without bruising too many egos, as we've seen in the past how people like you can get blocked or crushed by the fragile majority protecting their interests. I truly appreciate your content, time, and effort.
I really appreciate this perspective, especially from a statistics professor. I am interested in the philosophical reasoning behind statistical methods and find that many subdisciplines in math, like biostats, disregard assumptions or use the wrong tests only because they were covered in their courses (of which I am guilty), analogous to the cookbook-style approach to intro stats (that I learned from too!). I am also still a student, so maybe I am biased! Love your work.
Oddly, that was my experience too when I worked as a biostatistician; most of my fellow biostatisticians were really cookbook-ey. It really annoyed me because my degree, which wasn't in biostats, wasn't seen as being as appealing as a "pure" biostats degree.
"Why" requires true in-depth understanding. "How" is just the mechanical approach. "How" is easier to teach. The school system is mostly based on memorized info and how to apply it.
The car question is more of an econometrics question (or an Ops Research-ish linear programming problem where the objective function can be customized by the buyer). So good point.
Data Envelopment Analysis, in which you can define the input as cost and choose your own outputs (mpg, fun factor, etc.). Then calculate the efficient frontier, potentially in higher dimensions, and find the cars on that frontier. DEA is highly underutilized, IMO.
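To make the frontier idea concrete, here is a rough sketch of a simpler cousin of DEA: a plain non-dominated (Pareto) check with cost as the single input and buyer-chosen outputs. This is not DEA itself. DEA proper solves a linear program per car and can rule out more cars than this check does. The function name, car names, and numbers are all made up for illustration.

```python
def non_dominated(cars):
    """Return names of cars that no other car beats outright.

    cars: list of (name, cost, outputs), where lower cost is better and
    every entry of the outputs tuple (e.g. mpg, fun factor) is higher-is-better.
    """
    def dominates(a, b):
        # a is at least as good as b everywhere...
        no_worse = a[1] <= b[1] and all(x >= y for x, y in zip(a[2], b[2]))
        # ...and strictly better on at least one dimension.
        strictly = a[1] < b[1] or any(x > y for x, y in zip(a[2], b[2]))
        return no_worse and strictly

    return [c[0] for c in cars if not any(dominates(o, c) for o in cars)]


# Hypothetical cars: (name, cost, (mpg, fun factor))
cars = [
    ("A", 20000, (30, 5)),
    ("B", 25000, (28, 4)),  # costs more than A and is worse on both outputs
    ("C", 30000, (25, 9)),  # expensive, but the most fun
]
frontier = non_dominated(cars)
print(frontier)
```

Car B drops out because A is cheaper and better on both outputs; A and C survive because each wins on something the buyer said they care about.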
I was a TA, and we had no choice about what to cover or how to cover it. All sections had to proceed in regimented order so all sections were at the same point at the same time, covering the same material. I don't blame them for trying to control the presentation. There were a lot of students and a few inexperienced teachers, like me. But, your approach is outstanding! By the way, new subscriber here.
First video of yours I've watched. Very good content. I recently enrolled in a very stats-intensive program. I will take your advice not only as knowledge, but as wise words on how to approach knowledge better. However, I can't help but notice that you never explained how to calculate the central limit theorem 😆
9:30 That example of building a model with features of a car to predict price is actually a fairly common 'cookbook' problem in any regression textbook. Except I have seen it more in terms of predicting house price given area , number of bedrooms and number of bathrooms
@ I did Actuarial and Applied Maths. We did not heavily focus on specific applications. I would still think that I could pick up any statistics book and find a similar example in the multiple regression section (even in “stats for social sciences” which is not as rigorous as a full stem stats class).
I agree! A Gaussian is a Poisson is a binomial, etc. Oh the arguments! Today's example was estimating the 'mean' (arithmetic? average) for a wordle score, given (or not given) what to do about the 'failed' (6 guesses or more) count part of the stats and hence whether to assume some, as yet unknown, long tail. But really, what does it all mean (aka understanding), and is 3.8 actually better than 3.95 guesses? Maybe I should have done a KL measure of the differences.
I feel ya! Every time I raise the point with my students and postdocs that they are doing it wrong and I hear "but everybody else does it this way!" I want to quit :) I often feel like I will be referred to HR for "statistical harassment" ;)
Cool video! As a Statistician I approve this message, but calculating by hand (e.g., Satterthwaite's df approx) helps me personally know the assumptions needed (and not just memorize them). Though, I've learned through my education (just at an undergrad level though) and career that the most important concept I've seen personally is interpretation. What are the needs of the SME? What are my assumptions? (Some methods use algorithms/formulas that NEED certain assumptions to be met.) PS - sorry for run-on sentences and incoherent written thoughts, I'm no English major haha
I have used Satterthwaite's approximation to the degrees of freedom for a two sample t-test without the equal variance assumption many times (in exercises). I do not see the value you give it. It is included in AP statistics with no derivation. Its inclusion teaches people the incorrect idea that the exact number of degrees of freedom is important in a t-test, where it is much less important than in a chi-squared test or F-test since those distributions don't converge as the degrees of freedom increase. What am I missing about Satterthwaite's formula?
@douglaszare1215 I'm not sure what you mean by not being valuable. Satterthwaite's approx is an attempt to get a better estimate of df when you don't have equal variances. Kind of like a weighted average in a way. Personally, it helps me understand cases where you have a linear combination of expected mean squares and is helpful (at least to me) in the context of just trying to understand what's happening in mixed models (PROC MIXED in SAS). Also understanding that things get tricky with small data sizes, unequal variances, and/or complex R-side covariance structures. If I remember correctly, it's recommended to use the updated Kenward-Roger method (DDFM=KR2). So studying the formulas and how/why the formulas are the way they are helps me when attempting to understand the limitations, rather than just memorizing. All that to say, it has helped me personally understand what's going on, broadly speaking, when problems arise. For example, I've come across situations where the df are so low, my kfactors are inflated. So that tells me to look at the individual contributors and make a judgement call. Is the population really like this, or is it due to the limitations of my sample?
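For anyone following the Satterthwaite exchange above, here is a minimal Python sketch of the Welch-Satterthwaite degrees-of-freedom approximation for a two-sample t-test with unequal variances. The function name and the sample data are made up for illustration; only the standard library is used.

```python
from statistics import variance


def welch_satterthwaite_df(sample_a, sample_b):
    """Approximate df for a two-sample t-test without the equal-variance assumption."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Estimated variance of each sample mean: s^2 / n
    v_a = variance(sample_a) / n_a
    v_b = variance(sample_b) / n_b
    # Squared sum of the two pieces, divided by a sum of each squared piece
    # scaled by its own degrees of freedom.
    return (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))


# Hypothetical data: same spread in both groups, then very different spreads.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
same_spread = [11.0, 12.0, 13.0, 14.0, 15.0]
wide_spread = [0.0, 100.0, 200.0, 300.0, 400.0]

df_equal = welch_satterthwaite_df(group_a, same_spread)
df_unequal = welch_satterthwaite_df(group_a, wide_spread)
print(df_equal, df_unequal)
```

With equal variances and equal group sizes the result matches the pooled value n_a + n_b - 2. As one group's variance dominates, it drops toward that group's own n - 1, which is the "weighted average" flavor mentioned above.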
In retrospect, maybe I should have said that hand calculations might be good *after* you have a strong foundational knowledge (much as I said knowing the math is good after you know how to do it with software). Also I love that you talk about interpretation. Yes! That's what's most important!
Happened to me recently during research. Sure, I can apply tests to data (non-parametric, parametric, testing for normality, you name it), but now I had to build a simulation using probabilistic neuron codes and decode the information in a circuit. Now, this implies constructing tailor-made estimators, a skill which depends on a serious understanding of likelihood functions and yeah, I of course was never taught such a thing... So I had to self-learn as fast as possible, but of course my understanding is still restricted and not global.
This is very true of math in general. My bachelor's and master's are both in math, and I had a handful of topics I felt I really learned well; the rest was too detail- and procedure-oriented. This is true of how math is taught at all levels.
I think that's why I was always bad at math in high school. Because I'm a nerd, I read a book about something like the history of mathematics. It showed why Newton wanted to understand limits and how limits relate to derivatives. It blew my mind and made calc so much easier to understand.
4:37 “Statistics courses are ‘cookbookish’…” I think lots of courses, and especially textbooks, are cookbookish in many ways. Here’s a way to address that in teaching: Have students work in groups but don’t give them the ‘cookbooks’; instead, have them develop and write the ‘recipes’ to go into the ‘cookbooks’ themselves. This requires a willingness to facilitate many groups without just giving the answers to any one group.
As with so many things, understanding the why is so much more important than understanding the what. In the AI/ML course I teach, we spend time visualizing and analyzing our data before we feed it to our models. What are the statistical models trying to tell us? Which methodology is best and why? If you know the why, then you can solve your own problems. With the internet, the "what" becomes a simple lookup. I agree that you need to know the steps, but rote memory adds no value other than speed. Speed has little value if you have problems with your data that you need to solve.
Welp, the YouTube algorithm recommended me this. I started learning statistics because it's relevant to my job: I'm learning on the job to be a bioinformatician. I started out writing out very simple, basic stats by hand, and beyond that I just use the programming languages I'm familiar with, once I felt I had a good grasp of the concepts.
I like this video and agree with most of it. I think introductory statistics classes at top schools typically are not as bad as you suggest. There are a few bad ones, but most spend quite some time discussing things other than procedures for statistical tests.
It makes me wonder what stats textbooks they're using. Most of the intro stats books do take this approach. (Mine and one other are the only exceptions I know of).
I did some reading, and the knowledge necessary to understand statistics completely is a lot (Calculus 1, 2, and 3, Linear Algebra, and obviously a book about stochastics). So understanding statistics might be a lot more work than most people are willing or able to do. Me included. It would take me years and then I would forget most of it.
If you make the following distinction, many problems can be defused: foreground and background. To understand a picture, you need both.
Mathematical probability & statistics is on another level. The problem is that you can’t expect undergraduates to have all the foundational mathematics to handle the multi-term courses we graduate statisticians take. It’s an unreasonable ask.
@@tmann986 I would say you might appreciate it more if they relate probability and statistics to engineering problems. That is important because you can see where the rubber meets the road and that might motivate you to learn it. My undergraduate courses were so abstract at the time because I couldn’t really relate the application of probability and statistics to anything I was interested in. I relied on Excel to do all my calculations. I definitely wasn’t integrating Gamma functions for fun. My later advanced undergraduate statistics courses applied statistics to modeling business problems which was something I could relate to. My graduate courses in my statistics degree were kind of all over the place in their applications. There’s no way around it. Probability and statistics takes years to master. Probability theory on its own is a beast to master at the graduate level.
@@jbj926 that’s a good point about relating the probability and statistical models to engineering problems. I’m sure its worth will become clear if I choose to go to graduate school for engineering. I do have a very abstract mind, meaning I think I’m the only engineering student who likes writing proofs. I do like math for math’s sake and I’m sure I would feel the same about statistics and probability.
Great video. I haven't had the same experience teaching statistics as you describe, though. But I think it depends on who you are teaching it to. When teaching Mathematical Statistics to students doing a Master of Science in Engineering and Computer Science, this is not a problem, because it's simply not taught in the "cookbook way" at all: we can expect the students to be prepared to do various kinds of creative logical thinking when faced with a problem. The whats and whys are crucial from the beginning there. I find the problem you describe to be more prevalent in statistics courses in degree programs that put less weight on mathematics and logic, courses that are thrown in rather late in such programs to "prepare" students to draw their own statistical conclusions in the papers they are about to write. I would argue that the number of misconceptions that arise from people incorrectly assuming they know statistics well after such a course is such a big problem that it would probably be better if they weren't taught statistics at all. Instead they could just have someone else guide them through the process the very few times they actually need to use it in their education. Then, if they actually need to use statistics a lot later on, they should learn statistics more thoroughly. I think much of this problem arises from the weird but common misconception that the processes you participate in when learning mathematics and logic are "something I will never use". This is often caused by many missing the point that what matters most in mathematics is not the procedure you need to learn but the very ability to reason and treat complicated problems in a structured and logical manner, to infer conclusions you otherwise would have no way of inferring just by "gut feeling" and that outright seemed impossible to reach at first glance.
Mathematics and problem solving in general help you build up perseverance, resilience, patience, optimism and the habit of addressing a complicated problem by breaking it down into more but easier problems. These are traits that are probably more important now than ever, and many seem to have little or none of them. Since so many have never tried to view mathematics in this way, they simply haven't developed these abilities enough and are used to a situation where, if they cannot see what they must do from the very start, they immediately assume they cannot solve the problem and just give up. That's probably why we see all of these raised hands asking "Can you just tell us what we should do!".
Hey! I have a nominal categorical variable (say with 10 groups) and a continuous variable, with a data size of 150. I am finding it difficult to quantify the relationship between the two. Can you please help by suggesting a method to get a metric of association between the two variables?
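One common answer to this kind of question is the correlation ratio (eta squared) from a one-way ANOVA: the share of the continuous variable's total variation that is accounted for by group membership. A minimal sketch, assuming that is the kind of metric wanted; the function name and the toy data are hypothetical.

```python
from collections import defaultdict


def eta_squared(groups, values):
    """Between-group sum of squares divided by total sum of squares (0 to 1)."""
    by_group = defaultdict(list)
    for g, v in zip(groups, values):
        by_group[g].append(v)

    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        raise ValueError("the continuous variable has no variation")

    # Each group contributes its size times the squared distance of its
    # mean from the grand mean.
    ss_between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
        for vs in by_group.values()
    )
    return ss_between / ss_total


# Toy data: group labels and a continuous measurement.
labels = ["a", "a", "b", "b"]
perfectly_separated = eta_squared(labels, [1.0, 1.0, 3.0, 3.0])
no_association = eta_squared(labels, [1.0, 3.0, 1.0, 3.0])
print(perfectly_separated, no_association)
```

A value near 0 means the group means barely differ; a value near 1 means group membership accounts for almost all of the variation. With 10 groups and only 150 observations, eta squared tends to run high, so a bias-adjusted variant such as omega squared may be worth looking up.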
When you talked about how the college had the population data, I think the correct interpretation is rather that each class is a cohort of students that is a kind of non-random sample from the people who can attend the college; in that sense it's not the population data. In a way, similar to welding, each student is a sample from the process of education: you start with a sample, then apply a treatment, and then measure the outcomes with the relevant measures. One way I think it would be interesting to explore such data would be to use high school performance as a prior for ability, and study how each class improves over the baseline.
If you ask me, I think the answer to the "riddle" ultimately comes out to simply describing the data. That's certainly the process we use for the Census, for example, and isn't that quite literally a study of the population?
I'd say it depends on what you actually want to do with this data. For instance, based on the population data from a particular school, one can say with certainty that their average GPA is, say, 2.6. Or that the school has more men than women (presuming gender is set as a binary categorical variable rather than being a bimodal spectrum). You wouldn't really need a hypothesis to test. But if you want to figure out what happens when you introduce a new student to this school, or what this dataset can tell you about the cohort of students next year, then yeah, you'd treat it as a sample to infer trends.
I have felt this way about many topics being taught. A great problem. (Feedback: the music is highly distracting at times. Either too loud or too disruptive a composition.)
Hey Dustin, always love your content. So if I’m on the ‘consultant pathway’, does your course support me being able to make sense of data in R so I can help people make decisions (mainly social science, employee stuff)? I have a mid-level software background and have got the basics in R (your content has been amazing for that, and Hadley Wickham’s book…), but how much stats do I need? I love your concept of learning the why, not the steps. This makes sense to me. ❤
Often, we only have convenience sampling, and not even a clear picture of the underlying population. As I see it, there is hardly any reason to conduct significance testing in such cases.
As seen from a math background, stats already appear like an obscure heap of recipes with a dab of superstition, a hybrid between cooking and alchemy. I suspect the cookbook approach plays a role in my perception. That said, cookbook teaching is pervasive. It sounds ideal from the teacher's point of view because you can tell the students: 'this is very simple, just memorize and follow those steps, anyone can do it, you don't need to be Ramanujan'. And then the teacher is baffled when the student gets everything mixed up and/or completely freezes at step 3, even though they can regurgitate step 3 by heart. My hunch is that humans have evolved to be problem solvers, not machines. Our brains switch off when we are asked to behave like machines, as if our brains had a built-in safeguard that prevents us from doing something if we don't understand why. The consequence is that this safety feature has to be overridden if we want to learn to do something mechanically without understanding what goes on behind the scenes. It's feasible but it's extremely laborious (think long division).
LOVE this. Especially this part: "My hunch is that humans have evolved to be problem solvers, not machines. Our brains switch off when we are asked to behave like machines, as if our brains had a built-in safeguard that prevents us from doing something if we don't understand why"
At 8:10 in the video, when you say that there's no need to test anything when we have information about the whole population - this is debatable. It is true that there's no need for inferential statistics to generalize conclusions to the population. But one could argue that there still exist a bunch of questions that have to be answered with the use of statistical inference. For instance, are girls expected to perform better than boys in a particular course? We do have the whole population, so we can answer "did girls perform better...". But to answer the question of whether girls can be said to be expected to perform better, we need statistical inference!
Nobody taught me statistics. The only thing I know is this garbage cookbook, which I have no choice but to use because people have only tasted (understood) cookbook recipes, and new meals would be met with confusion. Please let me cook, but I don't know how!
Tools of approximation where interactions are important. Put deterministic measure here pushes complexity there. Absorb the complexity along the in step by step intellectual anylitical parts. When cos/ sin A cutting speed + b feed rate fails for no reason it's still sigma 6 profitable you just scrap the parts play musical chairs of super position to get the answers you want blame hidden variables of many world atomized phase changes or messy molecular structures and bonds maybe thermodynamical environmental decay issues lol. What you won't do is send xyz manmade time hierarchy knowledge of good evil equations back to Academia to fix. They will re set new frame of reference trash can the problems.
Spot on! First, medicine. I was shocked that doctors perform diagnoses by memorizing STEPS! And assumed "brightness" is based on one's ability to memorize STEPS! Doctors rarely talk or listen to patients anymore; they run tests which follow STEPS, and often misdiagnose because their assumptions were wrong, so their tests were wrong, so the STEPS were wrong, so the outcome was wrong. Same thing with statistics! I've worked with Ivy League MD/PhD Biostatisticians for over 40 years, and I'm astounded at how little Stats they know; and I'm not a Statistician! Further, whenever I press these MD/PhD folks to explain why they're doing what they are doing, very few could give plausible reasons - BECAUSE THEY WERE FOLLOWING STEPS! And these STEPS in Clinical Trials rarely change so there's even less of a need to understand WHY things are being done in a particular way. STEPS result in COOKIE-CUTTER approaches that are more suited to BAKERS. 😳 Education, in my humble view, is all screwed up. The focus of education has been sullied by simply trying to get a job instead of LEARNING. Learning means asking questions, lots of questions, but nobody wants to understand anymore; just show me how, just show me the STEPS so I can pass the exam and get a job and make money. There's another thing that I've never been able to digest when it comes to learning Mathematics and Statistics, and I promise I'll end here. It's the stupid names that are meaningless or misleading that have forever been assigned to things - like a P-VALUE! It took me 40 years to finally accept the term "p-value"! So, so, so very UN-INTUITIVE! In fact, it's borderline silly, and it gets in the way of learning. Okay, I'm done. Sorry. But STEPS and stupid terminology get to me.
Thanks for your kind and understanding reply. Briefly, I must first confess that I'm a poor student at Mathematics and Statistics; I mean really poor. I want to believe that it's because of how I was taught, but I don't want to shift the blame to someone else, at least not totally. What I do know is that when I teach myself, I can learn, and that's because I can take my own time to ask myself as many questions as possible, and not feel rushed about having to pass an upcoming exam. Sorry to digress, but now on to "p-value". In Clinical Trials, as you may already know, it's all about p-values, or at least that's how it appears to me (a non-statistician). In most cases in Clinical Trials, p-values are used when the data (drugs being tested) are blinded, and during which time Statistical Programs are being developed. Yet, the question keeps coming, non-stop, about p-values. As a non-statistician, I keep wanting to blurt "it doesn't matter, the data is blinded, p-values have no real meaning at this point," but I keep quiet because I'm a non-statistician, and I don't want to get fired. Literally. And truth be told, maybe I don't fully understand what is going on as a non-statistician, so another reason to hush. P-value means probability value, but as a non-statistician, the furthest I could get is that the meaning of probability is just like flipping a coin or rolling a die. And intuitively, the higher the number the better the probability; 3 out of 6 is better than 1 out of 6. So, as a non-statistician, why do we want to celebrate SMALL P-VALUES? In my head, small probability numbers mean we're doing really bad. But not so with p-values, the smaller the better. 😳 Took me ages, and lots of reading to finally get and accept the name: "p-value". Sorry to take so long, but there's another problem with naming something by ITS SAME NAME. It's like saying something like "a tree is a tree that's tall". 
It's repeating the term that's trying to be explained (the sky is the sky with the clouds). Honestly, this is my last bit of explanation. Calling something a "probability value" could apply to every single probability value that is calculated in statistics, at least to a non-statistician; with a die, 3 out of 6 is a probability value, with a coin, 1 out of 2 is a probability value, so just about everything in statistics is a probability value, so what's so special about a "p-value" that it should be given a special name and meaning? I don't consider myself right or wrong, it's the name that was chosen for a value that doesn't have any intuitive suggestion as to what it is. I have the same issue with calling something "percent change" when its real name is "mean percent change". They're different... at least in my head as a non-statistician. And this blocks learning just like STEPS, STEPS, AND MORE STEPS.
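A small coin-flip sketch may help with the "why do we celebrate SMALL p-values" puzzle above. A p-value is not the probability of doing well; it is the probability of seeing data at least as extreme as what was observed, assuming nothing interesting is going on (here, assuming the coin is fair). A small value means the data would be surprising under that assumption. The function name `binomial_tail` and the 9-heads-in-10-flips scenario are made up purely for illustration.

```python
from math import comb


def binomial_tail(n, k, p=0.5):
    """Probability of k or more heads in n flips when P(heads) = p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical data: 9 heads in 10 flips.
# One-sided p-value: how often would a FAIR coin give 9 or more heads?
p_value = binomial_tail(10, 9)
print(f"p-value for 9+ heads in 10 fair flips: {p_value:.4f}")

# Compare with an unsurprising result: 5 or more heads in 10 flips.
print(f"p-value for 5+ heads in 10 fair flips: {binomial_tail(10, 5):.4f}")
```

The first number is small because a fair coin rarely produces 9 or more heads; the second is large because 5 or more heads is entirely ordinary. That is the sense in which "smaller" means "stronger evidence against the fair-coin assumption", not "better odds".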
Interesting! I didn't realize doctors had the same problem. And as for stupid names--Yes! The worst example I can think of is degrees of freedom. It's such an esoteric concept that is unnecessarily complicated.
@@QuantPsych Absolutely correct 💯🎯 as to degrees of freedom. To put a spin of humour on the seriousness of STEPS and poor choices to describe Statistical stuff, I'd say it's probably exactly what happened to that person you shared who said "WE HAVE THE POPULATION", hence no need for any type of statistical inferences as we know everything already. 😆 The word "POPULATION" has special meaning in Stats as compared to the general use of the term used by the general public. To elaborate for a second, the STUDENT POPULATION of a school has one meaning in every day life, but a very specific, and scientific meaning in Statistics. That person, a teacher of Stats for many years if memory serves me correctly, had a totally street understanding of POPULATION. 😆 Honestly, I wouldn't know whether to laugh or cry were I to be faced with this declaration in a public setting. 😳 Stay well and keep teaching. We need you, and people like you. Literally. Cheers.
I don't think that education does this in bad faith... I think that education, as it is now, tries to give broad knowledge of everything but not very deep... But at least you get an answer
That mindset applies to other academic/professional areas. I also tell this to my students: “it's not about memorizing everything, it's about knowing how to find the answers when you are faced with a problem”.
exactly!
Here are the instructions:
Step 1) Figure it out.
Stats teacher here, I just found your channel and I love it. Years ago I was at a conference, in a mini-session on teaching p-values that turned into something more like a group therapy commiseration over the near impossibility of getting students to understand p-values and hypothesis testing. One participant admitted that he was to the point where he'd tell his class "if you remember nothing else from this, just remember that p < 0.05 means it's significant". I shared, in response, that I tell my students "if all you remember is p < 0.05 means it's significant, I'd prefer you forget that too".
Of course, hardly anyone understands p-values and hypothesis testing because they make little intuitive sense, and most textbooks present a mashup of two competing approaches that should have never been mashed up together (binary decision making procedure vs. quantifying strength of evidence against a hypothesized model). We're effectively teaching students to recite a magical incantation, written in a language they don't speak. Not a great use of anyone's time.
I agree with your cooking analogy. There's never a "correct" way to analyze data, just as there's no "correct" way to cook a meal. There are going to be lots of good options, along with limitless bad options, and getting better at statistics means getting better at finding a good option or at least avoiding a bad one. This is necessarily subjective, despite looking like "math", because we're mathematizing value judgments about what we think is interesting and then asking the real world to provide us a collection of numbers we can use to approximate a solution. It's all very messy, but with any luck we end up properly satiated in the end.
Anyway, keep fighting the good fight!
Can't believe this content only has 18.4k subscribers.
Or I take that back, considering the message, it actually makes sense.
They'll come around soon enough. A couple more seismic failures in statistical analysis predicting political and economic outcomes and people will wake the f up and try to educate themselves.
I'm laughing and crying at the same time :)
Thank you for this wonderful video, Dustin! I suffer from imposter syndrome in grad school and I have those lightbulb moments... Their frequency is increasing in the final 3 classes of my program and I always feel like I'm "late to the game." I really needed to hear this.
I'll tell you something one of my mentors (Robert Terry) said to me. After I said to him, "I don't feel like I'll know anything by the time I graduate," he said, "Oh you won't. That's okay. Nobody knows anything when they graduate. But you'll get there afterward."
I think the issue you are describing is the result of students being taught to the test from elementary school. This isn't specific to stats, it's become endemic to education. I've had students ask me to tell them exactly what they need to know and how to do it so they can pass my class and promptly forget the material so they can cram what they need to pass the next class into their head long enough to pass that class's tests.
I chimed in on that thread you referenced. Hearing your discussion here makes sense because I was hooked on data and data analysis and once I had an understanding of some of the basics, I needed to take some classes and ones I needed weren't being offered, so I took calculus and linear algebra and a bunch of the stuff I did in those earlier stats classes made so much more sense once I had a better grasp of the mechanics of why they worked. I am definitely in camp calculus and linear algebra, but after hearing this and thinking on how I came at calculus and linear algebra, I was in the perfect position to absorb that information and fit it into my own growing data analysis nomological network.
"Teaching to the test" has problems, but not having assessments that can be applied evenly and relatively inexpensively is a major limitation to the process of educating the general population. Either eliminating tests or making them the sole measure of an ongoing learning success or a predictor of future career suitability will not address the underlying societal disconnects.
I'm the person Dr. Fife calls out for saying 'the path to being a statistician starts with linear algebra and calculus' -- I agree with how Dr. Fife laid out the steps to understanding statistics. I would never tell someone learning to drive a car to learn theoretical physics. What I was trying to convey is that the pre-requisites for many (pure) Stats majors are LA and Calculus and if you don't have that background, you won't get into a (pure) Stats program, let alone survive it.
When I first started learning Stats (on my own--school was very cook booky), I actually went through the approach Dr. Fife mentioned here which was understanding the why and how first before digging into the math. Simulations also helped (and there are even math stats books that use simulations + math to build intuition). And the process isn't even linear either. I may start off with the why and how first, go to the math, and then go back into the why and how further, go further back into the math, etc.
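As a rough illustration of the "simulations build intuition" point, here is a minimal Python sketch. It repeatedly draws samples from a skewed distribution and looks at how the sample means behave. Everything specific here (the exponential distribution, the sample size of 30, the 5,000 repetitions, the function name `simulate_sample_means`) is an arbitrary choice for the example, not something from the video or the books mentioned.

```python
import random
import statistics


def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size `n` from an exponential(1) distribution
    and return the mean of each sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


n = 30
means = simulate_sample_means(n, reps=5000)

# The exponential(1) distribution has mean 1 and standard deviation 1,
# so theory says the sample means should centre near 1 with spread
# close to 1 / sqrt(n).
print("average of the sample means:", round(statistics.fmean(means), 3))
print("spread of the sample means: ", round(statistics.stdev(means), 3))
print("theoretical 1/sqrt(n):      ", round(1 / n**0.5, 3))
```

Seeing the simulated spread land close to the theoretical value is the kind of "why" that a formula alone tends not to convey, and changing `n` shows how the spread shrinks as samples get larger.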
I somewhat disagree with the hand calculations part--it really depends on the exact calculations one is doing. You probably don't need to hand calculate every single thing, but some hand-calculation exercises do show you how different statistical concepts are related. For anything above that, using a computer to do the math, run simulations, etc., is where it's going to be better.
The other thing I'll say (even if you don't need the math) is that learning the math can help you understand the concepts better because it's another connection in your head. The more connections you make, the better your understanding. I don't make new methods on my own, nor do I feel the need to. But when I read a new methods paper, how can I understand what's going on if I don't have that background?
Thanks for being the impetus for this video :) I love what you said about math being "another connection in your head."
Hand calculations and mental arithmetic on summarised numbers are great for getting a 'feel' for the problem (I'm slide rule old;-). The inability to see the scale and quantification of problems is a major issue (the old news headline about millions, billions and trillions whooshing straight over folks' heads who haven't or can't make it 'real' e.g. £1 per person per year, vs £1 trillion national cost over 15 years programme!).
As someone around your age that only had a few courses of stats, economics, calc, linear algebra / modeling, conceptual physics, etc. this video was relatable; however, I always viewed undergrad courses as the "demo" version of games -- which I grew out of after college. You get a sense of what the subject is capable of with some neat tricks, before you decide if it's right for you.
The "google point" you speak of is what I refer to as exposure. When you come out of school, you might not ever use a formula for amortization, but you'll know it exists and you did something with it in a finance class. You've reached that "google point" of knowing what to look for, but not knowing how to do it. For me, "google" is a $200 textbook sitting on my shelf and remembering about which chapter we covered it in our course.
I enjoyed the various levels of Statistics classes I took. I wish I was more versed and retained as much as my grades said I knew, especially with the age-old stats concept of "machine learning" being all the rage these days. The few examples you gave in these videos were great -- love the practical application that extends beyond theory! I want both -- a cliff's notes cookbook and a theory book.
Nice to hear what you said about recipes. I recently came across them and thought I was getting left behind because I wasn't using them, but I did think similarly to you
This was the best video I’ve seen from you, well done and thank you! A lot more ppl need to watch this!
Appreciate that!
In his autobiography, Feynman talked about very similar frustrations and insights with undergrad physics students in Brazil. I think it has a lot to do with the emphasis on high-stakes performance evaluations.
spot on. the consequences of having assessments dictate marks and frankly job prospects, status, money etc. sure this form of education system broadly implemented has done great things, but this is one of its major limitations
"The mathematics are not there for the joy of the analyst but because they are essential to the solution."
Karl Pearson, Notes on the History of Correlation (1920).
Current biostats TA (and resident person on the teaching team who actually likes math and stats and code) here, thank you so much for this. It's extremely validating to hear that people are putting into practice my ideal approach for teaching the subject as someone who loves it and has to be part of it getting taught in a way I think sucks. Most of the quants in my department have been campaigning for YEARS to make biostats a requirement to begin with, and, now that we finally have that, I desperately hope we'll take the next step to require taking R beforehand.
Professor, I can sense the frustration in your words. It feels like you're ahead of the curve, much like many thought leaders throughout history. From my perspective, you might be shaking up the status quo, which can make some within the profession feel insecure and uncomfortable. I hope you find a way to bridge the gap in understanding without bruising too many egos, as we've seen in the past how people like you can get blocked or crushed by the fragile majority protecting their interests. I truly appreciate your content, time, and effort.
Thank you!
I really appreciate this perspective, especially from a statistics professor. I am interested in the philosophical reasoning behind statistical methods and find that many subdisciplines in math like biostats disregard assumptions, or use the wrong tests only because they were covered in their courses (of which I am guilty), analogous to the cookbook style approach to intro stat (that I learned from too!). I am also still a student so maybe I am biased! Love your work.
Oddly, that was my experience too when I worked as a biostatistician; most of the fellow biostatisticians were really cookbook-ey. It really annoyed me because my degree, which wasn't in biostats, wasn't seen as being as appealing as a "pure" biostats degree.
Why requires true in depth understanding. How is just the mechanical approach. How is easier to teach. The school system is mostly based on memorized info and how to apply it.
The car question is more of an econometrics question (or an Ops Research-ish Linear Programming problem where the objective function can be customized by the buyer). So good point.
Consider Data Envelopment Analysis, in which you can define input as cost and choose your own outputs (mpg, fun factor, etc.). Then calc the efficient frontier in potentially higher dimensions and find cars on that frontier. DEA is highly underutilized imo.
Yes, absolutely true... thanks 🙏 for this video
I was a TA, and we had no choice about what to cover or how to cover it. All sections had to proceed in regimented order so all sections were at the same point at the same time, covering the same material. I don't blame them for trying to control the presentation. There were a lot of students and a few inexperienced teachers, like me. But, your approach is outstanding! By the way, new subscriber here.
Yeah, that's been my problem--my department wants to standardize things and other teachers are uncomfortable with my approach.
First video of yours I watch. Very good content.
I recently enrolled in a very stats-intensive program. I will take your advice not only as knowledge, but as wise words on how to approach knowledge better.
However, I can't help but notice that you never explained how to calculate the central limit theorem 😆
Ha! I don't even know how to answer that :)
9:30
That example of building a model with features of a car to predict price is actually a fairly common 'cookbook' problem in any regression textbook. Except I have seen it more in terms of predicting house price given area , number of bedrooms and number of bathrooms
That probably depends on the discipline. In my discipline (psychology), we'd never have an example like that.
I did Actuarial and Applied Maths. We did not heavily focus on specific applications. I would still think that I could pick up any statistics book and find a similar example in the multiple regression section (even in “stats for social sciences” which is not as rigorous as a full stem stats class).
I agree!
A Gaussian is a Poisson is a binomial, etc. Oh the arguments!
Today's example was estimating the 'mean' (arithmetic? average) for a wordle score, given (or not given) what to do about the 'failed' (6 guesses or more) count part of the stats and hence whether to assume some, as yet unknown, long tail. But really, what does it all mean (aka understanding), and is 3.8 actually better than 3.95 guesses?
Maybe I should have done a KL measure of the differences.
I feel ya! Every time I raise the point with my students and postdocs that they are doing it wrong and I hear "but everybody else does it this way!" I want to quit :)
I often feel like I will be referred to HR for "statistical harassment" ;)
Ha! Please don't let that happen to you. Once there's a precedent, I'll lose my job!
Cool video! As a Statistician I approve this message but calculating by hand (e.g., Satterthwaite's df approx) helps me personally know the assumptions needed (and not just memorization). Though, I've learned through my education (just at an undergrad level though) and career the most important concept I've seen personally is interpretation. What are the needs of the SME? What are my assumptions? (some methods use algorithms/formulas that NEED certain assumptions to be met)
Ps - sorry for run-on sentences and incoherent written thoughts, I'm no English major haha
I have used Satterthwaite's approximation to the degrees of freedom for a two sample t-test without the equal variance assumption many times (in exercises). I do not see the value you give it. It is included in AP statistics with no derivation. Its inclusion teaches people the incorrect idea that the exact number of degrees of freedom is important in a t-test, where it is much less important than in a chi-squared test or F-test since those distributions don't converge as the degrees of freedom increase. What am I missing about Satterthwaite's formula?
@douglaszare1215 I'm not sure what you mean by not being valuable. Satterthwaite's approx is an attempt to get a better estimate of df when you don't have equal variances. Kind of like a weighted average in a way. Personally, it helps me understand cases where you have a linear combination of expected mean squares and is helpful (at least to me) in the context of just trying to understand what's happening in mixed models (PROC MIXED in SAS). Also understanding that things get tricky with small data sizes, unequal variances, and/or complex R-side covariance structures. If I remember correctly, it's recommended to use the updated Kenward-Roger method (DDFM=KR2).
So studying the formulas and how/why the formulas are the way they are helps me when attempting to understand the limitations and not just pure memorization.
All that to say, it has helped me personally understand what's going on broadly speaking when problems arise. For example, I've come across situations where the df are so low, my k-factors are inflated. So that tells me to look at the individual contributors and make a judgement call. Is the population really like this or is it due to the limitations of my sample?
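For readers who have not seen it written out, the two-sample version of Satterthwaite's approximation discussed in this thread is df ≈ (s1²/n1 + s2²/n2)² / [ (s1²/n1)²/(n1−1) + (s2²/n2)²/(n2−1) ]. The Python sketch below just evaluates that formula; the function name and the example variances and sample sizes are invented for illustration, and it covers only the two-sample t-test case, not the mixed-model generalization mentioned above.

```python
def welch_satterthwaite_df(var1, n1, var2, n2):
    """Approximate degrees of freedom for a two-sample t-test
    without the equal-variance assumption.

    var1, var2: sample variances; n1, n2: sample sizes (each > 1).
    """
    a = var1 / n1  # estimated variance of the first sample mean
    b = var2 / n2  # estimated variance of the second sample mean
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))


# Equal variances and equal sizes recover the pooled value n1 + n2 - 2.
print(welch_satterthwaite_df(4.0, 10, 4.0, 10))

# Hypothetical lopsided case: a small, quiet group vs. a large, noisy one.
print(welch_satterthwaite_df(1.0, 5, 100.0, 50))
```

The result always falls between the smaller of n1 − 1 and n2 − 1 and the pooled n1 + n2 − 2, which is one way to see the "weighted average" flavour described above: the group whose mean is estimated less precisely dominates the answer.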
In retrospect, maybe I should have said that hand calculations might be good *after* you have a strong foundational knowledge (much as I said knowing the math is good after you know how to do it with software).
Also I love that you talk about interpretation. Yes! That's what's most important!
Happened to me recently during research. Sure, I can apply tests to data (non-parametric, parametric, testing for normality, you name it), but now I had to build a simulation using probabilistic neuron codes and decode the information in a circuit.
Now, this implies constructing tailor-made estimators, a skill which depends on a serious understanding of likelihood functions and yeah, I of course was never taught such a thing... So I had to self-learn as fast as possible, but of course my understanding is still restricted and not global.
This is very true of math in general. My bachelors and masters are both in math and I had a handful of topics I felt I really learned well and for the rest it was too detail/procedure oriented. This is true for how math is taught at all levels.
I think that's why I was always bad at math in high school. Because I'm a nerd, I read a book about something like the history of mathematics. It showed why Newton wanted to understand limits and how limits relate to derivatives. It blew my mind and made calc so much easier to understand.
4:37
“Statistics courses are ‘cookbookish’…”
I think lots of courses, and especially textbooks, are cookbookish in many ways. Here’s a way to address that in teaching:
Have students work in groups but don’t give them the ‘cookbooks’; instead, have them develop and write the ‘recipes’ to go into the ‘cookbooks’ themselves.
This requires a willingness to facilitate many groups without just giving the answers to any one group.
As with so many things, understanding the why is so much more important than understanding the what. In the AI/ML course I teach, we spend time visualizing and analyzing our data before we feed it to our models. What are the statistical models trying to tell us? Which methodology is best and why?
If you know the why, then you can solve your own problems. With the internet, the what becomes a simple lookup. I agree that you need to know the steps, but rote memory adds no value other than speed. Speed has little value if you have problems with your data you need to solve.
Really good video, Dustin. Well done.
Love the view on teaching and your car example!
Welp, the YouTube algorithm recommended me this. I started learning statistics since it's relevant for my job, as I'm learning on the job to be a bioinformatician. I started out writing out very simple basic stats by hand, and beyond that I just use the programming languages I'm familiar with once I feel I have a good grasp of the concepts.
Hello Professor, please do a video on your take about Statistical modelling and what Machine learning mean to you.
I like this video and agree with most of it. I think introductory statistics classes at top schools typically are not as bad as you suggest. There are a few bad ones, but most spend quite some time discussing things other than procedures for statistical tests.
It makes me wonder what stats textbooks they're using. Most of the intro stats books do take this approach. (Mine and one other are the only exceptions I know of).
I did some reading and the knowledge necessary to understand statistics completely is a lot. (Calculus 1, 2, 3, Linear Algebra and obviously a book about stochastics.) So understanding statistics might be a lot more work than most people are willing or able to do. Me included. It would take me years and then I would forget most of it.
I think you can get a solid foundation without the math stuff.
If you make the following distinction, many problems can be defused:
Foreground and background.
To understand a picture, you need both.
Mathematical probability & statistics is on another level. The problem is that you can’t expect undergraduates to have all the foundational mathematics to handle the multi-term courses we graduate statisticians take. It’s an unreasonable ask.
The schools love to throw a “probability and statistics for engineers” course at us. I’m kinda scared lol I haven’t taken it yet.
@@tmann986 I would say you might appreciate it more if they relate probability and statistics to engineering problems. That is important because you can see where the rubber meets the road and that might motivate you to learn it.
My undergraduate courses were so abstract at the time because I couldn’t really relate the application of probability and statistics to anything I was interested in. I relied on Excel to do all my calculations. I definitely wasn’t integrating Gamma functions for fun. My later advanced undergraduate statistics courses applied statistics to modeling business problems which was something I could relate to. My graduate courses in my statistics degree were kind of all over the place in their applications. There’s no way around it. Probability and statistics takes years to master. Probability theory on its own is a beast to master at the graduate level.
@@jbj926 that’s a good point to relate the probability and statistical models to engineering problems. I’m sure its worth will become clear if I choose to go to graduate school for engineering. I do have a very abstract mind, meaning I think I’m the only engineering student who likes writing proofs. I do like math for math’s sake and I’m sure I would think the same for statistics and probability.
Man, I totally agree with your concern. I'm a school Mathematics teacher with a major in Statistics....
That's where it starts!
"We have the population! We don't need to do tests if we have the population.."
Nice. Never thought of it that way.
Great video. I haven't had the same experience teaching statistics as you describe though. But I think it depends on who you are teaching it to. When teaching Mathematical Statistics to students doing a Master of Science in Engineering and Computer Science this is not a problem, because it's simply not taught in the "cookbook way" at all: we can expect the students to be prepared to do various kinds of creative logical thinking when faced with a problem. The whats and whys are crucial from the beginning there.
I find the problem you describe to be more prevalent in statistics courses in programs that put less weight on mathematics and logic. These courses are thrown in rather late in such programs to "prepare" students to draw their own statistical conclusions in the papers they are about to write. I would argue that the number of misconceptions that arise from people incorrectly assuming they know statistics well after such a course is such a big problem that it would probably be better if they weren't taught statistics at all. Instead they could just have someone else guide them through the process the very few times they actually need to use it in their education.
Then if they actually need to use statistics a lot later on they should actually learn statistics more thoroughly.
I think much of this problem arises from the weird but common misconception that the processes you participate in when learning mathematics and logic are "something I will never use". This is often caused by many missing the point that it is not the procedure you need to learn that is the most important in mathematics but the very ability to reason and treat complicated problems in a structured and logical manner, to infer conclusions you otherwise would have no way of inferring just by "gut feeling" and that outright seemed impossible to reach at first glance.
Mathematics and problem solving in general help you build up perseverance, resilience, patience, optimism and the habit of addressing a complicated problem by breaking it down into more but easier problems. These are traits that are probably more important now than ever and many seem to have little or none. Since so many have never tried to view mathematics in this way, they simply haven't developed these abilities enough and are used to a situation where if they cannot just see what they must do from the very start, they immediately assume they cannot solve the problem and just give up. That's probably why we see all of these raised hands asking "Can you just tell us what we should do!".
Love this!
6:06 ‘bold enough to say’ or ‘so bold as to say’
Hey! I have a nominal categorical variable (say with 10 groups) and a continuous variable, with a data size of 150. I am finding it difficult to calculate the relationship between the two. Can you please help by suggesting a method to get a metric of association between the two variables?
I've been teaching statistics for over 15 years and I've never encountered this problem.
which problem?
When you talked about how the college had the population data, I think the correct interpretation is rather that each class is a cohort of students that is a kind of non-random sample from the people who can attend the college; in that sense it's not the population data. In a way, similar to welding, each student is a sample from the process of education: you start with a sample, then apply a treatment, and then measure the outcomes with the relevant measures.
One way I think would be interesting to explore such data would be to use high school performance as a prior for ability, and study how each class improves over the baseline.
If you ask me, I think the answer to the "riddle" ultimately comes out to simply describing the data. That's certainly the process we use for the Census, for example, and isn't that quite literally a study of the population?
I'd say it depends on what you want to actually do with this data. For instance, based on the population data from a particular school one can say with certainty that their average GPA is, for instance, 2.6. Or that the school has more men than women (presuming gender is set as a binary categorical variable rather than being a bimodal spectrum). You wouldn't really need a hypothesis to test.
But if you want to figure out what happens when you introduce a new student in this school, or what this dataset can tell you about cohort of students the next year, then yeah, you'd treat it as a sample to infer trends.
Dustin, do you have a paper version of the book yet?
no, I'm a lazy bitch. I'll get there!
I have felt this way about many topics being taught. A great problem.
(Feedback: the music is highly distracting at times. Either too loud or too disruptive a composition.)
Thanks for the feedback!
Hey Dustin, always love your content. So if I’m on the ‘consultant pathway’ does your course support me being able to make sense of data in R so I can help people make decisions (mainly social science employee stuff)? I have a mid level software background and have got the basics in R (your content has been amazing for that, and Hadley Wickham’s book…) but how much stats do I need? I love your concept of learn the why not the steps. This makes sense to me. ❤
That's really what my curriculum is designed for. I teach clinical psychologists, so I'm definitely more on the applied side.
what is a ru? (roo??)
ru? Did I use that term in the video? Sorry, I forget what I said in videos immediately after I publish them :)
Ever thought about coming back to teach at the Y?
I did interview there in like 2015, but they didn't select me :) As of now, I'm happy here!
Often, we only have convenience sampling, and not even a clear picture of the underlying population. As I see it, there is hardly any reason to conduct significance testing in such cases.
Definitely a fair point. My dissertation was actually about how convenience sampling screws up probability estimates.
"There are no routine statistical questions, only questionable statistical routines."
Sir David Roxbee Cox
Love it!
As seen from a math background, stats already appear like an obscure heap of recipes with a dab of superstition, a hybrid between cooking and alchemy. I suspect the cookbook approach plays a role in my perception.
That said, cookbook teaching is pervasive. It sounds ideal from the teacher's point of view because you can tell the students: 'this is very simple, just memorize and follow those steps, anyone can do it, you don't need to be Ramanujan'. And then the teacher is baffled when the student gets everything mixed up and/or completely freezes at step 3, even though they can regurgitate step 3 by heart.
My hunch is that humans have evolved to be problem solvers, not machines. Our brains switch off when we are asked to behave like machines, as if our brains had a built-in safeguard that prevents us from doing something if we don't understand why.
The consequence is that this safety feature has to be overridden if we want to learn to do something mechanically without understanding what goes on behind the scenes. It's feasible but it's extremely laborious (think long division).
LOVE this. Especially this part: "My hunch is that humans have evolved to be problem solvers, not machines. Our brains switch off when we are asked to behave like machines, as if our brains had a built-in safeguard that prevents us from doing something if we don't understand why"
At 8:10 in the video when you say that there's no need to test anything when we have information about the whole population - this is debatable. It is true that there's no need for inferential statistics to infer conclusions about the population. But one could argue that there still exist a bunch of questions that have to be answered with the use of statistical inference. For instance, are girls expected to perform better than boys in a particular course? We do have the whole population so we can answer "did girls perform better...". But to answer the question whether girls can be said to be expected to perform better, we need statistical inference!
True. Their specific question wasn't about future performance though...it was about current and past performance, of which we did have the population.
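To make the distinction in this exchange concrete, here is a minimal sketch in Python. It assumes made-up scores for a tiny class (the numbers and the function name `exact_permutation_p_value` are hypothetical, not from the video). The observed difference in means is a plain fact about the population we have. The permutation p-value addresses the other question: if the girl/boy label had nothing to do with scores, how often would a relabeling produce a gap at least this large?

```python
from itertools import combinations

def exact_permutation_p_value(group_a, group_b):
    """One-sided exact permutation test on the difference in means.

    Returns (observed_difference, p_value), where p_value is the share of
    all possible relabelings whose difference is at least the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def mean_diff(sum_a):
        # Difference in means given the sum of the scores labeled "a".
        return sum_a / n_a - (total - sum_a) / n_b

    observed = mean_diff(sum(group_a))
    n_splits = 0
    n_as_extreme = 0
    # Enumerate every way of assigning n_a of the pooled scores to group "a".
    for idx in combinations(range(len(pooled)), n_a):
        n_splits += 1
        if mean_diff(sum(pooled[i] for i in idx)) >= observed - 1e-9:
            n_as_extreme += 1
    return observed, n_as_extreme / n_splits

# Hypothetical scores, invented purely for illustration.
girls = [82, 75, 91, 68, 88]
boys = [79, 70, 85, 66, 74]

observed, p_value = exact_permutation_p_value(girls, boys)
print("Descriptive answer (did girls score higher?):", observed)
print("Inferential answer (permutation p-value):", p_value)
```

The first printed number answers "did girls perform better" and needs no test. The second only matters if you treat this cohort as one draw from a process that could have turned out differently, which is the "expected to perform better" question.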
Where do you teach? If it's a college/university do y'all let people audit courses without being enrolled in the school?
Rowan University. I doubt they would let you, but you can take a class at simplistics.net
Statistically speaking, people do not like statistics on average.
that's mean ;)
Nobody taught me statistics. The only thing I know is this garbage cookbook, which I have no choice but to use because people have only ever tasted (understood) cookbook recipes, and new meals would be met with confusion. Please let me cook, but I don't know how!
Get rid of the music and I'll see what you have to say.
Tools of approximation where interactions are important. Put deterministic measure here pushes complexity there.
Absorb the complexity along the way in step-by-step intellectual analytical parts.
When cos/ sin A cutting speed + b feed rate fails for no reason it's still sigma 6 profitable you just scrap the parts play musical chairs of super position to get the answers you want blame hidden variables of many world atomized phase changes or messy molecular structures and bonds maybe thermodynamical environmental decay issues lol.
What you won't do is send xyz manmade time hierarchy knowledge of good evil equations back to Academia to fix.
They will re set new frame of reference trash can the problems.
Spot on!
First, medicine. I was shocked that doctors perform diagnoses by memorizing STEPS! And assumed "brightness" is based on one's ability to memorize STEPS!
Doctors rarely talk or listen to patients anymore; they run tests which follow STEPS, and often misdiagnose because their assumptions were wrong, so their tests were wrong, so the STEPS were wrong, so the outcome was wrong.
Same thing with statistics! I've worked with Ivy League MD/PhD Biostatisticians for over 40 years, and I'm astounded at how little Stats they know; and I'm not a Statistician!
Further, whenever I press these MD/PhD folks to explain why they're doing what they are doing, very few can give plausible reasons - BECAUSE THEY ARE FOLLOWING STEPS! And these STEPS in Clinical Trials rarely change, so there's even less of a need to understand WHY things are being done in a particular way.
STEPS result in COOKIE-CUTTER approaches that are more suited to BAKERS. 😳
Education, in my humble view, is all screwed up. The focus of education has been sullied by simply trying to get a job instead of LEARNING.
Learning means asking questions, lots of questions, but nobody wants to understand anymore; just show me how, just show me the STEPS so I can pass the exam and get a job and make money.
There's another thing that I've never been able to digest when it comes to learning Mathematics and Statistics, and I promise I'll end here. It's the stupid names that are meaningless or misleading that have forever been assigned to things - like a P-VALUE! It took me 40 years to finally accept the term "p-value"! So, so, so very UN-INTUITIVE! In fact, it's borderline silly, and it gets in the way of learning.
Okay, I'm done. Sorry. But STEPS and stupid terminology get to me.
Thanks for your kind and understanding reply.
Briefly, I must first confess that I'm a poor student at Mathematics and Statistics; I mean really poor. I want to believe that it's because of how I was taught, but I don't want to shift the blame to someone else, at least not totally. What I do know is that when I teach myself, I can learn, and that's because I can take my own time to ask myself as many questions as possible, and not feel rushed about having to pass an upcoming exam.
Sorry to digress, but now on to "p-value". In Clinical Trials, as you may already know, it's all about p-values, or at least that's how it appears to me (a non-statistician). In most cases in Clinical Trials, p-values are used when the data (drugs being tested) are blinded, and during which time Statistical Programs are being developed. Yet, the question keeps coming, non-stop, about p-values. As a non-statistician, I keep wanting to blurt "it doesn't matter, the data is blinded, p-values have no real meaning at this point," but I keep quiet because I'm a non-statistician, and I don't want to get fired. Literally.
And truth be told, maybe I don't fully understand what is going on as a non-statistician, so another reason to hush.
P-value means probability value, but as a non-statistician, the furthest I could get is that the meaning of probability is just like flipping a coin or rolling a die. And intuitively, the higher the number the better the probability; 3 out of 6 is better than 1 out of 6. So, as a non-statistician, why do we want to celebrate SMALL P-VALUES? In my head, small probability numbers mean we're doing really bad. But not so with p-values, the smaller the better. 😳
Took me ages, and lots of reading to finally get and accept the name: "p-value".
Sorry to take so long, but there's another problem with naming something by ITS SAME NAME. It's like saying something like "a tree is a tree that's tall". It's repeating the term that's trying to be explained (the sky is the sky with the clouds).
Honestly, this is my last bit of explanation. Calling something a "probability value" could apply to every single probability value that is calculated in statistics, at least to a non-statistician; with a die, 3 out of 6 is a probability value, with a coin, 1 out of 2 is a probability value, so just about everything in statistics is a probability value, so what's so special about a "p-value" that it should be given a special name and meaning?
I don't consider myself right or wrong, it's the name that was chosen for a value that doesn't have any intuitive suggestion as to what it is. I have the same issue with calling something "percent change" when its real name is "mean percent change". They're different... at least in my head as a non-statistician. And this blocks learning just like STEPS, STEPS, AND MORE STEPS.
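On the "why celebrate SMALL p-values" question above, a minimal sketch in Python may help, using the coin-flip intuition from the comment. The setup is an assumption made for illustration: we flip a coin 10 times and ask how likely a result at least this lopsided would be if the coin were fair. The function name `binomial_p_value` is hypothetical. The p-value is not the probability of "doing well"; it is the probability of seeing data this extreme if nothing interesting is going on, so a small value means the boring explanation fits the data poorly.

```python
from math import comb

def binomial_p_value(heads: int, flips: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(at least `heads` heads in `flips` flips),
    computed under the assumption that the null hypothesis (p_null) is true."""
    return sum(
        comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)
        for k in range(heads, flips + 1)
    )

# How surprising is each outcome if the coin is actually fair?
for heads in (5, 8, 10):
    print(f"{heads} heads out of 10: p = {binomial_p_value(heads, 10):.4f}")
```

Five heads out of ten is unremarkable for a fair coin, so its p-value is large (about 0.62). Ten heads out of ten would happen about once in a thousand tries with a fair coin (p is about 0.001), which is why it counts as evidence against "the coin is fair". That also hints at why it has its own name: it is one specific probability, the one computed assuming the null hypothesis is true.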
Interesting! I didn't realize doctors had the same problem. And as for stupid names--Yes! The worst example I can think of is degrees of freedom. It's such an esoteric concept that is unnecessarily complicated.
@QuantPsych Absolutely correct 💯🎯 as to degrees of freedom.
To put a spin of humour on the seriousness of STEPS and poor choices to describe Statistical stuff, I'd say it's probably exactly what happened to that person you mentioned who said "WE HAVE THE POPULATION", hence no need for any type of statistical inference as we know everything already. 😆 The word "POPULATION" has a special meaning in Stats compared to how the general public uses the term. To elaborate for a second, the STUDENT POPULATION of a school has one meaning in everyday life, but a very specific, scientific meaning in Statistics.
That person, a teacher of Stats for many years if memory serves me correctly, had a totally street understanding of POPULATION. 😆
Honestly, I wouldn't know whether to laugh or cry were I to be faced with this declaration in a public setting. 😳
Stay well and keep teaching. We need you, and people like you. Literally. Cheers.
I don't think that education does this in bad faith... I think that education, as it is now, tries to give broad knowledge of everything, but not very deep... But at least you get an answer
I agree it's not in bad faith, but it's just the wrong approach. You'll get a broader knowledge of stats if you start from models.
Typical (poor) college level stat courses leave out 1. What the terms actually mean and 2. Really big data. Probably 1 is more important.
Except degrees of freedom. Very few know what it means and very few need to know :)
The music is absolutely awful. Just why?
Because I'm targeting you specifically.
Statistics is last resort mathematics
I've always seen stats as math with error.
28 minutes to say that learning what to do without why we do it is not the path to true mastery. sheesh.