Linear Regression, Clearly Explained!!!

StatQuest with Josh Starmer

Views: 317,565

Published on Jan 15, 2025

Comments • 284

@statquest 2 years ago ⁺²⁷
NOTE: 25:39 I should have (Pfit - Pmean) instead of the other way around.
Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
@tonysvlogs881 1 year ago ⁺⁶⁵
I struggled to understand this topic through a textbook and professors' videos online, and this was just a great explanation. Watching this video made all the pieces finally fit.
@statquest 1 year ago ⁺³
Hooray! :)
@Bang-_-Bang 1 year ago
Yo bruh seriously I don't understand anything 😭😞
@ricardorodrigues4726 3 months ago ⁺¹
The trick to reading hard books is to completely ignore the overly detailed math explanation of a topic you don't understand. Why? Because math needs to be thorough, and in doing so it overcomplicates. I can't tell you how many times, when I was starting out, I struggled to understand an algorithm because I was reading the math of it. Then I would ask a teacher or colleague for help, they would explain to me in ENGLISH what the algorithm did, and then it became obvious, and the math did too afterwards. In any Computer Science field that shows proofs or uses math to explain concepts, ignore the math at first and learn the concept; the math will follow.
@NamNguyễnHoài-f2s 1 year ago ⁺²⁸
I'm an electrical engineer who wanted to learn about machine learning, and your videos helped me understand all the fundamentals of this field. Thank you so much, sir
@statquest 1 year ago ⁺²
Happy to help! :)
@infamousprince88 1 year ago ⁺³⁴
This assisted me in delivering a presentation for a job interview -- landed the opportunity.
Thanks!
@statquest 1 year ago ⁺¹⁴
TRIPLE BAM!!! Congratulations!!! :)
@fooballers7883 10 months ago ⁺¹⁴
I wish I had your lecture 50 yrs ago.... never too late to learn it again today. Thank you
@statquest 10 months ago
Thanks!
@undeadsatan3317 1 year ago ⁺⁹⁰
I'm in my stats class but watching this instead of listening to my professor lol 💀
@statquest 1 year ago ⁺²⁵
bam!
@bread_enjoyer 1 year ago ⁺⁴
Double bam!
@thejasnair9424 11 months ago ⁺³
Ternary Bam!!!
@Alappavan 10 months ago ⁺⁵
Quaternary bam!!!!
@justhydr 10 months ago ⁺⁵
Penta bam!
@imakechannel 1 year ago ⁺¹³
I struggle to understand this topic, but it is great to learn from someone who can explain things in a simple manner with eloquence.
@statquest 1 year ago ⁺¹
Thanks!
@a.qais6697 1 year ago ⁺³
@statquest Agreed. You articulate well and make the subject simple and easy to understand.
@jenwilson7779 1 year ago ⁺¹
Thanks!
@statquest 1 year ago
BAM!!! Thank you so much for supporting StatQuest!!!
@jenwilson7779 1 year ago ⁺²
Of course! I am the person who is embarrassed on the inside that I don't get the stats terms when they are thrown around at work. I've memorized them, so I know what they are, but I really don't understand the "why" or how it all relates. Thank you so much for speaking slowly in your videos, reiterating concepts, sometimes with additional concepts in between, and your humor. It's fun. I'm grateful. @statquest
@ioanamihai4368 2 years ago ⁺⁸
Wow... I was searching for this on your channel last week and I was so sad I didn't find it... luckily I still have time to study for the test. Thank you!
@statquest 2 years ago ⁺²
Good luck! :)
@kimiko495 10 months ago ⁺²
Wow, this makes so much sense! I'm pissed that college professors don't teach like this. It was a waste of time to sit in their classes being so confused right from the start. I can't thank you enough for your videos!
@statquest 10 months ago
Thank you!
@Épicesdelien 10 months ago ⁺²
Love the musical introduction. Such a nice touch to prime you beforehand :)
@statquest 10 months ago
Thank you!
@NaderNabilart 2 years ago ⁺⁵
Great work! The graphics made it super easy to understand.
@statquest 2 years ago ⁺¹
Glad it helped!
@user-bz7fj1fk2m 1 year ago ⁺³
10QUVM for your valuable presentation!!! You made me feel proud in my STAT!!!
@statquest 1 year ago
BAM! :)
@DSharma117 8 months ago ⁺²
Thanks Josh, your channel is recommended by lecturers at Murdoch University, Australia. Worth watching your channel
@statquest 8 months ago ⁺¹
Thanks!
@AbhiSarangan 7 months ago ⁺²
I would be lost without this channel
@statquest 7 months ago ⁺¹
bam! :)
@jamesahn3865 2 years ago ⁺²
I had to buy a study guide book after watching this video...! This is a great video!!
@statquest 2 years ago
Thank you so much for your support!
@anlinli6463 1 year ago ⁺⁴
Thank you Josh! You are truly helping me with the difficult reviewers' comments🤣.
@statquest 1 year ago
Good luck!
@fabslyrics 9 months ago ⁺²
Thank you, friendly folks of the genetics department of NC Chapel Hill. Greetings from Paris, France.
@statquest 9 months ago
Thank you!
@minhvule1608 2 months ago ⁺¹
Never forget to hit like on the videos of this channel. It's totally worth it.
@statquest 2 months ago
Bam!
@antonyshadowbanned 2 years ago ⁺²
You are indeed a God among mortals. And as such you shall be praised. Tons of gratitude for blessing us with your pristine insight Father Majesty.
@statquest 2 years ago ⁺¹
Wow, thank you!
@lmina3425 1 month ago ⁺²
this is just brilliant work!! thank you very much and pls continue teaching :)
@statquest 1 month ago
Thank you! Will do!
@montasiraffan-cu7xk 3 months ago ⁺¹
Literally the most top-notch explanation I have ever seen.
thanks man
@statquest 3 months ago
Thank you!
@bhargav1811 2 years ago ⁺³
This was truly an advanced concept for me !!! :)
@statquest 2 years ago ⁺¹
You can do it! :)
@SofiaBuyanova 2 years ago ⁺⁴
Thank you for the great video! Please note that from 25:49 onward the degrees of freedom for the numerator should be (Pfit-Pmean), otherwise it is less than 0.
@statquest 2 years ago ⁺¹
Thanks! In theory YouTube is supposed to alert people of that typo, but maybe it doesn't always work. (I just tried it and it worked for me).
@lizs7827 7 months ago ⁺¹
Awesome video, thank you Prof. Josh!!!!!
@statquest 7 months ago ⁺¹
Thank you!
@muhammedfarispk1687 1 year ago ⁺¹
I am enjoying this teaching method 😍
@statquest 1 year ago
Thank you!
@muntazirabidi 1 year ago ⁺¹
Thank you. Wonderfully explained!!
@statquest 1 year ago ⁺¹
Glad it was helpful!
@penguinmonk7661 1 year ago ⁺¹
I always have a good time with Statquest :3
@statquest 1 year ago
bam!
@krishnendusinha4409 1 year ago ⁺³
Your videos are awesome! Thanks a lot for making complex concepts simpler. It would be helpful if you clearly explained discrete probability distributions
@statquest 1 year ago
I cover the binomial here: th-cam.com/video/J8jNoF-K8E8/w-d-xo.html
@_hhbk2128 13 days ago ⁺¹
Great explanation! But I just came from the r squared = sse/sse+ssr video and this is adding to my confusion. I'm trying to connect the dots between the two.
@statquest 13 days ago ⁺¹
R-squared quantifies how good the relationship is between the variables on the x and y axes. The p-value we calculate in this video helps us decide how much faith we should have in that relationship.
@marm_sam_bamb 9 months ago ⁺¹
Awesome channel! I just bought your book too!
@statquest 9 months ago
TRIPLE BAM!!! Thank you very much for supporting StatQuest!!!
@Leera56 6 months ago ⁺¹
Thank God I came across your videos. They are making my CFA journey towards statistics less overwhelming by explaining things like you are explaining to a 5 year old...pheeewww.
@statquest 6 months ago
You got this!
@lynnamanda4093 1 year ago ⁺¹
Thank you so much Josh !
@statquest 1 year ago
Thanks!
@isaachiew4906 2 months ago ⁺¹
Hi there, is there a playlist compiling a list of videos of yours relating to machine learning?
@statquest 2 months ago
Yes, you can find everything (including playlists) here: statquest.org/video-index/
@rahoolmahool-programming5499 1 year ago ⁺¹
I got pregnant two times while learning SGD from you. This is the hundredth time I'm jumping from one video to another.
@statquest 1 year ago ⁺¹
ok
@mahammadodj 2 years ago ⁺²
Does n equal the number of data points in the F equation? For example, should we take 9 for n at 22:40?
@statquest 2 years ago ⁺¹
Yes
@anelazikic5114 6 months ago ⁺¹
Thank you so much for this video
@statquest 6 months ago
Happy to help!
@Jason-o5s 4 months ago
Cheer~~~arranged in or extending along a straight or nearly straight line.😊
@statquest 4 months ago
double :)
@LeBlayout 2 months ago ⁺¹
Thanks Josh Starmer
@statquest 2 months ago ⁺¹
Bam! :)
@javo11 1 month ago ⁺¹
Great! Thank you!
@statquest 1 month ago
Thanks!
@Jiacancan 9 months ago ⁺¹
thank you. that was very clear
@statquest 9 months ago
Thanks!
@B-hooktuber 9 months ago ⁺¹
Cool merch you could probably easily create would be a workbook to pair with your book, where we could practice calculating R2, for example, in different scenarios. That way, every time you learn a new concept you can practice doing the formulas :) I'd totally buy that 😏 And maybe links to extra videos or explanations on the concepts that are a little harder to comprehend for people who are completely new to this field and a little slow lol (like linear regression 😅)
@statquest 9 months ago
That's a great idea!
@12PEN12 1 year ago ⁺¹
Hats off to StatQuest!!!
@statquest 1 year ago
Thank you!
@toribentley2409 5 months ago
This is amazing, thanks a lot @statquest. Please can you also do a video on linear mixed models and generalised linear mixed models? There are only a few videos about them on YouTube, so it would be really helpful. Thank you for the good work
@statquest 5 months ago
I'll keep those topics in mind.
@kushagrastripathi 1 year ago ⁺¹
Very helpful. Thank you
@statquest 1 year ago
Thanks!
@umasingh3601 6 months ago ⁺¹
Best explanation ❤
@statquest 6 months ago
Thanks a lot 😊!
@mathematics6199 10 months ago ⁺¹
Hey hi, R squared can be negative as well right?
@statquest 10 months ago ⁺¹
Not in the context of linear regression. In other contexts, though, it can be.
@mathematics6199 10 months ago ⁺¹
@statquest R^2 is just a metric, right? I can set the coefficients of the independent variables in such a way that variance(error) exceeds variance(y). Since variance(error) = variance(y* - y), where y* is the inferred value and y is the actual value, I can always make y* - y arbitrarily large for one data point by choosing appropriate coefficients. Or am I wrong? Please correct me.
@statquest 10 months ago ⁺¹
@mathematics6199 Yes, in theory, you can do that - but that's not linear regression. In linear regression we don't just set the coefficients to whatever we want. We set them so that they minimize the sum of the squared residuals. And this is why R^2 isn't negative in this context. However, in other contexts, where you can do whatever you want, yes, it can be negative.
@mathematics6199 10 months ago
@statquest Thank you so much.
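The exchange above can be checked numerically. What follows is only a minimal sketch in plain Python: the five data points are made up, and the helper names `fit_line` and `r_squared` are hypothetical, not taken from the video. It computes R^2 once for the least-squares line and once for coefficients picked arbitrarily.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """R^2 = (SS(mean) - SS(fit)) / SS(mean) for the given line."""
    mean_y = sum(ys) / len(ys)
    ss_mean = sum((y - mean_y) ** 2 for y in ys)
    ss_fit = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return (ss_mean - ss_fit) / ss_mean


# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]

slope, intercept = fit_line(xs, ys)
r2_least_squares = r_squared(xs, ys, slope, intercept)

# Coefficients chosen by hand (not by least squares): a line sloping the wrong way.
r2_arbitrary = r_squared(xs, ys, slope=-3.0, intercept=10.0)

print(r2_least_squares, r2_arbitrary)
```

Because the horizontal line at the mean is itself one of the candidate lines, the least-squares fit can never do worse than it, which is why the first value stays between 0 and 1 while the hand-picked line goes negative.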
@johnlemon1595 1 year ago ⁺¹
Hi Josh, while getting to R^2, you give the formula y = (data-mean)^2. This contradicts your StatQuest "Fitting a line to the data", where your formula was "(b-y1)^2+(b-y2)^2+...", meaning "(intercept-data)^2". Now I already understood that by squaring the difference you get the same positive value, so the order doesn't matter for this purpose. Is there another reason why you put it in the order "(data-mean)^2" in this video?
Thanks. Love the videos, just watching for fun
@statquest 1 year ago
Since order doesn't matter, it's hard for me to remember to be consistent.
@johnlemon1595 1 year ago ⁺¹
Okay great, I was just wondering if I was missing something here @statquest
@JeanLouisKali 5 months ago
This is the best explanation I have ever come across on Linear Regression. I have a much better intuitive understanding of what the mathematical formulas I was exposed to mean. I do have a question. At 21:42, should the numerator be interpreted as [SS(mean) - SS(fit)]/(Pfit - Pmean) or is it SS(mean) - [SS(fit)/(Pfit - Pmean)]? The position of the square brackets is not clear to me. Kindly clarify.
@statquest 5 months ago ⁺¹
It's the former. It should be [SS(mean) - SS(fit)]/(Pfit - Pmean)
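For reference, here is the full F statistic written out with the bracketing from the reply above. The numerator comes from that reply, and the denominator SS(fit)/(n - Pfit) is the one other commenters on this page ask about. The typesetting and the subscripted symbols are an editorial rendering, not a copy of the video's slide.

```latex
F = \frac{\bigl[\,SS(\text{mean}) - SS(\text{fit})\,\bigr] \,/\, (p_{\text{fit}} - p_{\text{mean}})}
         {SS(\text{fit}) \,/\, (n - p_{\text{fit}})}
```

For a straight line fitted to the data, the fit has two parameters (intercept and slope) and the mean has one, so $p_{\text{fit}} - p_{\text{mean}} = 1$.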
@ashutoshshrivastava1305 2 years ago ⁺¹
Amazing explanation
@statquest 2 years ago
Thanks!
@looklook6075 11 months ago ⁺²
I was always wondering why the model chooses to use R2 rather than the absolute value of R, until you drew that polynomial out of all the sums of squares. It makes sense now
@statquest 11 months ago
Hooray!
@gnosmik 2 years ago ⁺²
This is an excellent video Josh, thank you! I understood everything well until you explained the p-value at 23:58. So we were using a dataset of mouse size/weight and weight/tail length/body length, but I'm confused about where the 'random dataset' comes from when you calculate the p-value. Could you explain a bit further about this please?
@statquest 2 years ago ⁺⁶
The idea is to give you an intuitive sense of what the p-values associated with linear regression represent. So, to start with, we had 9 data points (9 pairs of weight/height measurements) and fitted a line to it and calculated the F value. That is the "observed" F value generated from the original, raw data. Now pair 9 random values for height (and these could be any reasonable values for height that you randomly select) with 9 random values for weight (and these could be any reasonable values for weight). Calculate the F for those pairs of random values and put that in a histogram. Then repeat until we've done that a lot of times and compare the observed F value from the original data to the histogram.
@gnosmik 2 years ago
@statquest Thanks for explaining all that. Much appreciated. So those 'random values' are completely random, just made up within the range of the normal dataset, right? Then when we are calculating F and p values in SPSS or R, does that software go through this process? These might be slightly silly questions, hopefully I'm not too far off!
@statquest 2 years ago ⁺⁴
@gnosmik That's the idea. However, as mentioned at 25:26, in practice, people (and software) just use an F-distribution (which is an equation for a curved line) to calculate the p-value. The idea of using random data is just to give you an intuition of what the curved line created by the F-distribution represents.
@gnosmik 2 years ago ⁺¹
@statquest Excellent! Thanks Josh
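The random-dataset procedure described in this thread can be sketched in a few lines of plain Python. Everything specific here is an assumption made for illustration: the nine data points are invented, the random values are drawn uniformly from 0 to 10, and `f_statistic` is a hypothetical helper. As the reply above notes, real software skips the simulation and uses the F-distribution directly.

```python
import random


def f_statistic(xs, ys):
    """F for a straight-line fit: [(SS(mean) - SS(fit)) / (p_fit - p_mean)] / [SS(fit) / (n - p_fit)]."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_mean = sum((y - mean_y) ** 2 for y in ys)
    ss_fit = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    p_fit, p_mean = 2, 1  # line: intercept + slope; mean: intercept only
    return ((ss_mean - ss_fit) / (p_fit - p_mean)) / (ss_fit / (n - p_fit))


# Invented "observed" data: 9 pairs, roughly linear.
weights = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
sizes = [1.4, 1.9, 3.2, 3.6, 5.3, 5.8, 7.1, 8.4, 8.7]
observed_f = f_statistic(weights, sizes)

# Build the "histogram" of F values from random datasets of the same size.
random.seed(0)  # fixed seed so the sketch is repeatable
n_random = 2000
random_fs = []
for _ in range(n_random):
    random_x = [random.uniform(0.0, 10.0) for _ in range(9)]
    random_y = [random.uniform(0.0, 10.0) for _ in range(9)]
    random_fs.append(f_statistic(random_x, random_y))

# p-value: fraction of random datasets whose F is at least as extreme as the observed F.
p_value = sum(1 for f in random_fs if f >= observed_f) / n_random
print(observed_f, p_value)
```

With only a few thousand random datasets the estimate is coarse, which is one practical reason to use the F-distribution curve instead.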
@simplemindedperson 1 year ago
Thank you for the nice video! Regarding your explanation of the F curves around 25:53, shouldn't it be (p_{fit} - p_{mean})=1? In addition, would you please provide the link to your video about degrees of freedom if it is already available?
@statquest 1 year ago ⁺²
Yes, that is a typo. And, unfortunately, I haven't made the degrees of freedom video yet. However, it's still on the todo list.
@simplemindedperson 1 year ago ⁺¹
@statquest Thank you! I look forward to your new ones
@jix8874 1 year ago ⁺¹
@statquest looking forward to the degrees of freedom video too!
@qaziaman8194 1 month ago ⁺¹
If you could please make a video on assumptions of linear regression, that would be helpful.
@statquest 1 month ago
I'll keep that in mind.
@blahblah_06 1 year ago
Around which point do we rotate the line ????????
Beautiful lecture..really easy to understand
@statquest 1 year ago
There are two different ways to fit the line to data. The one most commonly used is to simply do the math and solve for the optimal fit (take the derivative of the sum of the squared residuals with respect to each parameter and solve for where it is equal to 0). However, that method only works in this specific situation. A more general method is based on the "rotate the line" approach that I illustrate in this video. To learn more about it (how to rotate the line), see my video on Gradient Descent: th-cam.com/video/sDv4f4s2SB8/w-d-xo.html
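The "do the math" route mentioned in the reply above can be written out for a single predictor. This is the standard textbook derivation, added here as an editorial sketch; the symbols $a$ (intercept) and $b$ (slope) are notation chosen for this note, not the video's.

```latex
\begin{aligned}
SSR(a, b) &= \sum_{i=1}^{n} \bigl(y_i - (a + b\,x_i)\bigr)^2 \\
\frac{\partial SSR}{\partial a} &= -2 \sum_{i=1}^{n} \bigl(y_i - a - b\,x_i\bigr) = 0 \\
\frac{\partial SSR}{\partial b} &= -2 \sum_{i=1}^{n} x_i \bigl(y_i - a - b\,x_i\bigr) = 0 \\
\Rightarrow\quad b &= \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i}(x_i - \bar{x})^2},
\qquad a = \bar{y} - b\,\bar{x}
\end{aligned}
```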
@МаксЧерн-х2л 1 year ago ⁺¹
Thank you ever so much!
@statquest 1 year ago
You're very welcome!
@theolau7335 1 year ago ⁺¹
Very nice, thank you
@statquest 1 year ago
:)
@aitorolaso1352 1 year ago ⁺¹
absolute masterpiece
@statquest 1 year ago
Thank you!
@mmkvhornet7522 8 months ago ⁺¹
thanks for the video
@statquest 8 months ago
You're welcome!
@timbui5556 1 year ago ⁺¹
Thank you for making this series of statistics videos. One question please: I want to calculate the least squares growth rate of sales for a company. Would I get a "higher quality" growth rate by using quarterly sales (40 data points) vs. annual sales (10 data points)? Would the seasonality of quarterly sales (higher Christmas sales) distort the growth rate? Thanks,
@statquest 1 year ago ⁺¹
It sort of depends on how exactly you want to model and what you want to get out of the model. If you want to take seasonality into account, then you need to fit a periodic function (like a sine function) to your quarterly data. That said, the easiest thing to do would be to start with annual sales and see how useful that is.
@timbui5556 1 year ago ⁺¹
@statquest Thank you so much for taking the time to answer my question!
@muhammadomarkhayyamkhan3593 8 months ago
Your explanations are wonderful. Please recommend a book that should be studied alongside your videos. Please make videos on the chi-squared distribution, Monte Carlo simulations and hypothesis testing.
Thanks for your valuable help.
@statquest 8 months ago
My favorite book to go along with my videos is The StatQuest Illustrated Guide To Machine Learning. You can get it here: statquest.org/statquest-store/
@utku_bambu 2 years ago ⁺²
thank you for this
@statquest 2 years ago
Thanks!
@lattoufj 9 months ago
Hi Josh,
Very nice video!
Shouldn't the distances from the points to the line be perpendicular?
@statquest 8 months ago
If they were perpendicular, then we would lose the relationship between the variable on the x-axis and the variable on the y-axis, and the whole point is to use an x-axis value to predict a y-axis value. Thus, the residuals are parallel with the y-axis - this preserves the relationship that we want to use to make predictions.
@prithvidhyani2002 9 months ago
Great video overall! But I'm a little confused by your description of calculating a p-value for the R^2. Does this mean we are treating R^2 as a random variable itself and looking at its distribution? Because to me it seems like it is the f-statistic that follows an f-distribution, hence we are calculating a p-value for the f-stat, not the R^2 itself, which (correct me if I'm wrong) does not follow any specific distribution. So what exactly is the connection between the R^2 and the f-stat and its corresponding p-value?
@statquest 9 months ago
The f-statistic is what we use to calculate the p-value for the r-squared.
@JasonKaros 2 years ago ⁺¹
Why was the original Linear Regression video removed for this one? Is the information in this one more accurate or clearer?
@statquest 2 years ago ⁺¹³
Without telling me, YouTube put the original video behind a paywall, so I re-uploaded it so it would still be free
@khoiphamang4166 2 years ago
I have a question: at 5:24, why is the variance calculated by dividing by n instead of n-1? I thought all the observed data points are just a sample of a bigger population that includes data points we haven't observed yet. I'm sorry if my English confuses you; it isn't my mother tongue
@statquest 2 years ago ⁺²
In this context, the way we use variation means that the denominator will cancel out, so it really doesn't matter which one (n or n-1) we use.
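The cancellation mentioned in the reply above can be made explicit. Writing the variances as sums of squares divided by the same denominator $d$ (either $n$ or $n-1$), R^2 becomes the following; this is an editorial worked step using the SS(mean) and SS(fit) notation found elsewhere on this page.

```latex
R^2 = \frac{\mathrm{Var}(\text{mean}) - \mathrm{Var}(\text{fit})}{\mathrm{Var}(\text{mean})}
    = \frac{SS(\text{mean})/d \;-\; SS(\text{fit})/d}{SS(\text{mean})/d}
    = \frac{SS(\text{mean}) - SS(\text{fit})}{SS(\text{mean})}
```

The denominator $d$ appears in every term, so it drops out regardless of whether it is $n$ or $n-1$.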
@kuraldeepdives9319 1 year ago
@26:21 Should the curves say (Pfit - Pmean) = 1?
@statquest 1 year ago ⁺¹
Yes! That's funny that it's been like that forever, but you finally caught it. Thanks!
@kuraldeepdives9319 1 year ago ⁺¹
@statquest Haha the credit goes to you for teaching the concepts so well to a newbie! BAM! 😁
@derekc.5063 7 months ago
At 15:15, how does least squares cause any useless variable to be multiplied by 0? I thought Lasso regression excludes variables.
@statquest 7 months ago
Least squares can do it in principle, but not very well. Lasso is much more effective, and lasso also works when there are more variables than data.
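To see what "in principle, but not very well" can look like, here is a small sketch in plain Python under stated assumptions: the data are made up, `x2` is an arbitrary variable that played no part in producing `y`, and the two slopes are solved from the two-predictor least-squares equations on centered data (a textbook formula, not code from the video).

```python
# Made-up data: y is roughly 2 * x1 plus small deviations; x2 is arbitrary.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [3.0, 1.0, 4.0, 1.0, 5.0, 2.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]


def centered(values):
    """Subtract the mean so the intercept drops out of the slope equations."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


c1, c2, cy = centered(x1), centered(x2), centered(y)
s11, s22, s12 = dot(c1, c1), dot(c2, c2), dot(c1, c2)
s1y, s2y = dot(c1, cy), dot(c2, cy)

# Solve the 2x2 normal equations for the two slopes (Cramer's rule).
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det  # coefficient on the useful variable
b2 = (s11 * s2y - s12 * s1y) / det  # coefficient on the arbitrary variable

print(b1, b2)
```

In this toy example the coefficient on `x2` comes out small but not exactly 0, which is consistent with the reply: least squares shrinks an unhelpful term toward 0 only as far as the noise in the data allows.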
@catcen9631 2 years ago ⁺¹
insanely good video
@statquest 2 years ago ⁺¹
Thank you! :)
@lilysun1296 2 years ago
Thanks for the video. Could you please explain more about why it is SS(fit)/(n-pfit) instead of SS(fit)/n at 22:48? Thanks a lot.
@statquest 2 years ago ⁺¹
This has to do with "degrees of freedom" and one day I hope to cover that topic in full.
@zauraiz 2 years ago ⁺¹
@statquest Looking forward to the degrees of freedom video! Parameters have always been a confusing topic for me
@ritubhatt7367 2 years ago ⁺¹
I am not able to find the video 'Fitting a line to the data'
@statquest 2 years ago
I have contacted YouTube about this problem, but, unfortunately, they are all on vacation until next week. :( The good news is that this video does a pretty good job summarizing the concepts in that other video.
@ajalanbrown2200 11 months ago ⁺¹
i had to like just because of the song
@statquest 11 months ago
bam! :)
@Tatya1905 9 months ago
What is the value n (that was mentioned while explaining the degrees of freedom)?
@statquest 9 months ago
n = the number of data points in the graph.
@joshuaaddo1609 1 year ago ⁺¹
This is great
@statquest 1 year ago
Thanks!
@t217265 3 months ago ⁺¹
I'm so confused. Am I supposed to draw the squares? Where are the squares? I need help 😢. I'm never going to pass this class.
@statquest 3 months ago
This video attempts to explain the concepts behind how linear regression works. However, you don't actually do these things in practice. In practice you use a program, like R, to do it for you. For details, see: th-cam.com/video/u1cc1r_Y7M0/w-d-xo.html
@에헤헿-l7v 1 year ago
I don't understand why least squares can cause any term that will make ss(fit) worse to be multiplied by 0. Is it because mean squares differential the equation?
15:20
@에헤헿-l7v 1 year ago
or is it because things like ridge regression can shrink the coefficients to 0?
@statquest 1 year ago
Least squares minimizes the sum of the squared residuals and if setting a parameter = 0 reduces the SSR, then that's what will happen.
@abdullahs9500 1 year ago ⁺¹
That was a really mice explanation.. Thank you!
@statquest 1 year ago
Ha! Nice one! :)
@abdullahs9500 1 year ago ⁺¹
🌹😄
@streampunksheep 1 year ago
I am going to statquest Isle!~
@AutoDisheep 1 year ago
The greatest island on earth!
@statquest 1 year ago
Bam!!
@VirtuosicBeats 1 year ago
Awesome, but can we do this without squaring? Why can't we just sum the residuals without any squaring? It looks like it should give us the sum of all distances, and then we could plot it in the same way and pick the rotation that gives us the smallest sum of non-squared residuals, and it should still work. Curious why we choose to square it. Thank you so much for the video
@statquest 1 year ago ⁺¹
If the "distances" below the line are negative, they will cancel out the ones above them, so that's a problem. However, we could then take the absolute value so that everything is positive. This could work if Linear Regression was actually solved the way I've presented it here. However, in practice, when you square the distances, you can solve for the optimal parameters directly by taking the derivative of the squared residuals with respect to each parameter, setting those derivatives equal to 0 and then solving for the parameter values.
@VirtuosicBeats 1 year ago ⁺¹
@statquest Thank you so much, it makes sense now
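The cancellation problem described in the reply above is easy to see with numbers. The sketch below uses made-up data and a hypothetical helper, `residuals_for_slope`, that builds a line through the point (mean of x, mean of y) with whatever slope is given. It compares the plain sum, the absolute sum, and the squared sum of the residuals for three very different lines.

```python
# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)


def residuals_for_slope(slope):
    """Residuals around the line with this slope that passes through (mean_x, mean_y)."""
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


summary = {}
# 0.0 is the horizontal line at the mean, 0.97 is the least-squares slope
# for this particular data, and -5.0 is a deliberately terrible line.
for slope in (0.0, 0.97, -5.0):
    res = residuals_for_slope(slope)
    summary[slope] = {
        "plain_sum": sum(res),
        "absolute_sum": sum(abs(r) for r in res),
        "squared_sum": sum(r ** 2 for r in res),
    }

for slope, values in summary.items():
    print(slope, values)
```

All three plain sums come out at essentially 0 because positive and negative residuals cancel, so the plain sum cannot tell a good line from a bad one. The absolute and squared sums both can; squaring has the added convenience, noted in the reply, that the minimum can be found directly with derivatives.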
@Slayer1407-d9d 11 months ago
Question. Why are we calculating the R2 value and the p value? Is it the industry standard? Or else what led to the decision to include them with linear regression? Theoretically Lin reg is complete before that, right? (Making concepts clear)
@statquest 11 months ago ⁺¹
If you just want to fit a line to data, you can use the method of least squares. However, if you want to quantify how well that line fits your data, then you use Linear Regression. Linear Regression consists of using least squares to fit the line to the data and then calculating r^2 and its p-value to evaluate how well that line fits the data.
@Slayer1407-d9d 11 months ago
@statquest still confused.. as you said 'how well it fits the data', so the r2 and p value are tests for evaluation, right? Don't they have alternatives? Or is it necessary to do exactly these steps? I'll still get a logistic regression model but it may not be the best one without them?
Or are you saying that these, or some other alternative tests, are necessary to assess the model, and this repeats iteratively until the best fit?
@statquest 11 months ago ⁺¹
@Slayer1407-d9d They do have alternatives, so, as you say, you might think of r^2 and its corresponding p-values as the 'industry standards'. Pretty much every program that offers a linear regression function will give you those as outputs. However, there are alternatives, and you can read more about them here: developer.nvidia.com/blog/a-comprehensive-overview-of-regression-evaluation-metrics/ among other places.
@Slayer1407-d9d 11 months ago
@statquest Thanks a lot for clearing that up
@yuji25290 14 hours ago
So why divide SS(fit) by n - p-fit instead of just n?
@statquest 9 hours ago ⁺¹
SS(fit) will be smaller for a more complex model to begin with, so we need to compensate for the complexity of the model.
@yuji25290 6 hours ago
It's not intuitive for me to understand. I think I need an example to get there. So, it's basically making SS(fit) a smaller value? Btw I also saw something like n - pfit - 1. What is the -1 for? Thank you for your response even though this is a two-year-old video!
@statquest 6 hours ago
@yuji25290 Unfortunately, the best I can do at this point is at 22:42. For the equations that include a "-1", this probably has to do with the fact that these are all estimates. To learn more about the -1, see: th-cam.com/video/sHRBg6BhKjI/w-d-xo.html
@НиколайНовичков-е1э 2 years ago ⁺¹
Thank you :)
@statquest 2 years ago
You're welcome!
@sopeadaralegbe8077 1 year ago
Is the residual the difference between the observed value of the dependent variable and the predicted value, or the difference between the overall mean of the dependent variable and the observed value?
@statquest 1 year ago
The residual is the difference between the observed and predicted values.
@stevinbrat 11 months ago ⁺¹
you are a genius!
@statquest 11 months ago
Thanks!
@alabenmed4661 2 years ago
Hello, I love watching your videos. They are entertaining and educational, but I saw some other videos on ways to determine the intercept and slope of a line.
I'm wondering if you have a video about that, or is there a better approach?
@statquest 2 years ago ⁺¹
There are a number of ways to do it. One is to use an analytical solution. Take the derivatives of the equation with respect to the different variables (in this case, the slope and the intercept) and then solve for when those derivatives are equal to 0. For linear regression, this is a fine way to solve the problem, but it only works in this one case. A more general solution is to use something called Gradient Descent. This works on regression problems and many, many more. For details about Gradient Descent, see: th-cam.com/video/sDv4f4s2SB8/w-d-xo.html
@alabenmed4661 2 years ago ⁺¹
@statquest thanks man, have a great day
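The two approaches named in the reply above can be compared side by side. This is a minimal sketch in plain Python with made-up data; the learning rate and the number of steps are arbitrary choices that happen to work for these five points, not recommended defaults, and nothing here is taken from the linked video.

```python
# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]

# Analytical solution: set the derivatives of the SSR to 0 and solve.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
exact_slope = sxy / sxx
exact_intercept = mean_y - exact_slope * mean_x

# Gradient descent: start from a guess and repeatedly step downhill on the SSR.
slope, intercept = 0.0, 0.0
learning_rate = 0.01  # assumption: small enough to converge for this data
for _ in range(5000):
    # Derivatives of the sum of squared residuals with respect to each parameter.
    grad_intercept = sum(-2.0 * (y - (intercept + slope * x)) for x, y in zip(xs, ys))
    grad_slope = sum(-2.0 * x * (y - (intercept + slope * x)) for x, y in zip(xs, ys))
    intercept -= learning_rate * grad_intercept
    slope -= learning_rate * grad_slope

print(exact_slope, exact_intercept)
print(slope, intercept)
```

Both routes should land on essentially the same line here. The analytical route is specific to this kind of problem, while the step-by-step loop carries over to models where no closed-form solution exists.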
@adone1234 11 months ago
its like years since u uploaded this
@statquest 11 months ago
I know! This one is classic! It might even be "pre BAM!"
@atharvigupta4250 9 months ago
so is mouse size a confounder?
@statquest 9 months ago
What time point, minutes and seconds, are you asking about?
@hoanglexuan7861 11 months ago
can you do Quantile Regression?
@statquest 11 months ago
I'll keep that in mind.
@ritardstrength5169 29 days ago ⁺¹
18:30 Is F the F distribution?
@statquest 29 days ago
Yes
@vatanrangani8033 1 year ago
so is R squared a correlation coefficient?
@statquest 1 year ago
It is the square of the correlation coefficient.
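For a straight-line fit with one predictor, that statement can be verified directly. The sketch below uses made-up data and plain Python: it computes Pearson's correlation coefficient r from its textbook definition and, separately, R^2 from the sums of squares around the mean and around the least-squares line.

```python
import math

# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Pearson correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

# R^2 from the sums of squares, using the least-squares line.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
ss_mean = syy
ss_fit = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
r_squared = (ss_mean - ss_fit) / ss_mean

print(r, r ** 2, r_squared)
```

The last two printed values agree up to floating-point rounding.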
@mj-gp3lk 2 months ago
Why does the p value need to be small?
pls answer
@statquest 2 months ago
If the end of this video doesn't answer your question satisfactorily, please see: th-cam.com/video/vemZtEM63GY/w-d-xo.html and th-cam.com/video/JQc3yx0-Q9E/w-d-xo.html
@puneetkumarsingh1484 1 year ago
Not that it matters here, but shouldn't the sample variance formula have n-1 instead of n?
@statquest 1 year ago
In this case it doesn't matter.
@adone1234 10 months ago ⁺¹
u just earned a subscriber
@statquest 10 months ago
bam! :)
@urahulk-l6k 6 months ago
Hello, I had one doubt. For calculating multiple F values, are we taking random samples from our original dataset itself? As in, if there are 100 data points in total, will we take 80, 70, or any random number of data points from the 100 to plot F values on the histogram? Could you please help me with this?
@statquest 6 months ago
The example where we use random data is just an example of the concepts behind how the p-value is calculated. In practice, we use a curve generated by the F distribution (see 25:26) that represents what would happen if we had generated an infinite number of random datasets.
@atharvigupta4250 9 months ago
How do you come up with the equation?
@statquest 9 months ago
What time point, minutes and seconds, are you asking about?
@dead_again9839 4 months ago ⁺¹
this is fucking fantastic
@statquest 4 months ago ⁺¹
:)
@sopeadaralegbe8077 1 year ago
what's the difference between RSS and SS(fit) ?
@statquest 1 year ago
They are the same. However, I changed notation so that I could specify which model we were using to make the predictions. SS(fit) is the RSS around the fitted line and the SS(mean) is the RSS around the mean.
@prachirahate1631 10 months ago ⁺¹
awesoommeeeeee!
@statquest 10 months ago
Thanks!
@dklc28 1 month ago
It was great at first and then I lost track after the 19 minute mark
@statquest 1 month ago
Sorry to hear that
@exarchoskanelis84 2 years ago ⁺¹
Legend
@statquest 2 years ago
Thanks!
