Multiple Regression in R, Step by Step!!!

  • Published on Nov 17, 2022
  • This 'Quest starts with a simple regression in R and then shows how multiple regression can be used to determine which parameters are the most valuable. If you want the code, you can get it from the StatQuest GitHub, here: github.com/StatQuest/multiple...
    If you'd like to support StatQuest, please consider...
    Patreon: / statquest
    ...or...
    TH-cam Membership: / @statquest
    ...buy my book, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
    statquest.org/statquest-store/
    ...or just donating to StatQuest!
    www.paypal.me/statquest
    Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
    / joshuastarmer
    #StatQuest
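The workflow the description outlines (a simple regression first, then a multiple regression) can be sketched roughly as follows. This is only a minimal sketch, not the code from the StatQuest GitHub: the measurements are made up for illustration, `simple.regression` is a hypothetical name, and the column names `size`, `weight`, and `tail` and the variable `multiple.regression` are taken from the comments below.

```r
# Hypothetical mouse measurements (made-up values, for illustration only).
mouse.data <- data.frame(
  size   = c(1.2, 2.1, 1.9, 3.4, 4.8, 3.9, 4.1, 5.6, 6.0),
  weight = c(1.0, 1.7, 2.5, 3.2, 4.0, 4.6, 5.0, 5.8, 6.4),
  tail   = c(0.8, 1.1, 1.0, 2.2, 3.3, 2.6, 2.9, 3.8, 4.1)
)

# Simple regression: predict size from weight alone.
simple.regression <- lm(size ~ weight, data = mouse.data)
print(summary(simple.regression))

# Multiple regression: predict size from weight and tail length together.
multiple.regression <- lm(size ~ weight + tail, data = mouse.data)
print(summary(multiple.regression))
```

Comparing the two summaries shows how R-squared, adjusted R-squared, and the p-values change once the second predictor is added.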

Comments • 64

  • @statquest  1 year ago +2

    Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/

  • @smtxtv  a month ago +4

    I'm giving this a thumbs up... just on the intro!

    • @statquest  a month ago

      bam! :)

  • @TT-eg1et  1 year ago +12

    I love your channel and your way of explaining things!
    Thank you

    • @statquest  1 year ago +1

      Thank you! :)

  • @MotorhomeMarx  a month ago +1

    Excellent work. Double-sworded, swiftly slaying the stats serpent from planet R

    • @statquest  a month ago

      Bam! :)

  • @katielui131  2 months ago +2

    This is super, as always. Thanks!

    • @statquest  2 months ago

      Thanks!

  • @francescomaura5581  1 year ago +8

    Great video! As usual, I should say :) What about diff-in-diff (with an R example, possibly)?

    • @statquest  1 year ago +2

      I'll keep that in mind.

  • @GetThePun  1 year ago +3

    always amazing!

  • @RedFeather11  6 months ago +2

    Thanks a lot Sir. 🤩💐

    • @statquest  6 months ago

      Thanks!

  • @muss9306  a month ago +1

    Wow, amazing video, thank you so much

    • @statquest  a month ago

      Thanks!

  • @tabarakalmosawi6659  2 months ago +1

    thank you very very much!!

    • @statquest  2 months ago

      BAM! :)

  • @hrk201  1 year ago +1

    Hey, thank you for your videos, they are really helpful. How do we conduct full inference for a multiple linear regression model?

    • @statquest  1 year ago

      I'm not sure I understand your question. Can you elaborate on it?

    • @hrk201  1 year ago

      @statquest Hey Josh, thanks for responding. I have fit a linear regression model that describes the data best, using test-based and criterion-based model selection. I have been asked to "conduct full inference using the best fit model". I am slightly confused about what needs to be done for this step. Is it just an explanation of the F-statistic and hypothesis tests obtained from the summary of the model?

    • @statquest  1 year ago +1

      @hrk201 That would be my guess, but it's just a guess.
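The thread does not define "full inference", so the following is only one possible reading: pull the per-coefficient tests, the overall F-statistic, and confidence intervals out of the fitted model. It is a minimal sketch with made-up data; the column names `size`, `weight`, and `tail` follow other comments on this page, and the remaining names are hypothetical.

```r
# Hypothetical mouse measurements (made-up values, for illustration only).
mouse.data <- data.frame(
  size   = c(1.2, 2.1, 1.9, 3.4, 4.8, 3.9, 4.1, 5.6, 6.0),
  weight = c(1.0, 1.7, 2.5, 3.2, 4.0, 4.6, 5.0, 5.8, 6.4),
  tail   = c(0.8, 1.1, 1.0, 2.2, 3.3, 2.6, 2.9, 3.8, 4.1)
)

multiple.regression <- lm(size ~ weight + tail, data = mouse.data)
fit.summary <- summary(multiple.regression)

# Estimate, standard error, t-statistic, and p-value for each coefficient.
coefficient.table <- coef(fit.summary)

# Overall F-statistic with its numerator and denominator degrees of freedom.
f.statistic <- fit.summary$fstatistic

# 95% confidence intervals for the intercept and each slope.
confidence.intervals <- confint(multiple.regression, level = 0.95)

print(coefficient.table)
print(f.statistic)
print(confidence.intervals)
```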

  • @NotMadeOfManitobaFlour  2 months ago +1

    The r^2, adjusted r^2, and p-value look good;
    HOOOOOORAY!

    • @statquest  2 months ago

      bam!

  • @user-vy1oz8jz2t  4 months ago

    I'm sorry if I'm asking a stupid question, but why can the p-value for weight tell us that using weight and tail isn't significantly better than using only tail? Thank you so much.

    • @statquest  4 months ago +1

      First you need to understand linear regression: th-cam.com/video/nk2CQITm_eo/w-d-xo.html and then you can find the answer to your question in this video that describes the theory of multiple regression: th-cam.com/video/zITIFTsivN8/w-d-xo.html

    • @user-vy1oz8jz2t  4 months ago +1

      @statquest thank you so much Sir, I appreciate it a lot

  • @christopherwitt6283  1 year ago +2

    Please can you do a video on multinomial logistic regression in R?

    • @statquest  1 year ago

      I'll keep that in mind.

  • @jeffrisher6965  4 months ago

    Sorry if this is a stupid question, but is there a good way to format the results table? For instance, if I wanted the beta to be rounded to 3 digits, and the t and p values to be rounded to 4 digits?

    • @statquest  4 months ago +1

      Good question! The only way I can think of doing it is drawing it yourself using the original values (in this case, they are stored in the variable "multiple.regression") and running them through the round() function.

    • @jeffrisher6965  4 months ago

      @statquest Thanks. I bought your machine learning book, but have not had a single minute to sit down and read any of it. Maybe in a couple months...

    • @statquest  4 months ago +1

      @jeffrisher6965 Thank you for your support! I hope you enjoy the book when you have time to read it. :)
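For the formatting question in this thread, the suggestion of rebuilding the table from the stored values and passing them through `round()` might look something like the sketch below. The data are made up, `results.table` and its column names are hypothetical, and `multiple.regression` is the variable name mentioned in the reply.

```r
# Hypothetical mouse measurements (made-up values, for illustration only).
mouse.data <- data.frame(
  size   = c(1.2, 2.1, 1.9, 3.4, 4.8, 3.9, 4.1, 5.6, 6.0),
  weight = c(1.0, 1.7, 2.5, 3.2, 4.0, 4.6, 5.0, 5.8, 6.4),
  tail   = c(0.8, 1.1, 1.0, 2.2, 3.3, 2.6, 2.9, 3.8, 4.1)
)

multiple.regression <- lm(size ~ weight + tail, data = mouse.data)

# The coefficient matrix that summary() prints, as plain numbers.
coefs <- coef(summary(multiple.regression))

# Rebuild the table with the requested rounding:
# betas to 3 digits, t-values and p-values to 4 digits.
results.table <- data.frame(
  beta    = round(coefs[, "Estimate"], 3),
  t.value = round(coefs[, "t value"], 4),
  p.value = round(coefs[, "Pr(>|t|)"], 4)
)

print(results.table)
```

Note that very small p-values round to 0 at 4 digits, so it may be worth reporting them as "< 0.0001" instead.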

  • @faizahkhalid9468  6 months ago

    How do you know the relationship between tail and weight? Are there any decision rules? I don't get how to draw conclusions using R-squared and p-values.

    • @statquest  6 months ago

      A small p-value would cause us to reject the hypothesis that random noise generated the data. For details about p-values, see: th-cam.com/video/vemZtEM63GY/w-d-xo.html
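As a rough illustration of the reply above, R-squared and the overall p-value can be pulled out of the model summary and compared against a threshold. This is a minimal sketch with made-up data; the 0.05 cutoff in `alpha` is a common convention and an assumption here, not a rule stated in the thread.

```r
# Hypothetical mouse measurements (made-up values, for illustration only).
mouse.data <- data.frame(
  size   = c(1.2, 2.1, 1.9, 3.4, 4.8, 3.9, 4.1, 5.6, 6.0),
  weight = c(1.0, 1.7, 2.5, 3.2, 4.0, 4.6, 5.0, 5.8, 6.4),
  tail   = c(0.8, 1.1, 1.0, 2.2, 3.3, 2.6, 2.9, 3.8, 4.1)
)

fit.summary <- summary(lm(size ~ weight + tail, data = mouse.data))

# Share of the variation in size that the model explains.
r.squared <- fit.summary$r.squared
adjusted.r.squared <- fit.summary$adj.r.squared

# Overall p-value: the upper tail of the F distribution at the observed F.
f <- fit.summary$fstatistic
overall.p.value <- unname(
  pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)
)

# Assumed significance threshold (a convention, not part of the source).
alpha <- 0.05
reject.random.noise <- overall.p.value < alpha

print(c(r.squared = r.squared, adjusted.r.squared = adjusted.r.squared))
print(overall.p.value)
print(reject.random.noise)
```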

  • @katherinechau5594  1 year ago +1

    So when do you use MLR versus multidimensional scaling?

    • @statquest  1 year ago

      Multidimensional Scaling is pretty different. To learn about it, first learn about PCA (only 5 minutes long: th-cam.com/video/HMOI_lkzW08/w-d-xo.html ) and then MDS: th-cam.com/video/GEn-_dAyYME/w-d-xo.html

  • @nichananwanchai9910  1 year ago +1

    What is the mouse data? I'm kind of confused, but your video really helped me.

  • @sighsha3657  5 months ago +1

    Why do the results for tail come from the linear regression with weight, and vice versa?

    • @statquest  5 months ago

      What time point, minutes and seconds, are you asking about?

    • @sighsha3657  5 months ago

      @statquest Starting at 6:13

    • @statquest  5 months ago +1

      @sighsha3657 At that point we are testing how well we can predict "size" with and without specific variables in the model. So we see how well we can predict "size" with and without "weight" and we see how well we can predict "size" with and without "tail length". These tests help us assess how useful it is to use "weight" or "tail length" to predict "size". A small p-value suggests that a variable is useful.

    • @yourube4367  11 days ago

      Yeah, I'm struggling with this bit too. It feels like the coefficient line for 'weight' should be comparing the full model against a model with just weight as a predictor, but the explanation suggests the full model is being compared to a model with only tail length as a predictor.
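One way to see which models the "weight" row compares is to run the comparison explicitly. In the sketch below (made-up data, hypothetical variable names), the p-value on the "weight" line of the full model's summary matches the p-value from an F-test of the tail-only model against the full model. In other words, that line asks whether adding weight helps once tail is already in the model.

```r
# Hypothetical mouse measurements (made-up values, for illustration only).
mouse.data <- data.frame(
  size   = c(1.2, 2.1, 1.9, 3.4, 4.8, 3.9, 4.1, 5.6, 6.0),
  weight = c(1.0, 1.7, 2.5, 3.2, 4.0, 4.6, 5.0, 5.8, 6.4),
  tail   = c(0.8, 1.1, 1.0, 2.2, 3.3, 2.6, 2.9, 3.8, 4.1)
)

# Reduced model (tail only) and full model (weight and tail).
tail.only  <- lm(size ~ tail, data = mouse.data)
full.model <- lm(size ~ weight + tail, data = mouse.data)

# p-value reported on the "weight" line of the full model's summary.
p.from.summary <- coef(summary(full.model))["weight", "Pr(>|t|)"]

# p-value from explicitly comparing the reduced model to the full model.
comparison   <- anova(tail.only, full.model)
p.from.anova <- comparison[2, "Pr(>F)"]

# The two agree up to floating-point error.
print(all.equal(p.from.summary, p.from.anova))
```

Swapping the roles of `weight` and `tail` gives the matching result for the "tail" line.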

  • @LuisSantiago-xo4fm  1 year ago +1

    What if the relationship between Y and one of the Xs is not linear?

    • @statquest  1 year ago

      Then you might need to use a different method.

    • @LuisSantiago-xo4fm  1 year ago

      Is there any video of yours on that? This is actually a matter that gets me a bit confused 😅

    • @statquest  1 year ago +1

      @LuisSantiago-xo4fm When the relationship is non-linear, you can try regression trees: th-cam.com/video/_L39rN6gz7Y/w-d-xo.html and th-cam.com/video/g9c66TUylZ4/w-d-xo.html
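The reply above points to regression trees. As a different and simpler option that stays within `lm()`, a curved relationship can sometimes be handled by adding a squared term. This is a swapped-in technique, not the method recommended in the thread, and the data and names below are made up for illustration.

```r
# Hypothetical data where y grows roughly with the square of x (made up).
curved.data <- data.frame(
  x = 1:10,
  y = c(1.1, 3.9, 9.2, 15.8, 25.3, 35.7, 49.4, 63.8, 81.1, 99.6)
)

# Straight-line fit versus a fit that also includes x squared.
straight.fit <- lm(y ~ x, data = curved.data)
curved.fit   <- lm(y ~ x + I(x^2), data = curved.data)

# R-squared for each fit, plus an F-test of whether the squared term helps.
print(summary(straight.fit)$r.squared)
print(summary(curved.fit)$r.squared)
print(anova(straight.fit, curved.fit))
```

The model is still linear in its coefficients, so the usual summary output and p-values apply to it.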

  • @Cheese_Coffee_  5 months ago +1

    StatQuest is TOTES CRAY CRAY🤣

    • @statquest  5 months ago

      Totes! :)

  • @danielcontreras3744  2 months ago +1

    the best

    • @statquest  2 months ago

      Thanks!

  • @montahatfifha4389  1 year ago +1

    Hey! Should I perform any tests beforehand, or not?

    • @montahatfifha4389  1 year ago

      Is it better to fit this model with R or Python? And is it okay to have 20 observations per variable?

    • @statquest  1 year ago +1

      It depends on what you mean by tests. However, usually multiple regression fits the model and then tests each variable as described. So this would be regression first, tests second.

    • @statquest  1 year ago

      20 observations per variable is fine. And it's up to you if you want to use R or Python.

    • @montahatfifha4389  1 year ago

      @statquest Um, I thought I should test for multicollinearity, heteroscedasticity, and stationarity, and make any corrections, before fitting the model?!

    • @montahatfifha4389  1 year ago

      @statquest Can I please contact you on a more practical platform? I have some points of confusion ://

  • @dawmi3140  6 months ago

    How do I transform large data to be like the smaller values in the video?

    • @statquest  6 months ago

      What time point, minutes and seconds, are you asking about?

  • @miles6939  1 year ago +1

    Matlab?

    • @statquest  1 year ago +2

      Maybe one day!