Random Forest in Machine Learning: Easy Explanation for Data Science Interviews

  • Published on 30 Jun 2024
  • Random Forest is one of the most useful pragmatic algorithms for fast, simple, flexible predictive modeling. In this video, I dive into how Random Forest works, how you can use it to reduce variance, what makes it “random,” and the most common pros and cons associated with using this method.
    Variance of average of correlated random variables stats.stackexchange.com/quest...
    🟢Get all my free data science interview resources
    www.emmading.com/resources
    🟡 Product Case Interview Cheatsheet www.emmading.com/product-case...
    🟠 Statistics Interview Cheatsheet www.emmading.com/statistics-i...
    🟣 Behavioral Interview Cheatsheet www.emmading.com/behavioral-i...
    🔵 Data Science Resume Checklist www.emmading.com/data-science...
    ✅ We work with Experienced Data Scientists to help them land their next dream jobs. Apply now: www.emmading.com/coaching
    // Comment
    Got any questions? Something to add?
    Write a comment below to chat.
    // Let's connect on LinkedIn:
    / emmading001
    ====================
    Contents of this video:
    ====================
    00:00 Introduction
    01:09 What Is Random Forest?
    02:10 How Random Forest Works
    03:53 Why Is Random Forest Random?
    04:20 Random Forest vs. Bagging
    04:57 Hyperparameters
    06:18 Variance Reduction
    09:04 Pros and Cons of Random Forest

Comments • 24

  • @user-hc4bo5mn4j
    @user-hc4bo5mn4j 1 year ago +1

    Very clear explanation! Thank you so much!

  • @alanzhu7538
    @alanzhu7538 1 year ago

    Keep up the awesome work, Emma! I watched your video a year ago and got a data science job. Now I'm starting to forget some ML models that I don't use often, and this is a very good way to refresh my memory on them!

  • @evag3014
    @evag3014 1 year ago

    Looking forward to the notes!! Thanks for sharing, Emma!!!

  • @yuegao5575
    @yuegao5575 1 year ago

    Great video! Thanks for making it. One minor comment: at 6:56, sigma^2/k does not actually come from the CLT; it follows from the basic properties of variance.
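
The point in this comment can be checked numerically. For k identically distributed variables with variance sigma^2 and pairwise correlation rho, the variance of their average is rho*sigma^2 + (1 - rho)*sigma^2/k, which reduces to sigma^2/k when rho = 0; no limit theorem is needed. The sketch below is a minimal simulation under stated assumptions: unit-variance Gaussian draws built from a shared component, and hypothetical function names that do not come from the video.

```python
import math
import random
import statistics


def theoretical_variance_of_mean(sigma2: float, k: int, rho: float) -> float:
    """Variance of the average of k equally correlated variables."""
    return rho * sigma2 + (1.0 - rho) * sigma2 / k


def simulated_variance_of_mean(k: int, rho: float, trials: int = 20000, seed: int = 0) -> float:
    """Estimate the variance of the average of k unit-variance draws
    whose pairwise correlation is rho (assumed construction, see below)."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        shared = rng.gauss(0.0, 1.0)  # component common to all k draws
        draws = [
            # sqrt(rho)*shared + sqrt(1-rho)*noise has variance 1 and correlation rho
            math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in range(k)
        ]
        means.append(sum(draws) / k)
    return statistics.pvariance(means)


if __name__ == "__main__":
    for rho in (0.0, 0.5):
        print(rho, theoretical_variance_of_mean(1.0, 10, rho), simulated_variance_of_mean(10, rho))
```

When rho is greater than 0, the first term does not shrink as k grows, which is one way to see why reducing the correlation between trees matters for variance reduction.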

  • @shilpamandal7232
    @shilpamandal7232 1 year ago

    Awesome video. Super helpful.

  • @emma_ding
    @emma_ding 1 year ago

    Many of you have asked me to share my presentation notes, and now… I have them for you! Download all the PDFs of my Notion pages at www.emmading.com/get-all-my-free-resources. Enjoy!

  • @tinbluu7653
    @tinbluu7653 1 year ago

    Love it!

  • @raghu_teja4683
    @raghu_teja4683 1 year ago +1

    Nice lecture. Can we get the resources you used? They would be very helpful.

  • @yungetong634
    @yungetong634 1 year ago

    great video!

    • @emma_ding
      @emma_ding 1 year ago

      Thanks for the kind comment, Yunge! 😊

  • @Doctor_monk
    @Doctor_monk 1 year ago +3

    Would you mind sharing the Notion page with us? Would really appreciate it. :)

    • @emma_ding
      @emma_ding 1 year ago +1

      Of course! I'm working on getting all notes organized and sharable in one location, will let you know as soon as they are ready! :)

  • @ayuumi7926
    @ayuumi7926 1 year ago

    A very helpful video on RF. Hi Emma, would you mind actually making a video on how to go about mastering new ML concepts from zero to hero?

    • @emma_ding
      @emma_ding 1 year ago

      Thanks for the suggestion, Ayuumi! I'll add it to my list of video ideas. 😊

  • @shawnkim6287
    @shawnkim6287 1 year ago

    Hi Emma. Thanks for the video. I have a question. I am not sure how this statement is true: "random forest constructs a large number of trees with random bootstrap samples from the training data". If sample size = replacement, we have all observations in every bootstrap sample. Then they are not random bootstrap samples. Can you please elaborate on what that line is saying?
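
On the question above: a bootstrap sample of size n is drawn with replacement, so it generally does not contain every observation. Some rows appear several times and others not at all, and the expected fraction of distinct rows is 1 - (1 - 1/n)^n, which approaches 1 - 1/e (about 0.632) for large n. The following is a minimal stdlib sketch of that behavior; the function names are hypothetical and not taken from the video.

```python
import random
from typing import List


def bootstrap_indices(n: int, rng: random.Random) -> List[int]:
    """Draw n row indices from range(n) with replacement."""
    return [rng.randrange(n) for _ in range(n)]


def unique_fraction(indices: List[int]) -> float:
    """Fraction of distinct rows in a bootstrap sample of the same size as the data."""
    return len(set(indices)) / len(indices)


if __name__ == "__main__":
    rng = random.Random(42)
    n = 10_000
    first = bootstrap_indices(n, rng)
    second = bootstrap_indices(n, rng)
    # Each sample covers only part of the data, and the two samples differ.
    print(unique_fraction(first), unique_fraction(second))
    print(first == second)
```

Because each tree sees a different resampled version of the training data, the samples are random even though each has the same size as the original data set.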

  • @AllieZhao
    @AllieZhao 1 year ago

    Very clear and well structured

    • @emma_ding
      @emma_ding 1 year ago

      Thanks, Allie! Glad you found it helpful. 😊

  • @davidskarbrevik
    @davidskarbrevik 1 year ago +1

    Can you clarify how the random feature subset selection happens "without replacement"? Is it that e.g. we have 20 features and tree 1 takes 10 features, tree 2 takes the remaining 10 features and now tree 3 can take 10 from the original 20?

    • @paoloesquivel7430
      @paoloesquivel7430 3 months ago

      No. It means any tree in the forest has no duplicate features.
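
The exchange above can be illustrated with a short sketch. "Without replacement" applies within a single draw: the selected subset contains no duplicate features. Every draw starts again from the full feature list, so subsets from different draws can overlap. The sketch assumes 20 hypothetical feature names and a subset size of 10, matching the numbers in the question. In common implementations such as scikit-learn, where the subset size is set by `max_features`, the subset is redrawn at each split rather than once per tree.

```python
import random
from typing import List, Sequence


def draw_feature_subset(features: Sequence[str], m: int, rng: random.Random) -> List[str]:
    """Pick m distinct features; random.sample draws without replacement."""
    return rng.sample(list(features), m)


if __name__ == "__main__":
    rng = random.Random(7)
    features = [f"x{i}" for i in range(20)]  # hypothetical feature names
    subset_a = draw_feature_subset(features, 10, rng)
    subset_b = draw_feature_subset(features, 10, rng)
    print(sorted(subset_a))
    print(sorted(subset_b))
    # The second draw is not restricted to the features the first one left out.
    print(sorted(set(subset_a) & set(subset_b)))
```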

  • @emmafan713
    @emmafan713 1 year ago

    Thanks!!!

  • @1386imran
    @1386imran 1 year ago +1

    What happens if the individual decision trees in an RF (n_estimators) produce conflicting outcomes, e.g. 50% of them vote for class A while the other 50% vote for class B?
    In this situation, what would be the final outcome?

    • @davidskarbrevik
      @davidskarbrevik 1 year ago +1

      Up to your logic at that point. But if that is a common occurrence in your model, perhaps try increasing the number of estimators.
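
A sketch of two ways the tie discussed above could be handled. The helper names and the tie-break rule (choose the label that sorts first) are assumptions made for illustration, not a fixed part of the algorithm. For comparison, scikit-learn's `RandomForestClassifier` is documented as predicting the class with the highest mean predicted probability across trees rather than counting hard votes, which makes exact ties less likely.

```python
from collections import Counter
from typing import Dict, List, Sequence


def hard_vote(predictions: Sequence[str]) -> str:
    """Majority vote; ties go to the label that sorts first
    (an arbitrary but deterministic rule chosen for this sketch)."""
    counts = Counter(predictions)
    top = max(counts.values())
    return min(label for label, count in counts.items() if count == top)


def soft_vote(probabilities: List[Dict[str, float]]) -> str:
    """Average per-tree class probabilities and return the class with the highest mean."""
    labels = sorted(probabilities[0])
    mean = {
        label: sum(p[label] for p in probabilities) / len(probabilities)
        for label in labels
    }
    top = max(mean.values())
    return min(label for label in labels if mean[label] == top)


if __name__ == "__main__":
    # Two trees say A and two say B: the hard vote needs the tie-break rule.
    print(hard_vote(["A", "B", "A", "B"]))
    # One confident tree and one unsure tree: averaging probabilities resolves the split.
    print(soft_vote([{"A": 0.9, "B": 0.1}, {"A": 0.4, "B": 0.6}]))
```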

  • @shubhamkaushik285
    @shubhamkaushik285 1 year ago

    Can we say that if an interviewer asks which algorithm can be used here and we don't know the answer, we can safely say random forest? 🤔😜