Isolation Forests: Identify Outliers in Data

  • Published on 15 Dec 2024

Comments • 5

  • @Bentley642 8 months ago +1

    Great video, explained in a very intuitive way!

    • @elderresearch 8 months ago

      Glad you enjoyed the video!

  • @muslimahmukbang417 8 months ago +1

    how are you getting the numbers -0.05, 0.10 and so on?

    • @elderresearch 8 months ago

      Thanks for your question! Here's what Jericho had to say about how he got those numbers:
      Isolation forests use a large number of randomized attempts to separate the data and count how many cuts it takes in each attempt to separate each datapoint. From that collection of counts for each record, scores are calculated.
      Since this is not straightforward to show by hand, I used the scikit-learn Python package and the wine dataset to calculate the scores, limiting the wine dataset to flavonoids and malic acid features. Then I took some example points from the outer edges and one from the middle of the real results and illustrated them as closely as possible in the whiteboard example.
      ---
      Here are the links to the scikit-learn and Python resources:
      scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html
      archive.ics.uci.edu/dataset/109/wine
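
For anyone who wants to try this, here is a minimal sketch of the workflow described above. It is not Jericho's actual code. It assumes the copy of the wine dataset bundled with scikit-learn (`load_wine`, where the flavonoid column is spelled `flavanoids`), default `IsolationForest` settings with an arbitrary `random_state`, and that the scores in the video come from `decision_function`, where lower values mean a point was isolated in fewer cuts. The exact values you get will depend on the seed and library version.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import IsolationForest

# Load the wine data and keep only the two features mentioned in the reply.
wine = load_wine()
names = list(wine.feature_names)
cols = [names.index("flavanoids"), names.index("malic_acid")]
X = wine.data[:, cols]

# Build many random trees; points that get isolated in few cuts score low.
# random_state=0 is an arbitrary choice made here for repeatability.
forest = IsolationForest(n_estimators=100, random_state=0)
forest.fit(X)

# Assumption: decision_function is the score shown in the video.
# Negative values fall on the outlier side of the default threshold.
scores = forest.decision_function(X)

# Show the most isolated point and the most typical point.
order = np.argsort(scores)
print("most anomalous:", X[order[0]], round(float(scores[order[0]]), 2))
print("most typical:  ", X[order[-1]], round(float(scores[order[-1]]), 2))
```

Points on the outer edge of the scatter plot should end up near the low end of `scores`, and points in the dense middle near the high end.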

  • @elhairachmohamedlimam9640 1 year ago

    Thank you a lot go ahead!