Presentation Skills: 100000/10
Presentation Skills: 10/10
+ 1000 aura
Nice presentation, I see 200% confidence and eloquence
this is exactly what I have been looking for! great presentation.
Wow, what a great talk! Love the intuitive explanations and visuals. Super helpful. Thank you!
Absolutely fantastic presentation, thank you
He truly is a wonderful presenter; I would love to listen to him give other presentations.
Wow I love the enthusiasm! It really makes it so much nicer to watch. Very insightful as well thank you very much!
Love the presentation. Great work!
Thank you so much. It was exactly what I was looking for 🎉🎉
A very impressive presentation and algorithm! Thank you for teaching all this!
Awesome presentation.
thanks a lot, I learned a lot from this presentation
Sorry, had to comment because of the kiiiiiiick ass animation! Brilliant.
what an amazing speaker!
Thank you for your presentation. It was very helpful. I'm not sure about the claim that k-means requires small amounts of data. I believe k-means is O(n) (assuming a small number of dimensions and iterations), and I have used it on very large data sets without problems.
I would also like to respectfully push back on the spherical cow comment. While it certainly depends on the domain, in social science and business applications with large, noisy data sets, the spherical, or at least elliptical, assumption often works very well and produces better results than the more nonparametric algorithms. It's easy to construct mathematical examples with odd-shaped clusters, but I've not encountered them in practice, although that could just be due to the domains I work in.
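A minimal sketch (my own, not from the talk) of the scaling point above: each Lloyd iteration of k-means costs roughly O(n · k · d), so with k, d, and the iteration budget held fixed, fit time should grow roughly linearly in the number of samples n.

```python
# Hedged sketch, not from the presentation: timing k-means as n grows.
# With fixed k, d, and iteration budget, each Lloyd iteration is O(n * k * d),
# so a 10x larger sample should take roughly 10x longer, not 100x.
import time
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

for n in (100_000, 1_000_000):
    X, _ = make_blobs(n_samples=n, centers=10, n_features=10, random_state=0)
    start = time.perf_counter()
    KMeans(n_clusters=10, n_init=1, max_iter=100, random_state=0).fit(X)
    elapsed = time.perf_counter() - start
    print(f"n = {n:>9,}: fit in {elapsed:.2f} s")
    # exact ratios vary with how many iterations each run actually needs
```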
👀
great talk
that was a great talk!
15:30 there might be a misprint in the formula: d(X_i, X_j), not d(X_j, X_j)
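For reference, and assuming the formula at that timestamp is HDBSCAN's mutual reachability distance (a guess on my part), the corrected version with the commenter's fix applied would read:

$$
d_{\mathrm{mreach}\text{-}k}(X_i, X_j) = \max\bigl\{\operatorname{core}_k(X_i),\ \operatorname{core}_k(X_j),\ d(X_i, X_j)\bigr\}
$$

i.e. the last term is the distance between the two distinct points X_i and X_j, not from a point to itself.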
Amazing
The coloring of the tree at 14:00 is needlessly confusing. See Figure 3a in the McInnes & Healy (2017) paper to clarify things.
Thank you for the super interesting talk! I was wondering if you have worked with the new HDBSCAN integrated in sklearn 1.3.0? Is it possible to draw the cluster tree with this implementation?
Any luck?
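A minimal sketch, assuming scikit-learn >= 1.3.0: to my knowledge the built-in sklearn.cluster.HDBSCAN exposes labels and membership probabilities but not the condensed-tree plotting, which is still provided by the standalone hdbscan package.

```python
# Hedged sketch: sklearn's HDBSCAN (new in 1.3.0) vs. the standalone package,
# which is still the one that can draw the condensed cluster tree.
from sklearn.datasets import make_blobs
from sklearn.cluster import HDBSCAN   # added in scikit-learn 1.3.0
import hdbscan                        # standalone package (pip install hdbscan)

X, _ = make_blobs(n_samples=2000, centers=5, random_state=0)

# scikit-learn implementation: cluster labels and membership probabilities.
sk = HDBSCAN(min_cluster_size=25).fit(X)
print(sk.labels_[:10], sk.probabilities_[:10])

# Standalone implementation: same algorithm, plus the condensed cluster tree
# plot (requires matplotlib).
clusterer = hdbscan.HDBSCAN(min_cluster_size=25).fit(X)
clusterer.condensed_tree_.plot(select_clusters=True)
```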
Can someone tell me his LinkedIn or his full name, or how to connect with him?
0:24 name and email
Any idea why the GPU version of this method can't take a pre-computed distance matrix?
There is a RAPIDS version of HDBSCAN. I'm personally struggling to get the dependencies working together, but it does exist.
@scatteredvideos1 I think that's what I used... Anyway, I'll give it another go.
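A hedged sketch of the GPU route mentioned above, assuming a working RAPIDS/cuML install; as far as I know cuML's HDBSCAN takes raw feature vectors with a Euclidean metric and does not accept a precomputed distance matrix, which may be the limitation noted above.

```python
# Hedged sketch: RAPIDS cuML HDBSCAN on the GPU (assumes cuML and CuPy installed).
# It operates on raw feature vectors; a precomputed distance matrix is,
# to my knowledge, not supported.
import cupy as cp
from cuml.cluster import HDBSCAN

X = cp.random.rand(100_000, 10).astype(cp.float32)  # placeholder feature matrix

clusterer = HDBSCAN(min_cluster_size=50).fit(X)
print(clusterer.labels_[:10])
```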
To be honest, the speed-up really isn't even that great; it's only partially parallelized on GPUs. It's better just to reduce the dimensionality of your data: PCA to 95% of explained variance, then UMAP down to 10 or so dimensions, then cluster using HDBSCAN. I've found that a grid search over a bunch of different HDBSCAN parameters can be helpful if you aren't getting good clustering.
With 10 UMAP dims and 184k data points, my clustering is done in about 7 s on a Google Colab high-RAM CPU instance.
@scatteredvideos1 I haven't tried GPU-accelerated HDBSCAN, but for other clustering algorithms the difference between CPU and GPU is night and day (so I was expecting it to be so here). I'm clustering embedding data from LLMs, so it's extremely dense and uncorrelated, and PCA hasn't been much use (at least in my hands).
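A minimal sketch of the pipeline described in this thread (my reading of it, not the presenter's code): PCA to 95% explained variance, UMAP down to ~10 dimensions, then HDBSCAN, with a small grid search over HDBSCAN parameters. All numbers are placeholders.

```python
# Hedged sketch of a PCA -> UMAP -> HDBSCAN pipeline with a small parameter sweep.
# Requires scikit-learn, umap-learn, and hdbscan; X stands in for embedding vectors.
import numpy as np
import umap        # umap-learn package
import hdbscan
from sklearn.decomposition import PCA

X = np.random.rand(5000, 384)  # placeholder for real embedding data

X_pca = PCA(n_components=0.95).fit_transform(X)            # keep 95% of variance
X_umap = umap.UMAP(n_components=10).fit_transform(X_pca)   # ~10 dims for clustering

# Simple grid search over two HDBSCAN parameters; inspect cluster counts and
# noise fraction to pick a setting that suits the data.
for min_cluster_size in (25, 50, 100):
    for min_samples in (5, 10, 25):
        labels = hdbscan.HDBSCAN(min_cluster_size=min_cluster_size,
                                 min_samples=min_samples).fit_predict(X_umap)
        n_clusters = labels.max() + 1
        noise_frac = np.mean(labels == -1)
        print(f"mcs={min_cluster_size:>3} ms={min_samples:>2} "
              f"clusters={n_clusters:>3} noise={noise_frac:.1%}")
```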
27:50 Installation
Clustering is highly driven by the structure of how the data relates to itself, and is nearly impossible to accomplish well with a single approach.
Agree, but in practical terms, where do you start?
@RoulDukeGonzo An intimate descriptive knowledge of the data is recommended.
I don't know why he's talking so fast! Is someone after him and he needs to run away?!