Hierarchical Cluster Analysis [Simply explained]
- Published on 23 May 2024
- What is Hierarchical Cluster Analysis? And how is it calculated?
Hierarchical cluster analysis is a clustering method that builds a hierarchical tree (a dendrogram) of the objects to be clustered. The tree represents the relationships between the objects and shows how they are grouped into clusters at different levels.
► Load sample data set
datatab.net/statistics-calcul...
► Online Calculator Hierarchical Cluster Analysis
datatab.net/statistics-calcul...
► Hierarchical Cluster Analysis Tutorial
datatab.net/tutorial/hierarch...
► E-BOOK
datatab.net/statistics-book
00:00 What is Hierarchical Cluster Analysis?
00:31 Example of Hierarchical Cluster Analysis
00:50 Calculate hierarchical cluster analysis
06:32 Calculate hierarchical cluster analysis online
I loved learning about "Heyrakikal" clustering
You just made my evening with your simple explanation.
Glad it was helpful and many thanks for your feedback! Regards Hannah
I found it very understandable and simple. thanks a lot!
Beautifully explained, thanks! 🙏 Incredibly clear.
well explained thank you so much
thank you so much. you clarified a lot!!!!
😀
thank you so much, you have explained it so well
Glad it was helpful!
Great video, thank you!!!
My pleasure!
nicely explained
Great content. I'm a fan :)
Glad it was helpful and many thanks for your nice feedback! Regards Hannah
youre kinda cute
Please try to avoid mistakes. Thanks to the comment section, otherwise I would have found this very difficult to comprehend. That sqrt of 17 part is terrible. But you did well, and this video is good too.
Hi, thanks for your feedback! We try to avoid mistakes, sorry for that and for the resulting trouble! Regards, Hannah
well, that's because it's the sqrt of 10 not sqrt of 17. The mistake was using 4 instead of 3
How do you name the clusters? Just from left to right, so cluster 1, cluster 2, cluster 3. Or are there more methods to name a cluster?
Real good!
How can the square root of 17 (16 + 1) be equal to 3.162? It should be 4.123, shouldn't it?
The error is that the x distance is 3 (from 1 to 4) not 4, so it’s the sq root of 10.
I would like to make YouTube tutorials like this. Do you have recommendations on what software to use?
DATAtab : )
@DATAtab where can you learn more about it?
The horizontal component of the Euclidean distance at 2:17 should be 3, not 4, since 4 - 1 = 3. For the same reason, the Manhattan distance should be 4 and the maximum distance should be 3.
I agree they are wrong, but shouldn't it be square root of 17, which is 4.12?
@@playbros332 Because you go 3 steps to the right and 1 up, so sqrt(3^2 + 1^2) = sqrt(10).
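The correction discussed in this thread can be checked with a few lines of Python. This is only a sketch: the x-values 1 and 4 come from the comments, while the y-values are assumed here so that the vertical difference is 1.

```python
import math

# Hypothetical points: only the x-values (1 and 4) come from the comments;
# the y-values are assumptions chosen so that the vertical difference is 1.
a = (1, 1)
b = (4, 2)

dx = abs(b[0] - a[0])  # horizontal difference: 4 - 1 = 3
dy = abs(b[1] - a[1])  # vertical difference: 1

euclidean = math.sqrt(dx**2 + dy**2)  # sqrt(3^2 + 1^2) = sqrt(10)
manhattan = dx + dy                   # sum of absolute differences
maximum = max(dx, dy)                 # largest absolute difference

print(round(euclidean, 3), manhattan, maximum)  # 3.162 4 3
```

With a horizontal difference of 4 instead of 3, the Euclidean distance would be sqrt(17), roughly 4.123, which is where the confusion in the comments comes from.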
How do you calculate the distances between Lisa and Joe and the others? You have a group of positions, not just one... how do you do that? Thanks!
Hi, in this case you would first calculate the center between Lisa and Joe and then the distance from this center to one other person. Regards, Hannah
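The step described in this reply (distance from the center of a merged cluster to another point) might look like the sketch below. All coordinates are made up for illustration, and the helper name `centroid` is an assumption, not something from the video.

```python
import math

def centroid(points):
    """Return the coordinate-wise mean of a list of points."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dims))

# Hypothetical coordinates; the real values from the video are not given here.
lisa = (1.0, 1.0)
joe = (3.0, 1.0)
other = (2.0, 5.0)

center = centroid([lisa, joe])          # center of the Lisa/Joe cluster
distance = math.dist(center, other)     # Euclidean distance to the other person

print(center, distance)  # (2.0, 1.0) 4.0
```

Measuring from the cluster center is one possible linkage rule. Other common choices compare the closest pair (single linkage), the farthest pair (complete linkage), or the average over all pairs (average linkage).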
I would like to ask, is Hierarchical Cluster Analysis always associated with the Euclidean Distance? Thank you
Hi, many thanks for your question! Hierarchical Cluster Analysis (HCA) is not always associated with the Euclidean distance. While Euclidean distance is commonly used, HCA can work with various distance metrics depending on the nature of the data and the analysis goals.
Here are some common distance metrics used in HCA:
- Euclidean Distance: This is the straight-line distance between two points in a multi-dimensional space. It's one of the simplest and most widely used distance metrics.
- Manhattan Distance (also known as City Block or L1 distance): This is the sum of absolute differences between coordinates. It can be suitable when diagonal movement isn't meaningful.
- Cosine Distance: This is based on cosine similarity, the cosine of the angle between two vectors, and is usually taken as 1 minus that similarity. It is commonly used in text analysis and other contexts where vector magnitude might vary.
- Mahalanobis Distance: It accounts for correlations in data by incorporating the covariance matrix, making it suitable for data with different scales and correlations among variables.
- Minkowski Distance: A generalization of Euclidean and Manhattan distances, with a parameter 'p' to control the degree of the norm.
- Correlation-based Distance: This distance uses the correlation between data points rather than absolute differences. It's common in gene expression analysis or other contexts where relationships between variables matter more than absolute values.
I hope this was helpful : ) Regards Hannah
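Some of the metrics listed above can be sketched in plain Python. This is an illustrative sketch only: the function names and the example vectors are assumptions, and Mahalanobis distance is left out because it needs a covariance matrix estimated from a full data set.

```python
import math

def minkowski(u, v, p):
    """Minkowski distance; p=1 gives Manhattan, p=2 gives Euclidean."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1 / p)

def cosine_distance(u, v):
    """1 minus the cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1 - dot / (norm_u * norm_v)

def correlation_distance(u, v):
    """1 minus the Pearson correlation: cosine distance of the centered vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine_distance([a - mean_u for a in u], [b - mean_v for b in v])

# Hypothetical example vectors: v is exactly twice u.
u = (1.0, 2.0, 3.0)
v = (2.0, 4.0, 6.0)

manhattan = minkowski(u, v, 1)
euclidean = minkowski(u, v, 2)
cosine = cosine_distance(u, v)
correlation = correlation_distance(u, v)
print(manhattan, euclidean, cosine, correlation)
```

Because `v` points in the same direction as `u`, the cosine and correlation distances come out at (approximately) zero even though the Manhattan and Euclidean distances do not. That is the sense in which these metrics ignore magnitude.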
Nice video!
I want to know the name of algorithm that you have used here to explain hierarchical clustering.
I want to know too, but it is highly probable that she will not tell us. So the statistics say.
@@Nothingimportant1 AGNES
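AGNES stands for agglomerative nesting: every point starts as its own cluster, and the two closest clusters are merged repeatedly until one remains. A minimal sketch of that idea follows. It assumes clusters are compared by the distance between their centers, uses made-up points, and the function names are hypothetical. It is not necessarily the exact procedure from the video.

```python
import math
from itertools import combinations

def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(p[i] for p in cluster) / n for i in range(len(cluster[0])))

def cluster_distance(c1, c2):
    """Assumed linkage rule: Euclidean distance between cluster centers."""
    return math.dist(centroid(c1), centroid(c2))

def agglomerate(points):
    """Merge the two closest clusters until one remains; return the merge history."""
    clusters = [[p] for p in points]  # every point starts as its own cluster
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        d = cluster_distance(clusters[i], clusters[j])
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical data: two tight pairs that are far apart from each other.
points = [(1.0, 1.0), (2.0, 1.0), (5.0, 4.0), (6.0, 5.0)]
merges = agglomerate(points)
for left, right, d in merges:
    print(left, "+", right, "at distance", round(d, 3))
```

The merge distances recorded in `merges` are what a dendrogram draws as the heights at which branches join.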
Hi, where can I find the elbow method?
Oh sorry, it will be there soon!!!
i love your accent
: )
klaaastarrrrss
4-1=3 though!
: )
Claaaastars 😂
Excellent explanation. Why does it take so long to create a new video?
Good question! : ) We need almost two weeks to prepare the topic and to create the slides! Regards Hannah
@@datatab i hope it will be fast :)
It should be "and", not "und", at 3:15.
Thanks : )
I think you have a mistake in the calculation.