Machine Learning
Unsupervised Learning / Hierarchical Clustering
4. Hierarchical Clustering
Hierarchical Clustering creates a tree-like hierarchy of clusters called a Dendrogram. Unlike K-Means, you do not need to specify the number of clusters (k) in advance; you can cut the dendrogram at an appropriate height later to define your clusters.
A. Agglomerative (Bottom-Up)
- Start: n clusters (each point is its own cluster).
- Compute the pairwise distance matrix.
- Merge: Find the two closest clusters and merge them into a single new cluster.
- Update: Recompute the distances between the new cluster and all remaining clusters using a Linkage Method.
- Repeat: Steps 3 and 4 until all points merge into one single master cluster.
B. Divisive (Top-Down / DIANA)
- Start: 1 cluster containing ALL points.
- Split: Find the point with the greatest average dissimilarity to other cluster members. This point starts a new sub-cluster.
- Reassign points to the sub-cluster they are most similar to.
- Repeat: Until each point is in its own individual cluster (n clusters).
Linkage Methods (Measuring Inter-Cluster Distance)
1. Single Linkage (MIN)
Distance between the NEAREST pair of points (one from each cluster).
Creates long, chain-like clusters ("chaining effect"). Highly sensitive to noise/outliers.
2. Complete Linkage (MAX)
Distance between the FARTHEST pair of points.
Creates compact, ball-shaped clusters of roughly equal diameter. Good for well-separated data.
3. Average Linkage
The average of all pairwise distances between points in cluster A and cluster B.
4. Ward's Method
Merges the two clusters that result in the minimum increase in the total within-cluster variance.