Machine Learning

Unsupervised Learning / K-Medoids Clustering

3. K-Medoids Clustering

K-Medoids (also known as PAM — Partitioning Around Medoids) is a clustering algorithm similar to K-Means. However, instead of computing the mathematical mean as the cluster center, it uses an ACTUAL DATA POINT from the dataset as the center, known as a medoid.

The PAM Algorithm

INITIALIZATION: Randomly select k actual data points as initial medoids: m₁, m₂, ..., mₖ
ASSIGNMENT: Assign each data point to the nearest medoid based on a distance metric (e.g., Manhattan distance).
SWAP STEP: For each medoid mᵢ and each non-medoid point xⱼ:
- Compute the total cost (sum of distances) of swapping mᵢ with xⱼ.
- If the new total cost is less than the current cost, perform the swap (xⱼ becomes the new medoid).
CONVERGENCE: Repeat Steps 2 and 3 until no swap reduces the total cost.

Worked Example (k=2)

Dataset: 1D points {1, 2, 4, 5, 6, 7}

Initial Medoids: Select m1=1, m2=6.
Assignment:
- Cluster 1 (closest to 1): {1, 2}
- Cluster 2 (closest to 6): {4, 5, 6, 7}
Current Cost (Absolute Difference):
|1-1| + |2-1| + |4-6| + |5-6| + |6-6| + |7-6| = 0 + 1 + 2 + 1 + 0 + 1 = 5
Try Swap (m1=1 with point 4):
- New Medoids: m1=4, m2=6
- Clusters: Cluster 1={1, 2, 4}, Cluster 2={5, 6, 7}
- New Cost: |1-4| + |2-4| + |4-4| + |5-6| + |6-6| + |7-6| = 3 + 2 + 0 + 1 + 0 + 1 = 7 (Worse!)
Conclusion: No swap improves the cost of 5. The algorithm converges with final clusters: {1, 2} and {4, 5, 6, 7}.

When to Use K-Medoids vs K-Means?

K-Medoids is significantly more robust to outliers. Because K-Means uses the mathematical average, an extreme outlier (e.g., an income of ₹10 crore) will drag the centroid far away from the true cluster center. K-Medoids is forced to pick an actual data point, completely mitigating this effect. However, K-Medoids is computationally more expensive (O(k(n-k)²)) than K-Means.

K-Means Clustering Hierarchical Clustering