Machine Learning

Unsupervised Learning / K-Medoids Clustering


3. K-Medoids Clustering

K-Medoids (also known as PAM — Partitioning Around Medoids) is a clustering algorithm similar to K-Means. However, instead of computing the mathematical mean as the cluster center, it uses an ACTUAL DATA POINT from the dataset as the center, known as a medoid.

The PAM Algorithm

  1. INITIALIZATION: Randomly select k actual data points as initial medoids: m₁, m₂, ..., mₖ
  2. ASSIGNMENT: Assign each data point to the nearest medoid based on a distance metric (e.g., Manhattan distance).
  3. SWAP STEP: For each medoid mᵢ and each non-medoid point xⱼ:
    • Compute the total cost (sum of distances) of swapping mᵢ with xⱼ.
    • If the new total cost is less than the current cost, perform the swap (xⱼ becomes the new medoid).
  4. CONVERGENCE: Repeat Steps 2 and 3 until no swap reduces the total cost.

Worked Example (k=2)

Dataset: 1D points {1, 2, 4, 5, 6, 7}

  1. Initial Medoids: Select m1=1, m2=6.
  2. Assignment:
    • Cluster 1 (closest to 1): {1, 2}
    • Cluster 2 (closest to 6): {4, 5, 6, 7}
  3. Current Cost (Absolute Difference):
    |1-1| + |2-1| + |4-6| + |5-6| + |6-6| + |7-6| = 0 + 1 + 2 + 1 + 0 + 1 = 5
  4. Try Swap (m1=1 with point 4):
    • New Medoids: m1=4, m2=6
    • Clusters: Cluster 1={1, 2, 4}, Cluster 2={5, 6, 7}
    • New Cost: |1-4| + |2-4| + |4-4| + |5-6| + |6-6| + |7-6| = 3 + 2 + 0 + 1 + 0 + 1 = 7 (Worse!)
  5. Conclusion: No swap improves the cost of 5. The algorithm converges with final clusters: {1, 2} and {4, 5, 6, 7}.

When to Use K-Medoids vs K-Means?

K-Medoids is significantly more robust to outliers. Because K-Means uses the mathematical average, an extreme outlier (e.g., an income of ₹10 crore) will drag the centroid far away from the true cluster center. K-Medoids is forced to pick an actual data point, completely mitigating this effect. However, K-Medoids is computationally more expensive (O(k(n-k)²)) than K-Means.