Image Segmentation - K-Means Clustering

5 minute read

Published:

Contents

  • K-means (pixel-wise) segmentation
  • Chan-Vese (region-wise) segmentation

K-Means Segmentation

K-Means is a well-known algorithm for clustering, most often applied to tabular data in unsupervised learning. However, an interesting application is pixel-wise Image Segmentation.

The method involves dividing the image into \(k\) clusters based on pixel intensity values. Each pixel is assigned a label corresponding to its cluster. This process iteratively minimizes the within-cluster variance.

Mathematical Explanation

K-Means clustering seeks to minimize the following objective function: \(J = \sum_{i=1}^{k} \sum_{x \in C_i} \ \vert x - \mu_i\ \vert ^2,\) where:

  • \(k\) is the number of clusters,
  • \(C_i\) is the set of pixels in the \(i\)-th cluster,
  • \(\mu_i\) is the mean intensity value of the \(i\)-th cluster,
  • \(\ \vert x - \mu_i\ \vert ^2\) represents the squared Euclidean distance between a pixel \(x\) and the cluster mean.

The algorithm alternates between two steps:

  1. Assignment Step: Assign each pixel to the nearest cluster mean: \(c(x) = \arg\min_{i} \ \vert x - \mu_i\ \vert ^2.\)
  2. Update Step: Recompute the cluster means based on the new assignments: \(\mu_i = \frac{1}{ \vert C_i \vert } \sum_{x \in C_i} x.\) The algorithm terminates when cluster assignments no longer change.

Implementation - K-Means Segmentation

from __future__ import annotations
import os
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import cv2
from sklearn.datasets import load_sample_image
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_sample_image

def k_means_clustering(f, k):
  if k is None:
    k = np.random.randint(0, 10)

  rows, cols = f.shape # image shape
  labels = np.random.randint(0, k, (rows, cols))  # Random initial labels
  cluster_centers = np.zeros(k)

  while True:
    # compute cluster centers
    for i in range(k):
      cluster_centers[i] = np.mean(f[labels == i])

    # assign pixels to the closest cluster center
    new_labels = np.zeros((rows, cols), dtype=int)
    for i in range(rows):
      for j in range(cols):
        new_labels[i, j] = np.argmin(np.abs(cluster_centers - f[i, j]))

    # check for convergence
    if np.array_equal(new_labels, labels):
      break

    labels = new_labels

  return labels

def run_kmeans_segmentation(image, k=3):
  grayscale_image = np.mean(image, axis=2)  # Convert to grayscale

  labels = k_means_clustering(grayscale_image, k)

  # Display the original image and the segmented image
  plt.figure(figsize=(10, 5))
  plt.subplot(1, 2, 1)
  plt.imshow(grayscale_image, cmap='gray')
  plt.title('Original Image')

  plt.subplot(1, 2, 2)
  plt.imshow(labels, cmap='gray')
  plt.title('K-means Segmented Image')
  plt.show()

Ensemble K-Means Segmentation

In cases where K-Means clustering may be sensitive to initialization, an ensemble method can stabilize results. The ensemble approach runs the K-Means algorithm multiple times and combines the results using majority voting for each pixel.

Implementation

def k_means_ensemble(f, k, repetitions):
    # Input:
    # f - grayscale image matrix
    # k - number of clusters
    # repetitions - number of k-means runs
    # Output:
    # labels - matrix with the same size as f containing the combined cluster labels

    # Initialize variables
    rows, cols = f.shape
    votes = np.zeros((rows, cols, repetitions), dtype=int)

    # Run k-means clustering multiple times
    for r in range(1, repetitions+1):
        print('repetion: {}/{} ({:.3f}%)'.format(r, repetitions,
                                               r*100/repetitions))
        votes[:, :, r-1] = k_means_clustering(f, k)

    # Combine the results using majority voting
    labels = np.zeros((rows, cols), dtype=int)
    for i in range(rows):
        for j in range(cols):
            labels[i, j] = np.bincount(votes[i, j, :]).argmax()  # Majority voting

    return labels

def run_kmeans_ensemble(image,
                        k=3,
                        repetitions = 10):
  grayscale_image = np.mean(image, axis=2)  # Convert to grayscale

  ensemble_labels = k_means_ensemble(grayscale_image, k, repetitions)

  # Display the ensemble segmented image
  plt.figure()
  plt.imshow(ensemble_labels, cmap='gray')
  plt.title('Ensemble K-means Segmented Image')
  plt.show()

Chan-Vese Segmentation

A popular region-based segmentation method is the Chan-Vese model, which seeks to partition the image domain \(\Omega \subset \mathbb{R}^d\) into regions of approximately constant intensity. The domain is encoded by an indicator function \(\chi\) by approximating the given image f with constant functions \(c_1, c_2 \in \mathbb{R}\), i.e.,

\begin{split} \min_{c_1,c_2, \chi \in BV(\Omega, {0,1})} J(c_1,c_2, \chi)= & \frac{\lambda_1}{2} \int_{\Omega} \chi \vert c_1-f(x) \vert ^2dx
& +\frac{\lambda_2}{2} \int_{\Omega} (1 - \chi) \vert c_2-f(x) \vert ^2dx & +\beta \vert \chi \vert _{BV} \end{split}

for which \(\chi: \Omega \to \{0, 1\}\) is the indicator function given by: \(\chi(x) = \begin{cases}1 & \text{if } x \in \Omega_1 \\ 0 & \text{, else} \end{cases}\)

As the model in (1) is non-convex, we use an alternating minimization scheme for which one of the subproblems is a minimal surface problem, i.e.,

\[\min_{c_1,c_2, \chi \in BV(\Omega, \{0,1\})} J(c_1,c_2, \chi) = \int_{\Omega} \chi(x) \cdot \frac{1}{2}(\lambda_1 \vert c_1-f(x) \vert ^2 - \lambda_2 \vert c_2-f(x) \vert ^2 ) dx +\beta \vert \chi \vert _{BV}\]

And note that \(g(x):=\frac{1}{2}( \lambda_1 \vert c_1-f(x) \vert ^2 - \lambda_2 \vert c_2-f(x) \vert ^2)dx\).

Note that this segmentation problem can be solved uniquely via a direct relationship to the ROF denoising model:

\[u^* = \arg\min_{u \in \Omega} J(u)\]

with \(J(u) = \frac{1}{2} \ \vert u-g\ \vert ^2_{L^2(\Omega)} + \beta \vert u \vert _{BV}\)

Subsequently, we perform thresholding on \(u^∗\) at \(t = 0\) to obtain the solution \(\chi\). This convexified version of the Chan-Vese model can be implemented as follows.

Implementation

import numpy as np
import matplotlib.pyplot as plt
from skimage import color, data, img_as_float
from skimage.restoration import denoise_tv_chambolle

def chan_vese(f, lambda1, lambda2, beta, max_iter=100, tol=1e-3):
  # initialize chi with a checkerboard pattern
  chi = np.zeros_like(f, dtype=np.bool_)
  chi[::2, ::2] = 1
  chi[1::2, 1::2] = 1

  for _ in range(max_iter):
    # update c1 and c2
    c1 = np.mean(f[chi])
    c2 = np.mean(f[~chi])

    # compute the auxiliary function g
    g = 0.5 * (lambda2 * (c2 - f)**2 - lambda1 * (c1 - f)**2)

    # solve the ROF denoising problem (using total variation denoising)
    u_star = denoise_tv_chambolle(g, weight=beta, multichannel=False)

    # update chi
    chi_new = u_star >= 0

    # check for convergence
    if np.linalg.norm(chi_new.astype(int) - chi.astype(int)) < tol:
      break

    chi = chi_new

    return chi

def run_chan_vese(image, lambda1=1, lambda2=1, beta=0.1):
  image = color.rgb2gray(image)
  image = img_as_float(image)

  chi = chan_vese(image, lambda1, lambda2, beta)

  # Display the original image and the segmented image
  plt.figure(figsize=(10, 5))
  plt.subplot(1, 2, 1)
  plt.imshow(image, cmap='gray')
  plt.title('Original Image')

  plt.subplot(1, 2, 2)
  plt.imshow(chi, cmap='gray')
  plt.title('Chan-Vese Segmented Image')
  plt.show()

run_chan_vese(data.astronaut())