Spectral Clustering - Algorithms

Algorithms

Given a set of data points A, the similarity matrix may be defined as a matrix, where represents a measure of the similarity between points .

One spectral clustering technique is the normalized cuts algorithm or Shi–Malik algorithm introduced by Jianbo Shi and Jitendra Malik, commonly used for image segmentation. It partitions points into two sets based on the eigenvector corresponding to the second-smallest eigenvalue of the normalized Laplacian matrix

of, where is the diagonal matrix

This partitioning may be done in various ways, such as by taking the median of the components in, and placing all points whose component in is greater than in, and the rest in . The algorithm can be used for hierarchical clustering by repeatedly partitioning the subsets in this fashion.

A related algorithm is the Meila–Shi algorithm, which takes the eigenvectors corresponding to the k largest eigenvalues of the matrix for some k, and then invokes another algorithm (e.g. k-means clustering) to cluster points by their respective k components in these eigenvectors.

An efficiency improvement of spectral clustering is the spectral neighborhood (SPAN) algorithm, which performs spectral clustering without explicitly computing the similarity matrix, and therefore dramatically improves the scalability of the standard spectral clustering algorithm.

Read more about this topic:  Spectral Clustering