Gene Chip Analysis - Hierarchical Clustering

Hierarchical Clustering

Hierarchical clustering is a statistical method for grouping genes into relatively homogeneous clusters. It consists of two phases. First, a distance matrix containing all pairwise distances between the genes is calculated. Pearson’s and Spearman’s correlation are often used as the basis for the dissimilarity estimate (typically as one minus the correlation coefficient), but other measures, such as Manhattan or Euclidean distance, can also be applied. If the genes on a single chip are to be clustered, Euclidean distance is the appropriate choice, since at least two chips are needed to calculate any correlation measure. Second, the hierarchical clustering algorithm either (A) iteratively joins the two closest clusters, starting from single data points (agglomerative, bottom-up approach), or (B) iteratively partitions clusters, starting from the complete set (divisive, top-down approach). After each step, the distances between the newly formed clusters and the other clusters are recalculated. How the distance between two clusters is defined depends on the linkage method; common choices include:

  • Single linkage (minimum method, nearest neighbor)
  • Complete linkage (maximum method, furthest neighbor)
  • Average linkage (UPGMA).
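
The two phases described above (distance matrix, then iterative agglomerative merging) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (`distance_matrix`, `agglomerate`, `pearson_distance`) and the expression values are hypothetical, the Pearson-based dissimilarity is taken as one minus the correlation coefficient, and the naive search over all cluster pairs is only practical for small gene sets.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_distance(a, b):
    """Dissimilarity defined here as 1 - Pearson correlation (an assumption).

    Needs at least two chips per gene and non-constant profiles,
    otherwise the variance term is zero and the division fails.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)


def distance_matrix(profiles, metric):
    """Phase 1: symmetric matrix of all pairwise distances between genes."""
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = metric(profiles[i], profiles[j])
    return dist


# Each linkage method reduces the member-to-member distances of two
# clusters to a single cluster-to-cluster distance.
LINKAGE = {
    "single": min,                              # nearest neighbor
    "complete": max,                            # furthest neighbor
    "average": lambda v: sum(v) / len(v),       # UPGMA
}


def agglomerate(dist, linkage="average"):
    """Phase 2 (bottom-up): repeatedly merge the two closest clusters.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of gene indices.
    """
    combine = LINKAGE[linkage]
    clusters = [(i,) for i in range(len(dist))]
    merges = []

    def cluster_distance(c1, c2):
        return combine([dist[i][j] for i in c1 for j in c2])

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        merges.append((a, b, cluster_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical expression values: four genes measured on three chips.
profiles = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 2.9],
    [5.0, 5.0, 5.2],
    [5.1, 4.9, 5.0],
]

for a, b, d in agglomerate(distance_matrix(profiles, euclidean), "single"):
    print(f"merge {a} + {b} at distance {d:.3f}")
```

Passing `"complete"` or `"average"` instead of `"single"` changes only how the distance between two clusters is derived from their members, which is the difference between the linkage methods listed above. Passing `pearson_distance` instead of `euclidean` switches the dissimilarity measure without touching the merging step.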
