Basic Procedure
- Formulate the problem - select the variables to which you wish to apply the clustering technique
- Select a distance measure - various ways of computing distance:
  - Squared Euclidean distance - the sum of the squared differences in value across all variables
  - Manhattan distance - the sum of the absolute differences in value across all variables
  - Chebyshev distance - the maximum absolute difference in value for any single variable
  - Mahalanobis distance - this measure accounts for the correlations between the variables by weighting differences with the inverse of their covariance matrix. It is an important measure because it is scale invariant, so variables recorded in different units can be combined (figuratively, it can compare apples to oranges)
- Select a clustering procedure (see below)
- Decide on the number of clusters
- Map and interpret the clusters and draw conclusions - illustrative techniques such as perceptual maps, icicle plots, and dendrograms are useful
- Assess reliability and validity - various methods:
  - repeat the analysis with a different distance measure
  - repeat the analysis with a different clustering technique
  - split the data randomly into two halves and analyze each half separately
  - repeat the analysis several times, deleting one variable each time
  - repeat the analysis several times, with the cases in a different order each time
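
To make the distance measures listed above concrete, here is a minimal Python sketch of the three simplest ones. The function names and the two example respondents (`customer_a`, `customer_b`, each rated on three variables) are hypothetical and chosen only for illustration. Mahalanobis distance is left out because it also requires the inverse covariance matrix of the variables.

```python
def squared_euclidean(a, b):
    """Sum of squared differences across all variables."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def manhattan(a, b):
    """Sum of absolute differences across all variables."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Largest absolute difference on any single variable."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical ratings of two respondents on three variables.
customer_a = [3, 7, 1]
customer_b = [6, 3, 1]

print(squared_euclidean(customer_a, customer_b))
print(manhattan(customer_a, customer_b))
print(chebyshev(customer_a, customer_b))
```

Because each measure weights large differences differently, the same pair of respondents can look closer or farther apart depending on the choice, which is one reason to repeat the analysis with a different distance measure when assessing reliability.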