Color Histogram - Drawbacks and Other Approaches

Drawbacks and Other Approaches

The main drawback of histograms for classification is that the representation is dependent of the color of the object being studied, ignoring its shape and texture. Color histograms can potentially be identical for two images with different object content which happens to share color information. Conversely, without spatial or shape information, similar objects of different color may be indistinguishable based solely on color histogram comparisons. There is no way to distinguish a red and white cup from a red and white plate. Put another way, histogram-based algorithms have no concept of a generic 'cup', and a model of a red and white cup is no use when given an otherwise identical blue and white cup. Another problem is that color histograms have high sensitivity to noisy interference such as lighting intensity changes and quantization errors. High dimensionality (bins) color histograms are also another issue. Some color histogram feature spaces often occupy more than one hundred dimensions.

Some of the proposed solutions have been color histogram intersection, color constant indexing, cumulative color histogram, quadratic distance, and color correlograms. Although there are drawbacks of using histograms for indexing and classification, using color in a real-time system has several advantages. One is that color information is faster to compute compared to other invariants. It has been shown in some cases that color can be an efficient method for identifying objects of known location and appearance.

Further research into the relationship between color histogram data to the physical properties of the objects in an image has shown they can represent not only object color and illumination but relate to surface roughness and image geometry and provide an improved estimate of illumination and object color.

Usually, Euclidean distance, histogram intersection, or cosine or quadratic distances are used for the calculation of image similarity ratings. Any of these values do not reflect the similarity rate of two images in itself; it is useful only when used in comparison to other similar values. This is the reason that all the practical implementations of content-based image retrieval must complete computation of all images from the database, and is the main disadvantage of these implementations.

Another approach to representative color image content is two-dimensional color histogram. A two-dimensional color histogram considers the relation between the pixel pair colors (not only the lighting component). A two-dimensional color histogram is a two-dimensional array. The size of each dimension is the number of colors that were used in the phase of color quantization. These arrays are treated as matrices, each element of which stores a normalized count of pixel pairs, with each color corresponding to the index of an element in each pixel neighborhood. For comparison of two-dimensional color histograms it is suggested calculating their correlation, because constructed as described above, is a random vector (in other words, a multi-dimensional random value). While creating a set of final images, the images should be arranged in decreasing order of the correlation coefficient.

The correlation coefficient may also be used for color histogram comparison. Retrieval results with correlation coefficient are better than with other metrics.

Read more about this topic:  Color Histogram

Famous quotes containing the words drawbacks and/or approaches:

    France has neither winter nor summer nor morals—apart from these drawbacks it is a fine country.
    Mark Twain [Samuel Langhorne Clemens] (1835–1910)

    A politician is a statesman who approaches every question with an open mouth.
    Adlai Stevenson (1900–1965)