VC Dimension

In statistical learning theory, or sometimes computational learning theory, the VC dimension (for Vapnik–Chervonenkis dimension) is a measure of the capacity of a statistical classification algorithm, defined as the cardinality of the largest set of points that the algorithm can shatter. It is a core concept in Vapnik–Chervonenkis theory, and was originally defined by Vladimir Vapnik and Alexey Chervonenkis.

Informally, the capacity of a classification model is related to how complicated it can be. For example, consider the thresholding of a high-degree polynomial: if the polynomial evaluates above zero, that point is classified as positive, otherwise as negative. A high-degree polynomial can be wiggly, so it can fit a given set of training points well. But one can expect that the classifier will make errors on other points, because it is too wiggly. Such a polynomial has a high capacity. A much simpler alternative is to threshold a linear function. This function may not fit the training set well, because it has a low capacity. We make this notion of capacity more rigorous below.

Read more about VC Dimension:  Shattering, Uses

Famous quotes containing the word dimension:

    Le Corbusier was the sort of relentlessly rational intellectual that only France loves wholeheartedly, the logician who flies higher and higher in ever-decreasing circles until, with one last, utterly inevitable induction, he disappears up his own fundamental aperture and emerges in the fourth dimension as a needle-thin umber bird.
    Tom Wolfe (b. 1931)