Support Vector Machine - Nonlinear Classification

Nonlinear Classification

The original optimal hyperplane algorithm proposed by Vapnik in 1963 was a linear classifier. However, in 1992, Bernhard E. Boser, Isabelle M. Guyon and Vladimir N. Vapnik suggested a way to create nonlinear classifiers by applying the kernel trick (originally proposed by Aizerman et al.) to maximum-margin hyperplanes. The resulting algorithm is formally similar, except that every dot product is replaced by a nonlinear kernel function. This allows the algorithm to fit the maximum-margin hyperplane in a transformed feature space. The transformation may be nonlinear and the transformed space high dimensional; thus though the classifier is a hyperplane in the high-dimensional feature space, it may be nonlinear in the original input space.

If the kernel used is a Gaussian radial basis function, the corresponding feature space is a Hilbert space of infinite dimensions. Maximum margin classifiers are well regularized, so the infinite dimensions do not spoil the results. Some common kernels include:

  • Polynomial (homogeneous):
  • Polynomial (inhomogeneous):
  • Gaussian radial basis function:, for Sometimes parametrized using
  • Hyperbolic tangent:, for some (not every) and

The kernel is related to the transform by the equation . The value w is also in the transformed space, with Dot products with w for classification can again be computed by the kernel trick, i.e. . However, there does not in general exist a value w' such that

Read more about this topic:  Support Vector Machine