Regularization (mathematics) - Regularization in Statistics and Machine Learning

In statistics and machine learning, regularization is used to prevent overfitting, typically by adding a penalty on model complexity to the fitting objective. Typical examples include ridge regression, the lasso, and the L2-norm penalty in support vector machines.
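As a minimal sketch of how such a penalty acts, consider ridge regression with a single feature and no intercept. The function name ridge_slope and the toy data are illustrative assumptions, not part of any library. Minimizing sum((y - b*x)^2) + lam * b^2 gives the closed form b = sum(x*y) / (sum(x*x) + lam), so a larger lam pulls the fitted slope toward zero.

```python
def ridge_slope(xs, ys, lam):
    """Slope of a one-feature, no-intercept ridge fit.

    Minimizes sum((y - b*x)**2) + lam * b**2, whose closed-form
    solution is b = sum(x*y) / (sum(x*x) + lam).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Toy data (made up for illustration): exactly y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

# lam = 0 is ordinary least squares; larger lam shrinks the slope.
for lam in (0.0, 14.0, 100.0):
    print(lam, ridge_slope(xs, ys, lam))
```

With lam = 0 the fit recovers the unpenalized slope; as lam grows the estimate moves toward zero, trading some bias for lower variance.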

Regularization methods are also used for model selection, where they work by implicitly or explicitly penalizing models based on the number of their parameters. For example, Bayesian learning methods make use of a prior probability that (usually) gives lower probability to more complex models. Well-known model selection techniques include the Akaike information criterion (AIC), minimum description length (MDL), and the Bayesian information criterion (BIC). Alternative methods of controlling overfitting not involving regularization include cross-validation.
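The parameter-count penalty in AIC and BIC can be sketched as follows, using the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of estimated parameters, n the sample size, and L the maximized likelihood. The Gaussian-error likelihood helper and the two sets of residual sums of squares below are made-up illustrative assumptions.

```python
import math


def gaussian_log_likelihood(rss, n):
    """Maximized log-likelihood of a Gaussian-error model.

    Uses the maximum-likelihood variance estimate rss / n.
    """
    return -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


# Hypothetical comparison: the larger model fits slightly better
# (smaller residual sum of squares) but uses more parameters.
n = 50
simple_ll = gaussian_log_likelihood(rss=12.0, n=n)
complex_ll = gaussian_log_likelihood(rss=11.5, n=n)

print("AIC:", aic(simple_ll, k=3), aic(complex_ll, k=8))
print("BIC:", bic(simple_ll, k=3, n=n), bic(complex_ll, k=8, n=n))
```

Lower values are preferred under both criteria. In this made-up comparison the small gain in fit does not offset the extra parameters, and BIC penalizes them more heavily than AIC because ln(50) exceeds 2.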

Regularization can be used to fine-tune model complexity by combining an augmented error function with cross-validation. As model complexity increases, the training error keeps decreasing while the validation error levels off. Regularization introduces a second term into the error function, weighted by a tunable factor, that penalizes more complex models, which tend to show increasing variance in their errors. The penalty therefore grows as model complexity increases.
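One common way to set the penalty weight is to evaluate several candidate values by k-fold cross-validation and keep the one with the lowest validation error. The sketch below assumes a one-feature, no-intercept ridge fit; the helper names, the interleaved fold assignment, the toy data and the candidate grid are all illustrative assumptions.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form slope of a one-feature, no-intercept ridge fit."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def cv_error(xs, ys, lam, k=4):
    """Mean squared validation error over k interleaved folds."""
    n = len(xs)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        valid = [i for i in range(n) if i % k == fold]
        # Fit on the training indices only.
        b = ridge_slope([xs[i] for i in train], [ys[i] for i in train], lam)
        # Accumulate squared error on the held-out indices.
        total += sum((ys[i] - b * xs[i]) ** 2 for i in valid)
    return total / n


# Made-up data, roughly y = 2x plus noise.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [1.2, 1.9, 3.4, 3.8, 5.3, 5.9, 7.2, 7.8]

candidates = [0.0, 0.1, 1.0, 10.0, 100.0]
errors = {lam: cv_error(xs, ys, lam) for lam in candidates}
best_lam = min(errors, key=errors.get)
print(best_lam, errors[best_lam])
```

With a single feature there is little to overfit, so this only illustrates the selection procedure; the same loop applies unchanged to richer models where the penalty matters more.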

Examples of applications of different methods of regularization to the linear model are:

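Model                            Fit measure             Entropy measure
AIC/BIC                          ||Y - Xβ||_2            ||β||_0
Ridge regression                 ||Y - Xβ||_2            ||β||_2
Lasso                            ||Y - Xβ||_2            ||β||_1
Basis pursuit denoising          ||Y - Xβ||_2            λ||β||_1
Rudin-Osher-Fatemi model (TV)    ||Y - Xβ||_2            λ||∇β||_1
Potts model                      ||Y - Xβ||_2            λ||∇β||_0
RLAD                             ||Y - Xβ||_1            ||β||_1
Dantzig Selector                 ||X^T(Y - Xβ)||_∞       ||β||_1

(Here Y is the response vector, X the design matrix, β the coefficient vector, and λ a penalty weight.)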
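The entropy measure in each of these models is a penalty computed from the coefficient vector. As a rough sketch, assuming the usual choices (a count of nonzero coefficients for AIC/BIC, the L1 norm for the lasso, the L2 norm for ridge regression, and a total-variation penalty on neighbouring differences for the Rudin-Osher-Fatemi model), the quantities can be computed as below. The function names are illustrative.

```python
import math


def l0(beta):
    """Number of nonzero coefficients."""
    return sum(1 for b in beta if b != 0)


def l1(beta):
    """Sum of absolute coefficient values."""
    return sum(abs(b) for b in beta)


def l2(beta):
    """Euclidean length of the coefficient vector."""
    return math.sqrt(sum(b * b for b in beta))


def total_variation(beta):
    """Sum of absolute differences between neighbouring coefficients."""
    return sum(abs(b2 - b1) for b1, b2 in zip(beta, beta[1:]))


beta = [0.0, 3.0, -4.0, 0.0]
print(l0(beta), l1(beta), l2(beta), total_variation(beta))
```

The L0 and L1 measures favour sparse coefficient vectors, the L2 measure favours small coefficients overall, and total variation favours piecewise-constant ones.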

Elastic net regularization combines the lasso and ridge regression penalties.
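To illustrate how the two penalties interact, here is a sketch for a single feature with no intercept, assuming the objective 0.5 * sum((y - b*x)^2) + lam1 * |b| + 0.5 * lam2 * b^2. Under that assumed scaling the minimizer is a soft-thresholded numerator (the lasso part) divided by an inflated denominator (the ridge part). The function names and toy data are illustrative.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, returning 0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_slope(xs, ys, lam1, lam2):
    """One-feature, no-intercept elastic net slope.

    Minimizes 0.5 * sum((y - b*x)**2) + lam1 * abs(b) + 0.5 * lam2 * b**2.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return soft_threshold(sxy, lam1) / (sxx + lam2)


# Toy data (made up for illustration): exactly y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print(elastic_net_slope(xs, ys, 0.0, 0.0))    # no penalty
print(elastic_net_slope(xs, ys, 14.0, 14.0))  # both penalties active
print(elastic_net_slope(xs, ys, 28.0, 0.0))   # L1 weight large enough to zero the slope
```

Setting lam2 = 0 reduces this to the lasso solution and setting lam1 = 0 reduces it to the ridge solution, up to the scaling convention assumed above.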
