Machine Learning
The concept of overfitting is important in machine learning. Usually, a learning algorithm is trained on a set of training examples, i.e. exemplary situations for which the desired output is known. The learner is assumed to reach a state where it can also predict the correct output for other examples, thus generalizing to situations not presented during training (based on its inductive bias). However, especially when training runs for too long or when training examples are scarce, the learner may adjust to very specific random features of the training data that have no causal relation to the target function. In this process of overfitting, performance on the training examples continues to improve while performance on unseen data becomes worse.
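One common way to notice this divergence is to track the error on the training examples alongside the error on a held-out set that is not used for training. The sketch below assumes such per-epoch error values are already available. The function name `overfitting_onset` and the loss values are hypothetical, made up purely for illustration.

```python
def overfitting_onset(train_losses, heldout_losses):
    """Return the first epoch at which the training loss still falls
    while the held-out loss rises, or None if that never happens.

    Both lists are assumed to have one entry per epoch and equal length.
    """
    for epoch in range(1, len(train_losses)):
        train_improved = train_losses[epoch] < train_losses[epoch - 1]
        heldout_worsened = heldout_losses[epoch] > heldout_losses[epoch - 1]
        if train_improved and heldout_worsened:
            return epoch
    return None


# Made-up loss curves: training loss keeps shrinking, while the
# held-out loss bottoms out at epoch 3 and then climbs again.
train_losses = [0.90, 0.60, 0.40, 0.28, 0.20, 0.14, 0.10]
heldout_losses = [0.95, 0.70, 0.55, 0.50, 0.52, 0.58, 0.66]

onset = overfitting_onset(train_losses, heldout_losses)
print("first epoch with diverging losses:", onset)
```

With real, noisy loss curves a single uptick is weak evidence, so a practical check would usually require the held-out loss to keep rising over several epochs before concluding that the learner is overfitting.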
As a simple example, consider a database of retail purchases that includes the item bought, the purchaser, and the date and time of purchase. It's easy to construct a model that will fit the training set perfectly by using the date and time of purchase to predict the other attributes; but this model will not generalize at all to new data, because those past times will never occur again.
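The retail example can be sketched in a few lines. The purchase records, names, and the helper functions `fit_timestamp_lookup` and `accuracy` are hypothetical, chosen only to illustrate the point: a lookup keyed on the exact purchase time reproduces every training record but has nothing to say about later purchases.

```python
# Hypothetical purchase records: (timestamp, purchaser, item).
train = [
    ("2024-01-05 09:14", "alice", "milk"),
    ("2024-01-05 17:42", "bob", "bread"),
    ("2024-01-06 08:03", "alice", "milk"),
    ("2024-01-07 12:30", "carol", "eggs"),
]
# Later purchases: none of these timestamps appear in the training set.
new = [
    ("2024-02-01 10:00", "alice", "milk"),
    ("2024-02-02 18:20", "bob", "bread"),
]


def fit_timestamp_lookup(records):
    """'Train' by memorizing which item was bought at each exact timestamp."""
    return {timestamp: item for timestamp, _, item in records}


def accuracy(model, records):
    """Fraction of records whose item the lookup predicts correctly."""
    hits = sum(1 for timestamp, _, item in records if model.get(timestamp) == item)
    return hits / len(records)


model = fit_timestamp_lookup(train)
train_accuracy = accuracy(model, train)  # every timestamp was memorized
new_accuracy = accuracy(model, new)      # unseen timestamps give no prediction

print("accuracy on training data:", train_accuracy)
print("accuracy on new data:", new_accuracy)
```

The lookup scores perfectly on the records it memorized and fails on every later purchase, because the feature it relies on never repeats.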
Generally, a learning algorithm is said to overfit relative to a simpler one if it is more accurate in fitting known data (hindsight) but less accurate in predicting new data (foresight). One can intuitively understand overfitting from the fact that information from all past experience can be divided into two groups: information that is relevant for the future and irrelevant information (“noise”). Everything else being equal, the more difficult a criterion is to predict (i.e., the higher its uncertainty), the more noise exists in past information that needs to be ignored. The problem is determining which part to ignore. A learning algorithm that can reduce the chance of fitting noise is called robust.
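The hindsight versus foresight comparison can be illustrated with a small sketch. It assumes a made-up data set in which the relevant information is the trend y = 2x and the noise is an alternating offset of +1 or -1. The two models (a nearest-neighbor memorizer and a least-squares line) and all function names are illustrative choices, not something prescribed by the text.

```python
def mse(predict, xs, ys):
    """Mean squared error of a prediction function on paired data."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys):
    """Simple model: ordinary least-squares straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    """Flexible model: return the stored output of the nearest known input."""
    table = dict(zip(xs, ys))
    return lambda x: table[min(table, key=lambda known: abs(known - x))]


# Made-up data: y = 2x plus alternating noise of +1 / -1.
x_known = [0, 1, 2, 3, 4, 5]
y_known = [1, 1, 5, 5, 9, 9]
# New observations at the same inputs, with the noise signs flipped.
x_new = [0, 1, 2, 3, 4, 5]
y_new = [-1, 3, 3, 7, 7, 11]

line = fit_line(x_known, y_known)
memorizer = fit_memorizer(x_known, y_known)

for name, model in [("line", line), ("memorizer", memorizer)]:
    print(name,
          "hindsight MSE:", round(mse(model, x_known, y_known), 3),
          "foresight MSE:", round(mse(model, x_new, y_new), 3))
```

On this data the memorizer fits the known points with zero error but does worse than the line on the new observations, because it has stored the noise together with the trend. In the sense used above, it overfits relative to the simpler linear model.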