Ensemble Learning - Ensemble Theory

Ensemble Theory

An ensemble is itself a supervised learning algorithm, because it can be trained and then used to make predictions. The trained ensemble, therefore, represents a single hypothesis. This hypothesis, however, is not necessarily contained within the hypothesis space of the models from which it is built. Thus, ensembles can be shown to have more flexibility in the functions they can represent. This flexibility can, in theory, enable them to over-fit the training data more than a single model would, but in practice, some ensemble techniques (especially bagging) tend to reduce problems related to over-fitting of the training data.

Empirically, ensembles tend to yield better results when there is a significant diversity among the models . Many ensemble methods, therefore, seek to promote diversity among the models they combine. Although perhaps non-intuitive, more random algorithms (like random decision trees) can be used to produce a stronger ensemble than very deliberate algorithms (like entropy-reducing decision trees). Using a variety of strong learning algorithms, however, has been shown to be more effective than using techniques that attempt to dumb-down the models in order to promote diversity.

Read more about this topic: Ensemble Learning

Famous quotes containing the word theory:

“We commonly say that the rich man can speak the truth, can afford honesty, can afford independence of opinion and action;—and that is the theory of nobility. But it is the rich man in a true sense, that is to say, not the man of large income and large expenditure, but solely the man whose outlay is less than his income and is steadily kept so.”
—Ralph Waldo Emerson (1803–1882)