Akaike Information Criterion - Definition

Definition

In the general case, the AIC is

    AIC = 2k - 2 ln(L)

where k is the number of estimated parameters in the statistical model, and L is the maximized value of the likelihood function for that model.
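As a minimal sketch of the formula AIC = 2k - 2 ln(L), the snippet below evaluates it for a one-parameter Bernoulli model. The coin-flip counts and the names `aic`, `p_hat`, and `log_l` are illustrative assumptions, not part of the original text. The function takes ln(L) directly, because the log-likelihood is usually what is computed in practice.

```python
import math

def aic(k: int, log_likelihood: float) -> float:
    """Return AIC = 2k - 2 ln(L), given ln(L), the maximized log-likelihood."""
    return 2 * k - 2 * log_likelihood

# Hypothetical data: 7 heads observed in 10 flips.
heads, flips = 7, 10

# Maximum-likelihood estimate of the single parameter p (so k = 1).
p_hat = heads / flips

# Log-likelihood of the observed sequence, evaluated at p_hat.
log_l = heads * math.log(p_hat) + (flips - heads) * math.log(1 - p_hat)

aic_bernoulli = aic(k=1, log_likelihood=log_l)
print(f"ln(L) = {log_l:.3f}, AIC = {aic_bernoulli:.3f}")
```

A single AIC value carries little meaning in isolation; it becomes useful when compared with the AIC of other candidate models fitted to the same data.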

Given a set of candidate models for the data, the preferred model is the one with the minimum AIC value. AIC thus not only rewards goodness of fit, but also includes a penalty that is an increasing function of the number of estimated parameters. This penalty discourages overfitting: increasing the number of free parameters in the model almost always improves the goodness of fit, regardless of the number of free parameters in the data-generating process.
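The trade-off between fit and penalty can be sketched with two Gaussian candidates for the same sample: one with the mean fixed at zero (only the variance is estimated, k = 1) and one with a free mean (mean and variance are estimated, k = 2). The sample values, the model names, and the helper `gaussian_max_log_likelihood` are assumptions made for illustration only.

```python
import math

def gaussian_max_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood with the variance set to its MLE."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n  # MLE of the variance
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

def aic(k, log_likelihood):
    """AIC = 2k - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood

# Made-up sample, for illustration only.
data = [0.3, -0.1, 0.4, 0.2, -0.3, 0.1, 0.0, 0.5, -0.2, 0.1]
mean = sum(data) / len(data)

# name -> (number of estimated parameters, residuals under the fitted model)
candidates = {
    "zero_mean": (1, data),                      # variance only
    "free_mean": (2, [x - mean for x in data]),  # mean and variance
}

log_likelihoods = {
    name: gaussian_max_log_likelihood(res) for name, (k, res) in candidates.items()
}
scores = {
    name: aic(k, log_likelihoods[name]) for name, (k, res) in candidates.items()
}

# The preferred model is the one with the minimum AIC.
best = min(scores, key=scores.get)
for name in candidates:
    print(f"{name}: ln(L) = {log_likelihoods[name]:.3f}, AIC = {scores[name]:.3f}")
print("preferred:", best)
```

With this particular sample, the free-mean model attains the higher likelihood, yet the zero-mean model has the lower AIC: the gain in fit from the extra parameter is too small to offset the penalty.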

AIC is founded in information theory. Suppose that the data is generated by some unknown process f. We consider two candidate models to represent f: g1 and g2. If we knew f, then we could find the information lost from using g1 to represent f by calculating the Kullback–Leibler divergence, D_KL(f ‖ g1); similarly, the information lost from using g2 to represent f would be found by calculating D_KL(f ‖ g2). We would then choose the candidate model that minimized the information loss.
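To make the comparison concrete, here is a small sketch for discrete distributions, where the Kullback–Leibler divergence is the sum over x of f(x) ln(f(x) / g(x)). The three probability vectors are invented for illustration, and f is treated as known only so that the calculation can be carried out. The sketch also assumes that g(x) > 0 wherever f(x) > 0.

```python
import math

def kl_divergence(f, g):
    """D_KL(f || g) for discrete distributions given as aligned probability lists.

    Terms with f(x) == 0 contribute nothing. Assumes g(x) > 0 wherever f(x) > 0.
    """
    return sum(fx * math.log(fx / gx) for fx, gx in zip(f, g) if fx > 0)

# Illustrative distributions over three outcomes (all values are assumptions).
f = [0.5, 0.3, 0.2]   # the "true" process, known here only for the example
g1 = [0.4, 0.4, 0.2]  # first candidate model
g2 = [0.8, 0.1, 0.1]  # second candidate model

loss_g1 = kl_divergence(f, g1)  # information lost by using g1 for f
loss_g2 = kl_divergence(f, g2)  # information lost by using g2 for f

# Choose the candidate that loses less information.
preferred = "g1" if loss_g1 < loss_g2 else "g2"
print(f"D_KL(f || g1) = {loss_g1:.4f}")
print(f"D_KL(f || g2) = {loss_g2:.4f}")
print("preferred:", preferred)
```

Here g1 is closer to f than g2 is, so it loses less information and would be chosen.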

We cannot choose with certainty, because we do not know f. Akaike (1974) showed, however, that we can estimate, via AIC, how much more (or less) information is lost by g1 than by g2. It is remarkable that so simple a formula results. The estimate, though, is only valid asymptotically; if the number of data points is small, then some correction is often necessary (see AICc, below).
