Information Gain in Decision Trees

In information theory and machine learning, information gain is a synonym for Kullback–Leibler divergence.

In particular, the information gain about a random variable X obtained from an observation that a random variable A takes the value A = a is the Kullback–Leibler divergence DKL(p(x | a) || p(x | I)) of the posterior distribution p(x | a) for x given a from the prior distribution p(x | I) for x.
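This single-observation information gain can be computed directly from the two distributions. The sketch below uses a hypothetical three-valued X with an assumed uniform prior and an assumed posterior; both distributions are illustrative, not from the source.

```python
import math

def kl_divergence(posterior, prior):
    """D_KL(posterior || prior) in bits: the information gained
    when the prior belief about X is updated to the posterior."""
    return sum(p * math.log2(p / q)
               for p, q in zip(posterior, prior) if p > 0)

# Hypothetical example: a uniform three-way prior p(x | I),
# sharpened by observing A = a into the posterior p(x | a).
prior = [1/3, 1/3, 1/3]
posterior = [0.7, 0.2, 0.1]
gain = kl_divergence(posterior, prior)  # information gained, in bits
```

If the observation leaves the distribution unchanged, the gain is zero; the more the posterior concentrates relative to the prior, the larger the gain.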

The expected value of the information gain is the mutual information I(X; A) of X and A — i.e. the reduction in the entropy of X achieved by learning the state of the random variable A.
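The identity I(X; A) = H(X) − H(X | A) can be checked numerically. This is a minimal sketch over a hypothetical joint distribution (the dict layout and toy probabilities are assumptions for illustration): here A determines X exactly, so learning A removes all of X's entropy.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def mutual_information(joint):
    """I(X; A) from a joint distribution p(x, a) given as a dict
    {(x, a): prob}. Equals the expected KL divergence of the
    posterior p(x | a) from the prior p(x), i.e. H(X) - H(X | A)."""
    px, pa = {}, {}
    for (x, a), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        pa[a] = pa.get(a, 0.0) + p
    return sum(p * math.log2(p / (px[x] * pa[a]))
               for (x, a), p in joint.items() if p > 0)

# Hypothetical joint: A determines X, so H(X | A) = 0 and
# I(X; A) = H(X) = 1 bit.
joint = {("x0", "a0"): 0.5, ("x1", "a1"): 0.5}
hx = entropy([0.5, 0.5])          # H(X) = 1 bit
mi = mutual_information(joint)    # equals hx in this case
```

Conversely, if X and A are independent, the mutual information (and hence the expected information gain) is zero.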

In machine learning, this concept can be used to define a preferred sequence of attributes to investigate in order to most rapidly narrow down the state of X. Such a sequence (which at each stage depends on the outcomes of investigating the previous attributes) is called a decision tree. Usually an attribute with high information gain should be preferred to other attributes.
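The attribute-selection step can be sketched in the style of ID3: at each node, compute each attribute's information gain on the remaining examples and split on the highest-scoring one. The toy dataset and attribute names below are hypothetical, chosen so that one attribute perfectly predicts the label.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n)
                for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """IG = H(labels) - sum_v p(attr=v) * H(labels | attr=v)."""
    n = len(labels)
    splits = {}
    for row, label in zip(rows, labels):
        splits.setdefault(row[attr], []).append(label)
    remainder = sum(len(s) / n * entropy(s) for s in splits.values())
    return entropy(labels) - remainder

# Hypothetical toy data: "outlook" separates the labels perfectly
# (gain = 1 bit), while "windy" tells us nothing (gain = 0).
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain",  "windy": "no"},
    {"outlook": "rain",  "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]
best = max(rows[0], key=lambda a: information_gain(rows, labels, a))
```

A full tree builder would recurse on each split with the attribute removed, stopping when a node's labels are pure or no attributes remain.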
