Information Theory and Measure Theory

Many of the formulas in information theory have separate versions for continuous and discrete cases, i.e. integrals for the continuous case and sums for the discrete case. These versions can often be unified using measure theory. For discrete random variables, probability mass functions can be considered density functions with respect to the counting measure, so that what is, in measure-theoretic terms, an integral reduces to an ordinary sum requiring only basic discrete mathematics. Because the same integration expression also covers the continuous case, which uses basic calculus, the same concepts and expressions apply to both discrete and continuous cases. Consider the formula for the differential entropy of a continuous random variable X with probability density function f(x):

h(X) = -\int f(x) \log f(x) \,dx
This can usually be taken to be

h(X) = -\int f(x) \log f(x) \,d\mu(x),
where μ is the Lebesgue measure. But if instead X is discrete, f is a probability mass function, and ν is the counting measure, we can write:

H(X) = -\int f(x) \log f(x) \,d\nu(x) = -\sum_x f(x) \log f(x).
The integral expression and the general concept are identical to the continuous case; the only difference is the measure used. In both cases the probability density function f is the Radon–Nikodym derivative of the probability measure with respect to the measure against which the integral is taken.
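As a concrete illustration, the following sketch (Python with NumPy; the standard normal density and the fair six-sided die are illustrative choices, not taken from the text above) evaluates the single expression -∫ f log f dμ twice: once with the Lebesgue measure approximated by a fine grid carrying weight dx per point, and once with the counting measure, where each point of the support gets weight 1 and the integral collapses to a sum.

import numpy as np

def entropy_against_measure(f_values, measure_weights):
    # Evaluate -∫ f log f dμ as a weighted sum: the measure enters only
    # through the weight attached to each point of the grid or support.
    f = np.asarray(f_values, dtype=float)
    w = np.asarray(measure_weights, dtype=float)
    mask = f > 0  # convention: 0 log 0 = 0
    return -np.sum(w[mask] * f[mask] * np.log(f[mask]))

# Continuous case: standard normal density, Lebesgue measure approximated
# by a Riemann sum with weight dx at every grid point.
x = np.linspace(-10.0, 10.0, 200_001)
dx = x[1] - x[0]
f_cont = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
print(entropy_against_measure(f_cont, np.full_like(x, dx)))  # ≈ 0.5 log(2πe) ≈ 1.4189

# Discrete case: fair six-sided die, counting measure (weight 1 per outcome),
# so the same expression is just -Σ f(x) log f(x).
f_disc = np.full(6, 1 / 6)
print(entropy_against_measure(f_disc, np.ones(6)))  # = log 6 ≈ 1.7918

The only thing that changes between the two calls is the measure passed in; the integrand f log f and the surrounding code are identical, which is exactly the unification described above.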

If ℙ is the probability measure on X, then the integral can also be taken directly with respect to ℙ:

h(X) = -\int_X \log \frac{\mathrm d\mathbb P}{\mathrm d\mu} \,d\mathbb P.
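For completeness, a one-line change of measure (not spelled out in the original, but standard) shows that this agrees with the earlier expression: substituting f = dℙ/dμ gives

-\int_X \log \frac{\mathrm d\mathbb P}{\mathrm d\mu} \,d\mathbb P = -\int_X \frac{\mathrm d\mathbb P}{\mathrm d\mu} \log \frac{\mathrm d\mathbb P}{\mathrm d\mu} \,d\mu = -\int_X f \log f \,d\mu = h(X),

which is the same integral against μ as before.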
If instead of the underlying measure μ we take another probability measure, we are led to the Kullback–Leibler divergence: let ℙ and ℚ be probability measures over the same space. Then if ℙ is absolutely continuous with respect to ℚ, written ℙ ≪ ℚ, the Radon–Nikodym derivative dℙ/dℚ exists and the Kullback–Leibler divergence can be expressed in its full generality:

D_\mathrm{KL}(\mathbb P \| \mathbb Q) = \int_{\mathrm{supp}\mathbb P} \frac{\mathrm d\mathbb P}{\mathrm d\mathbb Q} \log \frac{\mathrm d\mathbb P}{\mathrm d\mathbb Q} \,d \mathbb Q = \int_{\mathrm{supp}\mathbb P} \log \frac{\mathrm d\mathbb P}{\mathrm d\mathbb Q} \,d \mathbb P,

where the integral runs over the support of ℙ. Note that we have dropped the negative sign: the Kullback–Leibler divergence is always non-negative due to Gibbs' inequality.
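As a sketch of this general formula (Python with NumPy; the two Gaussian distributions and the integration grid are illustrative assumptions, not part of the text above), the code below evaluates both forms of the integral, using the ratio of densities as the Radon–Nikodym derivative dℙ/dℚ, and compares them with the closed-form Gaussian KL divergence; both are non-negative, as Gibbs' inequality requires.

import numpy as np

# Illustrative example: P = N(0, 1), Q = N(1, 2²). Both have densities with
# respect to the Lebesgue measure, so P ≪ Q and dP/dQ is the density ratio.
def normal_pdf(x, mean, std):
    return np.exp(-((x - mean) ** 2) / (2 * std**2)) / (std * np.sqrt(2 * np.pi))

x = np.linspace(-20.0, 20.0, 400_001)   # grid covering the bulk of supp P
dx = x[1] - x[0]
p = normal_pdf(x, 0.0, 1.0)
q = normal_pdf(x, 1.0, 2.0)
ratio = p / q                            # Radon–Nikodym derivative dP/dQ on the grid

# First form: integrate (dP/dQ) log(dP/dQ) against Q, i.e. weight by q dx.
kl_vs_Q = np.sum(ratio * np.log(ratio) * q * dx)
# Second form: integrate log(dP/dQ) against P itself, i.e. weight by p dx.
kl_vs_P = np.sum(np.log(ratio) * p * dx)

# Closed form for KL divergence between two Gaussians, for comparison.
closed = np.log(2.0 / 1.0) + (1.0**2 + (0.0 - 1.0) ** 2) / (2 * 2.0**2) - 0.5
print(kl_vs_Q, kl_vs_P, closed)          # all ≈ 0.4431, and ≥ 0 (Gibbs' inequality)

Replacing the grid and weights with a finite support and the counting measure would give the familiar discrete sum Σ p(x) log(p(x)/q(x)); once again only the measure changes.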
