Conditional Entropy - Chain Rule

Chain Rule

Assume that the combined system determined by two random variables X and Y has entropy H(X,Y), that is, we need H(X,Y) bits of information to describe its exact state. Now if we first learn the value of X, we have gained H(X) bits of information. Once X is known, we only need H(X,Y) - H(X) bits to describe the state of the whole system. This quantity is exactly H(Y|X), which gives the chain rule of conditional entropy:

\begin{align}
H(Y|X) = H(X,Y) - H(X).
\end{align}
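
For example, if X and Y are the outcomes of two independent fair coin flips, compared with the degenerate case Y = X:

\begin{align}
\text{independent: } & H(X,Y) = 2, \quad H(X) = 1, \quad\text{so } H(Y|X) = 2 - 1 = 1,\\
\text{identical }(Y = X)\text{: } & H(X,Y) = 1, \quad H(X) = 1, \quad\text{so } H(Y|X) = 1 - 1 = 0.
\end{align}

In the first case, learning X tells us nothing about Y, so a full bit is still needed to describe Y; in the second, knowing X determines Y completely.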

Formally, the chain rule indeed follows from the above definition of conditional entropy:

\begin{align}
H(Y|X) &= \sum_{x\in\mathcal X,\, y\in\mathcal Y} p(x,y)\log \frac{p(x)}{p(x,y)} \\
&= -\sum_{x\in\mathcal X,\, y\in\mathcal Y} p(x,y)\log p(x,y) + \sum_{x\in\mathcal X,\, y\in\mathcal Y} p(x,y)\log p(x) \\
&= H(X,Y) + \sum_{x \in \mathcal X} p(x)\log p(x) \\
&= H(X,Y) - H(X).
\end{align}
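
To see the identity numerically, here is a minimal sketch in Python with NumPy (the joint distribution p_xy is an arbitrary illustrative choice, not taken from the text): it evaluates H(Y|X) directly from the defining sum and compares it with H(X,Y) - H(X).

import numpy as np

# Small made-up joint distribution p(x, y) over X in {0, 1} and Y in {0, 1, 2}.
# The specific numbers are arbitrary; any valid joint distribution works.
p_xy = np.array([
    [0.25, 0.15, 0.10],   # p(X=0, Y=0..2)
    [0.05, 0.20, 0.25],   # p(X=1, Y=0..2)
])

def entropy(p):
    """Shannon entropy in bits of a probability array, skipping zero entries."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

p_x = p_xy.sum(axis=1)        # marginal distribution p(x)

H_XY = entropy(p_xy)          # joint entropy H(X, Y)
H_X = entropy(p_x)            # marginal entropy H(X)

# H(Y|X) computed directly from the definition:
#   H(Y|X) = sum_{x,y} p(x,y) * log2( p(x) / p(x,y) )
px_grid = np.broadcast_to(p_x[:, None], p_xy.shape)
mask = p_xy > 0
H_Y_given_X = np.sum(p_xy[mask] * np.log2(px_grid[mask] / p_xy[mask]))

print(f"H(Y|X) from the definition: {H_Y_given_X:.6f} bits")
print(f"H(X,Y) - H(X)             : {H_XY - H_X:.6f} bits")
# Both lines print the same value, illustrating H(Y|X) = H(X,Y) - H(X).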
