Categorical Distribution

A categorical distribution is a discrete probability distribution whose sample space is the set of k individually identified items. It is the generalization of the Bernoulli distribution for a categorical random variable.

In one formulation of the distribution, the sample space is taken to be a finite sequence of integers. The exact integers used as labels are unimportant; they might be {0, 1, ..., k-1} or {1, 2, ..., k} or any other arbitrary set of values. In the following descriptions, we use {1, 2, ..., k} for convenience, although this disagrees with the convention for the Bernoulli distribution, which uses {0, 1}. In this case, the probability mass function f is:


f(x=i| \boldsymbol{p} ) = p_i ,

where represents the probability of seeing element i and .

Another formulation that appears more complex but facilitates mathematical manipulations is as follows, using the Iverson bracket:


f(x| \boldsymbol{p} ) = \prod_{i=1}^k p_i^{} ,

where evaluates to 1 if, 0 otherwise. There are various advantages of this formulation, e.g.:

  • It is easier to write out the likelihood function of a set of independent identically distributed categorical variables.
  • It connects the categorical distribution with the related multinomial distribution.
  • It shows why the Dirichlet distribution is the conjugate prior of the categorical distribution, and allows the posterior distribution of the parameters to be calculated.

Yet another formulation makes explicit the connection between the categorical and multinomial distributions by treating the categorical distribution as a special case of the multinomial distribution in which the parameter n of the multinomial distribution (the number of sampled items) is fixed at 1. In this formulation, the sample space can be considered to be the set of 1-of-K encoded random vectors x of dimension k having the property that exactly one element has the value 1 and the others have the value 0. The particular element having the value 1 indicates which category has been chosen. The probability mass function f in this formulation is:


f( \mathbf{x}| \boldsymbol{p} ) = \prod_{i=1}^k p_i^{x_i} ,

where represents the probability of seeing element i and . This is the formulation adopted by Bishop.

Read more about Categorical Distribution:  Properties, With A Conjugate Prior, Sampling

Famous quotes containing the words categorical and/or distribution:

    We do the same thing to parents that we do to children. We insist that they are some kind of categorical abstraction because they produced a child. They were people before that, and they’re still people in all other areas of their lives. But when it comes to the state of parenthood they are abruptly heir to a whole collection of virtues and feelings that are assigned to them with a fine arbitrary disregard for individuality.
    Leontine Young (20th century)

    Classical and romantic: private language of a family quarrel, a dead dispute over the distribution of emphasis between man and nature.
    Cyril Connolly (1903–1974)