Naive Bayes Classifier - The Naive Bayes Probabilistic Model

Abstractly, the probability model for a classifier is a conditional model

p(C \vert F_1, \dots, F_n)

over a dependent class variable C with a small number of outcomes or classes, conditional on several feature variables F_1 through F_n. The problem is that if the number of features n is large or when a feature can take on a large number of values, then basing such a model on probability tables is infeasible. We therefore reformulate the model to make it more tractable.

Using Bayes' theorem, we write

p(C \vert F_1, \dots, F_n) = \frac{p(C)\ p(F_1, \dots, F_n \vert C)}{p(F_1, \dots, F_n)}.
In plain English the above equation can be written as

\mathrm{posterior} = \frac{\mathrm{prior} \times \mathrm{likelihood}}{\mathrm{evidence}}.
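To make the prior/likelihood/evidence reading concrete, here is a small numeric sketch in Python; the spam-filter probabilities are invented purely for illustration:

```python
# Hypothetical spam-filter numbers (illustrative, not from any dataset):
prior_spam = 0.3             # p(C = spam), the prior
likelihood_word_spam = 0.6   # p(word | spam), the likelihood under "spam"
likelihood_word_ham = 0.1    # p(word | ham), the likelihood under "ham"

# The evidence p(word) is obtained by summing over both classes.
evidence = (prior_spam * likelihood_word_spam
            + (1 - prior_spam) * likelihood_word_ham)

# Bayes' theorem: posterior = prior * likelihood / evidence
posterior_spam = prior_spam * likelihood_word_spam / evidence  # ≈ 0.72
```

Seeing the word raises the probability of "spam" from the prior 0.3 to roughly 0.72, because the word is six times more likely under "spam" than under "ham".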
In practice we are only interested in the numerator of that fraction, since the denominator does not depend on C and the values of the features F_i are given, so that the denominator is effectively constant. The numerator is equivalent to the joint probability model

p(C, F_1, \dots, F_n),
which can be rewritten as follows, using the chain rule for repeated applications of the definition of conditional probability:

\begin{align}
p(C, F_1, \dots, F_n) & = p(C)\ p(F_1, \dots, F_n \vert C) \\ & = p(C)\ p(F_1 \vert C)\ p(F_2, \dots, F_n \vert C, F_1) \\ & = p(C)\ p(F_1 \vert C)\ p(F_2 \vert C, F_1)\ p(F_3, \dots, F_n \vert C, F_1, F_2) \\ & = p(C)\ p(F_1 \vert C)\ p(F_2 \vert C, F_1)\ \cdots\ p(F_n \vert C, F_1, \dots, F_{n-1}).
\end{align}
Now the "naive" conditional independence assumptions come into play: assume that each feature F_i is conditionally independent of every other feature F_j for j \neq i given the class C. This means that

p(F_i \vert C, F_j) = p(F_i \vert C)

for i \neq j, and so the joint model can be expressed as

 \begin{align}
p(C \vert F_1, \dots, F_n) & \varpropto p(C, F_1, \dots, F_n)\, \\ & \varpropto p(C) \ p(F_1\vert C) \ p(F_2\vert C) \ p(F_3\vert C) \ \cdots\, \\ & \varpropto p(C) \prod_{i=1}^n p(F_i \vert C).\,
\end{align}

This means that under the above independence assumptions, the conditional distribution over the class variable C can be expressed like this:

p(C \vert F_1, \dots, F_n) = \frac{1}{Z}\, p(C) \prod_{i=1}^n p(F_i \vert C),

where Z = p(F_1, \dots, F_n) (the evidence) is a scaling factor dependent only on F_1, \dots, F_n, i.e., a constant if the values of the feature variables are known.
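The normalized posterior can be computed directly from this factorization. A minimal sketch, assuming binary features and invented per-class Bernoulli parameters (the numbers are illustrative only):

```python
from math import prod

def naive_bayes_posterior(features, priors, cond_probs):
    """Posterior p(C | F_1..F_n) = p(C) * prod_i p(F_i | C) / Z.

    features   : list of 0/1 feature values
    priors     : dict mapping class -> p(C)
    cond_probs : dict mapping class -> list of p(F_i = 1 | C)
    """
    unnormalized = {}
    for c, prior in priors.items():
        # The "naive" factorization: multiply independent per-feature terms.
        likelihood = prod(p if f else 1 - p
                          for f, p in zip(features, cond_probs[c]))
        unnormalized[c] = prior * likelihood
    z = sum(unnormalized.values())  # the evidence Z, constant given the features
    return {c: v / z for c, v in unnormalized.items()}

# Illustrative parameters: two classes, three binary features.
priors = {"spam": 0.3, "ham": 0.7}
cond_probs = {"spam": [0.8, 0.6, 0.2], "ham": [0.1, 0.3, 0.5]}
posterior = naive_bayes_posterior([1, 0, 1], priors, cond_probs)
```

Because Z normalizes over all classes, the returned posteriors sum to 1, and only the relative sizes of the unnormalized products matter for classification.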

Models of this form are much more manageable, since they factor into a so-called class prior p(C) and independent probability distributions p(F_i \vert C). If there are k classes and if a model for each p(F_i \vert C = c) can be expressed in terms of r parameters, then the corresponding naive Bayes model has (k − 1) + n r k parameters. In practice, k = 2 (binary classification) and r = 1 (Bernoulli variables as features) are common, and so the total number of parameters of the naive Bayes model is 2n + 1, where n is the number of binary features used for classification and prediction.
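The count follows directly from the factorization: the class prior contributes k − 1 free parameters, and each of the n class-conditional feature models contributes r parameters for each of the k classes. A quick sketch of the arithmetic:

```python
def naive_bayes_param_count(k, n, r):
    """Free parameters of a naive Bayes model with k classes,
    n features, and r parameters per class-conditional feature model."""
    return (k - 1) + n * r * k

# Binary classification (k = 2) with Bernoulli features (r = 1):
# (2 - 1) + n * 1 * 2 reduces to 2n + 1.
n = 10
assert naive_bayes_param_count(2, n, 1) == 2 * n + 1
```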
