In statistics, a **multinomial logistic regression** model, also known as **softmax regression** or **multinomial logit**, is a regression model that generalizes logistic regression by allowing more than two discrete outcomes. That is, it is a model used to predict the probabilities of the different possible outcomes of a categorically distributed dependent variable, given a set of independent variables (which may be real-valued, binary-valued, categorical-valued, etc.). The use of the term "multinomial" in the name arises from the common conflation between the categorical and multinomial distributions, as explained in the relevant articles. Note, however, that the actual goal of the multinomial logit model is to predict categorical data.
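Concretely, the model assigns each class its own linear score and passes the scores through the softmax function to obtain a probability distribution over classes. The following sketch illustrates this; the weight matrix, bias vector, and feature values are made-up for illustration, not taken from any real fitted model.

```python
import numpy as np

def softmax(z):
    # Subtract the max score for numerical stability; this leaves the
    # result unchanged because softmax is invariant to adding a
    # constant to every score.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def predict_proba(W, b, x):
    # One weight vector and bias per class: the scores W @ x + b are
    # linear in x, and softmax normalizes them into probabilities.
    return softmax(W @ x + b)

# Illustrative example: 3 classes, 2 features (values are arbitrary).
W = np.array([[ 1.0, -0.5],
              [ 0.2,  0.8],
              [-1.0,  0.3]])
b = np.array([0.1, 0.0, -0.1])
x = np.array([0.5, 1.5])

p = predict_proba(W, b, x)  # a length-3 probability vector summing to 1
```

With two classes and the weights of one class fixed at zero, this reduces exactly to ordinary (binary) logistic regression.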

In some fields of machine learning (e.g. natural language processing), when a classifier is implemented using a multinomial logit model, it is commonly known as a **maximum entropy classifier**, **conditional maximum entropy model** or **MaxEnt model** for short. Maximum entropy classifiers are commonly used as alternatives to Naive Bayes classifiers because they do not assume statistical independence of the independent variables (commonly known as *features*) that serve as predictors. However, learning in such a model is slower than for a Naive Bayes classifier, and thus may not be appropriate given a very large number of classes to learn. In particular, learning in a Naive Bayes classifier is a simple matter of counting up the number of co-occurrences of features and classes, while in a maximum entropy classifier the weights, which are typically estimated using maximum a posteriori (MAP) estimation, must be learned using an iterative procedure; see below.
