Dummy Variable (statistics) - Incorporating A Dummy Independent Variable

Dummy variables enter a regression model in the same way as quantitative explanatory variables. For example, consider a regression model of wage determination in which wages depend on gender (qualitative) and years of education (quantitative):

Wage = α0 + δ0female + α1education + U

In the model, female = 1 when the person is female and female = 0 when the person is male. δ0 can be interpreted as the difference in wages between females and males, holding education and the error term U constant. Equivalently, the intercept is α0 for males and α0 + δ0 for females, while the slope on education, α1, is the same for both groups. Thus, δ0 helps to determine whether there is wage discrimination between men and women. If δ0 < 0 (negative coefficient), then for the same level of education (and other factors influencing wages), women earn a lower wage than men. If δ0 > 0 (positive coefficient), then women earn a higher wage than men, keeping other factors constant. The coefficients attached to dummy variables are called differential intercept coefficients. The model can be depicted graphically as an intercept shift between females and males; the figure shows the case δ0 < 0, in which men earn a higher wage than women.
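As a minimal sketch of how such a model might be estimated, the following Python snippet fits the wage equation by ordinary least squares on simulated data. The sample size, the "true" parameter values, and the variable names are assumptions chosen purely for illustration; they do not come from any real wage data.

```python
import numpy as np

# Simulated data only: all numbers below are illustrative assumptions.
rng = np.random.default_rng(0)
n = 500
female = rng.integers(0, 2, size=n)        # dummy: 1 = female, 0 = male
education = rng.uniform(8, 20, size=n)     # years of education

# Hypothetical "true" parameters used to generate the wages.
alpha0, delta0, alpha1 = 5.0, -2.0, 1.5
u = rng.normal(0.0, 1.0, size=n)           # error term U
wage = alpha0 + delta0 * female + alpha1 * education + u

# Design matrix: constant, dummy, quantitative regressor.
X = np.column_stack([np.ones(n), female, education])
coef, *_ = np.linalg.lstsq(X, wage, rcond=None)
a0_hat, d0_hat, a1_hat = coef

# d0_hat estimates the intercept shift for females relative to males.
print(f"alpha0 = {a0_hat:.2f}, delta0 = {d0_hat:.2f}, alpha1 = {a1_hat:.2f}")
```

The dummy is treated exactly like any other column of the design matrix; only its interpretation differs, since its coefficient shifts the intercept rather than the slope.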

Dummy variables may be extended to more complex cases. For example, seasonal effects may be captured by creating a dummy variable for each season: D1 = 1 if the observation is for summer and zero otherwise; D2 = 1 if and only if it is for autumn; D3 = 1 if and only if it is for winter; and D4 = 1 if and only if it is for spring. In the panel data fixed effects estimator, dummies are created for each of the units in cross-sectional data (e.g. firms or countries) or for each of the periods in a pooled time series. However, in such regressions either the constant term must be removed or one of the dummies must be removed, making it the base category against which the others are assessed, for the following reason:

If dummy variables for all categories were included, their sum would equal 1 for every observation, which is identical to, and hence perfectly correlated with, the vector-of-ones variable whose coefficient is the constant term. If the vector-of-ones variable were also present, the result would be perfect multicollinearity, so that the matrix inversion in the estimation algorithm would be impossible. This is referred to as the dummy variable trap.
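The trap can be illustrated numerically with the seasonal dummies described above. The sketch below builds the dummies for a made-up sequence of twelve observations (an assumption for illustration) and compares the rank of the design matrix with and without a base category. A rank below the number of columns means the columns are linearly dependent, so the matrix X'X used in least squares cannot be inverted.

```python
import numpy as np

# Hypothetical sample: three observations per season.
labels = ["summer", "autumn", "winter", "spring"]
seasons = np.array(labels * 3)

# One dummy column per season (D1..D4 in the text).
D = np.column_stack([(seasons == s).astype(float) for s in labels])
ones = np.ones((len(seasons), 1))

# Every row of D contains exactly one 1, so the dummies sum to the constant.
rows_sum_to_one = bool(np.all(D.sum(axis=1) == 1.0))

X_trap = np.hstack([ones, D])          # constant + all four dummies
X_base = np.hstack([ones, D[:, 1:]])   # constant + three dummies, summer as base
X_nocons = D                           # all four dummies, no constant

rank_trap = np.linalg.matrix_rank(X_trap)
rank_base = np.linalg.matrix_rank(X_base)
rank_nocons = np.linalg.matrix_rank(X_nocons)

print("dummies sum to one:", rows_sum_to_one)
print("trap:       ", X_trap.shape[1], "columns, rank", rank_trap)
print("base dropped:", X_base.shape[1], "columns, rank", rank_base)
print("no constant: ", X_nocons.shape[1], "columns, rank", rank_nocons)
```

Only the first design matrix is rank deficient; dropping either one dummy or the constant restores full column rank, which matches the two remedies given in the text.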
