Dummy Variable (statistics) - Incorporating A Dummy Independent Variable

Dummy variables are incorporated in regression models in the same way as quantitative variables, that is, as explanatory variables. For example, consider a regression model of wage determination in which wages depend on gender (qualitative) and years of education (quantitative):

Wage = α0 + δ0·female + α1·education + U

In the model, female = 1 when the person is female and female = 0 when the person is male. δ0 can be interpreted as the difference in wages between females and males, holding education and the error term U constant. Thus, δ0 helps to determine whether there is discrimination in wages between men and women. If δ0 < 0 (negative coefficient), then for the same level of education (and other factors influencing wages), women earn a lower wage than men. On the other hand, if δ0 > 0 (positive coefficient), then women earn a higher wage than men (keeping other factors constant). The coefficients attached to dummy variables are called differential intercept coefficients. The model can be depicted graphically as an intercept shift between females and males: two parallel wage-education lines with the same slope α1, with intercept α0 for males and α0 + δ0 for females. In the figure, the case δ0 < 0 is shown (wherein men earn a higher wage than women).
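The interpretation of δ0 as an intercept shift can be checked numerically. The following is a minimal sketch, not part of the original discussion: it simulates wage data from the model above and estimates the coefficients by ordinary least squares with NumPy. The sample size, the coefficient values (alpha0, delta0, alpha1) and the distributions of education and U are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical data-generating values, chosen only for illustration.
alpha0, delta0, alpha1 = 5.0, -2.0, 0.8

female = rng.integers(0, 2, size=n)        # dummy: 1 = female, 0 = male
education = rng.uniform(8, 20, size=n)     # years of education (assumed range)
u = rng.normal(0.0, 1.0, size=n)           # error term U (assumed standard normal)
wage = alpha0 + delta0 * female + alpha1 * education + u

# Design matrix: constant, dummy variable, quantitative variable.
X = np.column_stack([np.ones(n), female, education])
coef, *_ = np.linalg.lstsq(X, wage, rcond=None)
a0_hat, d0_hat, a1_hat = coef

# Predicted wages at the same education level for each group.
# Their difference is exactly the estimated differential intercept.
educ = 12.0
wage_male = a0_hat + a1_hat * educ
wage_female = a0_hat + d0_hat + a1_hat * educ

print(f"estimated delta0: {d0_hat:.3f}")
print(f"female minus male wage at {educ} years: {wage_female - wage_male:.3f}")
```

Because the dummy enters additively, the predicted gap between the two groups is the same at every education level, which is the parallel-lines picture described above. An interaction term between female and education would be needed to let the slope differ as well.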

Dummy variables may be extended to more complex cases. For example, seasonal effects may be captured by creating a dummy variable for each season: D1 = 1 if the observation is for summer, and 0 otherwise; D2 = 1 if the observation is for autumn, and 0 otherwise; D3 = 1 if the observation is for winter, and 0 otherwise; and D4 = 1 if the observation is for spring, and 0 otherwise. In the panel data fixed effects estimator, dummies are created for each of the units in the cross-section (e.g. firms or countries) or for each of the periods in a pooled time series. However, in such regressions either the constant term must be removed or one of the dummies must be removed, making it the base category against which the others are assessed, for the following reason:

If dummy variables for all categories were included, their sum would equal 1 for every observation. That sum is identical to, and hence perfectly correlated with, the vector-of-ones variable whose coefficient is the constant term. If the vector-of-ones variable were also present, the result would be perfect multicollinearity, so that the matrix inversion in the estimation algorithm would be impossible. This is referred to as the dummy variable trap.
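The trap and its two remedies can be illustrated with the seasonal dummies described above. The sketch below is illustrative only: the twelve observations and the choice of summer as the base category are assumptions. It builds D1 to D4, confirms that they sum to the vector of ones, and compares the number of columns with the rank of each design matrix.

```python
import numpy as np

labels = ["summer", "autumn", "winter", "spring"]
seasons = np.array(labels * 3)             # 12 hypothetical observations

# One 0/1 column per season, corresponding to D1..D4 in the text.
D = np.column_stack([(seasons == s).astype(float) for s in labels])
ones = np.ones(len(seasons))

# The trap: constant plus all four dummies.
X_trap = np.column_stack([ones, D])
# Remedy 1: keep the constant, drop one dummy (summer is the base category).
X_base = np.column_stack([ones, D[:, 1:]])
# Remedy 2: drop the constant, keep all four dummies.
X_noconst = D

# The dummies sum to the constant column for every observation.
print("dummies sum to ones:", np.allclose(D.sum(axis=1), ones))

# A matrix has full column rank only if rank == number of columns.
for name, X in [("trap", X_trap), ("base", X_base), ("no constant", X_noconst)]:
    print(name, "columns:", X.shape[1], "rank:", np.linalg.matrix_rank(X))
```

In the trap case the design matrix has five columns but rank four, so X'X is singular and cannot be inverted. Either remedy restores full column rank. With the base-category form, each remaining dummy coefficient measures the difference from summer; with the no-constant form, each coefficient is that season's own intercept.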
