Mutually Exclusive Events - Statistics

Statistics

In statistics and regression analysis, an independent variable that can take on only two possible values is called a dummy variable. For example, it may take the value 0 if an observation is of a male subject and 1 if it is of a female subject. The two categories associated with the two values are mutually exclusive, so that no observation falls into more than one category, and exhaustive, so that every observation falls into some category.

Sometimes there are three or more possible categories, which are pairwise mutually exclusive and collectively exhaustive: for example, under 18 years of age, 18 to 64 years of age, and age 65 or above. In this case a set of dummy variables is constructed, each having two mutually exclusive and jointly exhaustive categories. In this example, one dummy variable (called D1) equals 1 if age is less than 18 and 0 otherwise; a second dummy variable (called D2) equals 1 if age is in the range 18 to 64 and 0 otherwise. In this set-up, the pair (D1, D2) can take the values (1,0) (under 18), (0,1) (between 18 and 64), or (0,0) (65 or older), but not (1,1), which would nonsensically imply that an observed subject is both under 18 and between 18 and 64. The dummy variables can then be included as independent (explanatory) variables in a regression.

Note that the number of dummy variables is one less than the number of categories: with the two categories male and female a single dummy variable distinguishes them, while with the three age categories two dummy variables are needed. The remaining category (here male, or age 65 or above) is the one for which every dummy variable equals 0.
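The age coding described above can be sketched in a few lines of Python. This is only an illustration: the function name `age_dummies` is hypothetical, and it assumes age is recorded in whole years so that "18 to 64" means 18 <= age <= 64.

```python
def age_dummies(age):
    """Return the pair (D1, D2) for one observed subject.

    D1 = 1 if age is under 18, else 0.
    D2 = 1 if age is in the range 18 to 64, else 0.
    Age 65 or above is the omitted category, coded (0, 0).
    Assumes age is given in whole years (an assumption of this sketch).
    """
    d1 = 1 if age < 18 else 0
    d2 = 1 if 18 <= age <= 64 else 0
    return d1, d2


# Example ages chosen for illustration only.
for age in (10, 40, 70):
    print(age, age_dummies(age))
```

Because the three age categories are mutually exclusive, no input can produce (1, 1); because they are exhaustive, every age maps to exactly one of the three remaining pairs.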

Such qualitative data can also be used for dependent variables. For example, a researcher might want to predict whether someone goes to college or not, using family income, a gender dummy variable, and so forth as explanatory variables. Here the variable to be explained is a dummy variable that equals 0 if the observed subject does not go to college and 1 if the subject does go to college. In such a situation, ordinary least squares (the basic regression technique) is widely seen as inadequate, in part because its fitted values are not constrained to lie between 0 and 1; instead probit regression or logistic regression is used. Further, sometimes there are three or more categories for the dependent variable: for example, no college, community college, and four-year college. In this case, the multinomial probit or multinomial logit technique is used.
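One way to see why a logistic model suits a 0/1 dependent variable is to compare a straight-line prediction with a logistic one. The sketch below is illustrative only: the function names and every coefficient value are hypothetical, not estimated from any data, and income is assumed to be measured in thousands.

```python
import math


def linear_prediction(income, female, b0=-0.2, b_income=0.02, b_female=0.05):
    """Straight-line prediction of the college dummy (hypothetical coefficients)."""
    return b0 + b_income * income + b_female * female


def logistic_probability(income, female, b0=-3.0, b_income=0.05, b_female=0.3):
    """Logistic-model probability of attending college (hypothetical coefficients)."""
    z = b0 + b_income * income + b_female * female
    # The logistic function maps any real z into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


# Made-up observations: (income in thousands, gender dummy).
for income, female in ((5, 0), (60, 1), (100, 1)):
    print(income, female,
          linear_prediction(income, female),
          logistic_probability(income, female))
```

With these made-up coefficients the linear prediction falls below 0 for the lowest income and above 1 for the highest, so it cannot be read as a probability, while the logistic value always stays strictly between 0 and 1.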
