Incomplete Data
Rough set theory is useful for rule induction from incomplete data sets. Using this approach we can distinguish between three types of missing attribute values: lost values (the values that were recorded but currently are unavailable), attribute-concept values (these missing attribute values may be replaced by any attribute value limited to the same concept), and "do not care" conditions (the original values were irrelevant). A concept (class) is a set of all objects classified (or diagnosed) the same way.
Two special data sets with missing attribute values were extensively studied: in the first case, all missing attribute values were lost (Stefanowski and Tsoukias, 2001), in the second case, all missing attribute values were "do not care" conditions (Kryszkiewicz, 1999).
In attribute-concept values interpretation of a missing attribute value, the missing attribute value may be replaced by any value of the attribute domain restricted to the concept to which the object with a missing attribute value belongs (Grzymala-Busse and Grzymala-Busse, 2007). For example, if for a patient the value of an attribute Temperature is missing, this patient is sick with flu, and all remaining patients sick with flu have values high or very-high for Temperature when using the interpretation of the missing attribute value as the attribute-concept value, we will replace the missing attribute value with high and very-high. Additionally, the characteristic relation, (see, e.g., Grzymala-Busse and Grzymala-Busse, 2007) enables to process data sets with all three kind of missing attribute values at the same time: lost, "do not care" conditions, and attribute-concept values.
Read more about this topic: Rough Set
Famous quotes containing the words incomplete and/or data:
“Someone once asked me why women dont gamble as much as men do, and I gave the common-sensical reply that we dont have as much money. That was a true but incomplete answer. In fact, womens total instinct for gambling is satisfied by marriage.”
—Gloria Steinem (b. 1934)
“To write it, it took three months; to conceive it three minutes; to collect the data in itall my life.”
—F. Scott Fitzgerald (18961940)