Improvements over the ID3 Algorithm
C4.5 made a number of improvements to ID3. Some of these are:
- Handling both continuous and discrete attributes - To handle a continuous attribute, C4.5 chooses a threshold and then splits the training examples into those whose attribute value is above the threshold and those whose value is less than or equal to it.
- Handling training data with missing attribute values - C4.5 allows attribute values to be marked as ? for missing. Missing attribute values are simply not used in gain and entropy calculations.
- Handling attributes with differing costs.
- Pruning trees after creation - Once the tree has been built, C4.5 goes back through it and attempts to remove branches that do not help by replacing them with leaf nodes.
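The threshold split and the treatment of missing values described in the list above can be illustrated with a minimal Python sketch. It is not the actual C4.5 implementation: the function names `entropy` and `best_threshold` and the `MISSING` constant are hypothetical, the candidate thresholds are simply the distinct observed values, and plain information gain is used as the selection criterion, which simplifies what real C4.5 implementations do.

```python
import math
from collections import Counter

MISSING = "?"  # marker for a missing attribute value, as described above


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the split `value <= threshold`
    that gives the highest information gain.

    Examples whose value is MISSING are left out of the calculation.
    Returns (None, 0.0) if no split improves on the unsplit data.
    """
    known = [(v, y) for v, y in zip(values, labels) if v != MISSING]
    base = entropy([y for _, y in known])
    best = (None, 0.0)
    # Assumption: candidates are the distinct observed values, except the
    # largest one, which would leave the "above threshold" side empty.
    candidates = sorted({v for v, _ in known})[:-1]
    for t in candidates:
        left = [y for v, y in known if v <= t]
        right = [y for v, y in known if v > t]
        # Weighted average entropy of the two subsets after the split
        remainder = (
            len(left) * entropy(left) + len(right) * entropy(right)
        ) / len(known)
        gain = base - remainder
        if gain > best[1]:
            best = (t, gain)
    return best
```

For example, calling `best_threshold([1, 2, "?", 3, 4], ["a", "a", "b", "b", "b"])` ignores the third example and evaluates the thresholds 1, 2 and 3 on the remaining four.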