C4.5 Algorithm - Improvements From ID3 Algorithm

C4.5 made a number of improvements to ID3. Some of these are:

  • Handling both continuous and discrete attributes - To handle a continuous attribute, C4.5 chooses a threshold and splits the training examples into those whose attribute value is above the threshold and those whose value is less than or equal to it.
  • Handling training data with missing attribute values - C4.5 allows attribute values to be marked as ? when they are missing. Missing values are simply left out of the gain and entropy calculations.
  • Handling attributes with differing costs.
  • Pruning trees after creation - Once the tree has been built, C4.5 goes back through it and attempts to remove branches that do not help classification by replacing them with leaf nodes.
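
The first two improvements can be illustrated with a short Python sketch. It is a simplified illustration, not the C4.5 implementation: the function names entropy and best_threshold are hypothetical, None stands in for the ? marker, candidate thresholds are taken as midpoints between adjacent distinct values, and plain information gain is used as the score. All of these are assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def best_threshold(values, labels):
    """Return (threshold, gain) for one continuous attribute.

    Hypothetical helper for illustration only. Examples whose value is
    None (standing in for '?') are left out of the entropy and gain
    calculations.
    """
    # Keep only examples with a known value, sorted by that value.
    known = sorted(
        ((v, y) for v, y in zip(values, labels) if v is not None),
        key=lambda pair: pair[0],
    )
    known_labels = [y for _, y in known]
    n = len(known_labels)
    base = entropy(known_labels)

    best = (None, 0.0)
    for i in range(n - 1):
        lo, hi = known[i][0], known[i + 1][0]
        if lo == hi:
            continue  # no boundary between equal values
        threshold = (lo + hi) / 2  # assumption: midpoint candidates
        left = known_labels[: i + 1]   # value <= threshold
        right = known_labels[i + 1:]   # value > threshold
        remainder = (len(left) / n) * entropy(left) \
            + (len(right) / n) * entropy(right)
        gain = base - remainder
        if gain > best[1]:
            best = (threshold, gain)
    return best


values = [1.0, 2.0, None, 3.0, 4.0]
labels = ["no", "no", "yes", "yes", "yes"]
print(best_threshold(values, labels))  # (2.5, 1.0)
```

In this example the third record has a missing value and is ignored, so the four remaining records split cleanly at 2.5 into the "less than or equal to" group and the "above" group.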
