Decision Tree Learning - Types

Decision trees used in data mining are of two main types:

  • Classification tree analysis is when the predicted outcome is the class to which the data belongs.
  • Regression tree analysis is when the predicted outcome can be considered a real number (e.g. the price of a house, or a patient’s length of stay in a hospital).

The term Classification And Regression Tree (CART) analysis is an umbrella term used to refer to both of the above procedures, first introduced by Breiman et al. in 1984. Trees used for regression and trees used for classification have some similarities, but also some differences, such as the procedure used to determine where to split; a minimal example of fitting each kind of tree is sketched below.
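
For concreteness, here is a minimal sketch of fitting both kinds of trees with scikit-learn (one common library, assumed to be available); the toy feature values, labels, and prices are invented purely for illustration.

    from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

    # Classification tree: the predicted outcome is a discrete class label.
    X_cls = [[25, 0], [40, 1], [35, 0], [50, 1]]   # e.g. [age, owns_home]
    y_cls = ["no", "yes", "no", "yes"]             # class labels
    clf = DecisionTreeClassifier(max_depth=3).fit(X_cls, y_cls)
    print(clf.predict([[30, 1]]))                  # -> a class label

    # Regression tree: the predicted outcome is a real number (e.g. a house price).
    X_reg = [[1200], [1500], [1800], [2400]]       # e.g. floor area
    y_reg = [150000, 185000, 210000, 300000]       # prices
    reg = DecisionTreeRegressor(max_depth=3).fit(X_reg, y_reg)
    print(reg.predict([[2000]]))                   # -> a real-valued prediction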

Some techniques, often called ensemble methods, construct more than one decision tree (a minimal comparison of several of them is sketched after this list):

  • Bagging decision trees, an early ensemble method, builds multiple decision trees by repeatedly resampling the training data with replacement and letting the trees vote for a consensus prediction.
  • A random forest classifier uses a number of decision trees, each grown on a bootstrap sample with a random subset of features considered at each split, in order to improve the classification rate.
  • Boosted trees build the ensemble incrementally, training each new tree to correct the errors of the previous ones; they can be used for regression-type and classification-type problems.
  • Rotation forest, in which every decision tree is trained by first applying principal component analysis (PCA) to a random subset of the input features.
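
As a rough comparison of several of these methods, the sketch below uses scikit-learn's ensemble estimators on a synthetic dataset, assuming that library is available; rotation forest is omitted because scikit-learn provides no implementation of it, and the scores are purely illustrative.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import (BaggingClassifier, RandomForestClassifier,
                                  GradientBoostingClassifier)
    from sklearn.model_selection import cross_val_score

    # Synthetic data; real results depend entirely on the problem at hand.
    X, y = make_classification(n_samples=500, n_features=20, random_state=0)

    models = {
        "bagged trees":  BaggingClassifier(n_estimators=50, random_state=0),   # base estimator defaults to a decision tree
        "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
        "boosted trees": GradientBoostingClassifier(n_estimators=100, random_state=0),
    }
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name}: mean cross-validated accuracy {scores.mean():.3f}")

In practice the relative performance of these ensembles depends heavily on the data and on hyperparameters such as tree depth and the number of estimators.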

Decision tree learning is the construction of a decision tree from class-labeled training tuples. A decision tree is a flow-chart-like structure in which each internal (non-leaf) node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf (or terminal) node holds a class label. The topmost node in the tree is the root node.
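
One hypothetical way to represent that structure in code is sketched below: a node carries either an attribute test (internal node) or a class label (leaf), and prediction is a walk from the root to a leaf. The attribute indices, thresholds, and labels are made up for illustration.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Node:
        feature: Optional[int] = None      # index of the attribute tested (internal nodes)
        threshold: Optional[float] = None  # split point for that test
        left: Optional["Node"] = None      # branch taken when x[feature] <= threshold
        right: Optional["Node"] = None     # branch taken when x[feature] > threshold
        label: Optional[str] = None        # class label (leaf nodes only)

    def predict(node: Node, x) -> str:
        """Walk from the root node down to a leaf and return its class label."""
        while node.label is None:          # internal nodes have no label
            node = node.left if x[node.feature] <= node.threshold else node.right
        return node.label

    # Root tests attribute 0 against 30; each branch ends in a leaf.
    root = Node(feature=0, threshold=30.0,
                left=Node(label="no"), right=Node(label="yes"))
    print(predict(root, [42.0]))           # -> 'yes', since 42 > 30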

There are many specific decision-tree algorithms. Notable ones include:

  • ID3 (Iterative Dichotomiser 3)
  • C4.5 algorithm, successor of ID3
  • CART (Classification And Regression Tree)
  • Chi-squared Automatic Interaction Detector (CHAID): performs multi-level splits when computing classification trees.
  • MARS (Multivariate Adaptive Regression Splines): extends decision trees to better handle numerical data

ID3 and CART were invented independently of one another at around the same time (between 1970 and 1980), yet they follow a similar approach for learning a decision tree from the training tuples; the greedy, impurity-based split selection they share is sketched below.
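
As a rough sketch of that shared greedy approach, the functions below score a candidate split by information gain (the entropy-based criterion associated with ID3) and by Gini impurity (the criterion commonly used by CART); the label lists are invented for illustration.

    from collections import Counter
    from math import log2

    def entropy(labels):
        """Shannon entropy of a list of class labels."""
        n = len(labels)
        return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

    def gini(labels):
        """Gini impurity of a list of class labels."""
        n = len(labels)
        return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

    def information_gain(parent, children):
        """Entropy of the parent minus the weighted entropy of its children."""
        n = len(parent)
        return entropy(parent) - sum(len(c) / n * entropy(c) for c in children)

    parent = ["yes"] * 5 + ["no"] * 5
    split = [["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4]    # one candidate split
    print(information_gain(parent, split))                  # higher is better (ID3)
    print([gini(c) for c in split])                         # lower is better (CART)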
