ID3 algorithm

ID3 (Iterative Dichotomiser 3) is an algorithm used to generate a decision tree.

The algorithm is based on Occam's razor: it prefers smaller decision trees (simpler theories) over larger ones. However, it does not always produce the smallest tree, and is therefore a heuristic. Occam's razor is formalized using the concept of information entropy:


 * $$ I_{E}(i) = - \sum^{m}_{j=1} f (i,j) \log f (i, j) $$

The ID3 algorithm can be summarized as follows:


 * 1) Take all unused attributes and count their entropy concerning test samples
 * 2) Choose attribute for which entropy is minimum
 * 3) Make node containing that attribute

An explanation of the implementation of ID3 can be found at C4.5 algorithm, which is an extended version of ID3.

Algorithm
The actual algorithm is as follows:

ID3 (Examples, Target_Attribute, Attributes)
 * Create a root node for the tree
 * If all examples are positive, Return the single-node tree Root, with label = +.
 * If all examples are negative, Return the single-node tree Root, with label = -.
 * If number of predicting attributes is empty, then Return the single node tree Root, with label = most common       value of the target attribute in the examples.
 * Otherwise Begin
 * A = The Attribute that best classifies examples.
 * Decision Tree attribute for Root = A.
 * For each possible value, $$v_i$$, of A,
 * Add a new tree branch below Root, corresponding to the test A = $$v_i$$.
 * Let Examples($$v_i$$), be the subset of examples that have the value $$v_i$$ for A
 * If Examples($$v_i$$) is empty
 * Then below this new branch add a leaf node with label = most common target value in the examples
 * Else below this new branch add the subtree ID3 (Examples($$v_i$$), Target_Attribute, Attributes – {A})
 * End
 * Return Root