Kendall tau rank correlation coefficient

Overview
The Kendall tau rank correlation coefficient (or simply the Kendall tau coefficient, Kendall's τ, or tau test(s)) measures the degree of correspondence between two rankings and is used to assess the significance of this correspondence. In other words, it measures the strength of association between two ordinal variables, such as the row and column variables of a cross tabulation.

It was developed by Maurice Kendall in 1938.

Another popular measure of rank correlation is Spearman's rank correlation coefficient.

Definition
The Kendall tau coefficient (τ) has the following properties:


 * If the agreement between the two rankings is perfect (i.e., the two rankings are the same) the coefficient has value 1.
 * If the disagreement between the two rankings is perfect (i.e., one ranking is the reverse of the other) the coefficient has value -1.
 * For all other arrangements the value lies between -1 and 1, and increasing values imply increasing agreement between the rankings. If the rankings are completely independent, the coefficient has value 0 on average.

The Kendall tau coefficient is defined as


 * $$\tau = \frac{2P}{\frac{1}{2}{n(n-1)}} - 1 = \frac{4P}{n(n-1)} - 1$$

where $$n$$ is the number of items and $$P$$ is the sum, over all items, of the number of items ranked after the given item by both rankings.

$$P$$ can also be interpreted as the number of concordant pairs. The denominator in the definition of $$\tau$$ can be interpreted as the total number of pairs of items. So, a high value of $$P$$ means that most pairs are concordant, indicating that the two rankings are consistent. Note that a tied pair is not regarded as concordant or discordant. If there is a large number of ties, the total number of pairs (in the denominator of the expression of $$\tau$$) should be adjusted accordingly.
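The definition above can be sketched in a few lines of Python. This is a minimal illustration that assumes two tie-free rankings given as equal-length lists of ranks; the function name `kendall_tau` and its argument names are chosen here for illustration and do not come from any library.

```python
from itertools import combinations


def kendall_tau(rank_x, rank_y):
    """Kendall tau for two tie-free rankings of the same n items.

    rank_x[i] and rank_y[i] are the ranks the two rankings give to item i.
    """
    n = len(rank_x)
    if n != len(rank_y) or n < 2:
        raise ValueError("need two rankings of the same length, n >= 2")

    # P: number of concordant pairs, i.e. pairs ordered the same way by both rankings.
    p = sum(
        1
        for i, j in combinations(range(n), 2)
        if (rank_x[i] - rank_x[j]) * (rank_y[i] - rank_y[j]) > 0
    )
    # tau = 4P / (n(n-1)) - 1, as in the definition above.
    return 4 * p / (n * (n - 1)) - 1


print(kendall_tau([1, 2, 3, 4], [1, 2, 3, 4]))  # identical rankings: 1.0
print(kendall_tau([1, 2, 3, 4], [4, 3, 2, 1]))  # reversed rankings: -1.0
```

The pairwise loop takes time proportional to the square of $$n$$, which is fine for small examples; it makes no adjustment for ties.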

Tau a, b and c

 * Tau a - Measures the strength of association in a cross tabulation when both variables are measured at the ordinal level. It makes no adjustment for ties.
 * Tau b - Measures the strength of association in a cross tabulation when both variables are measured at the ordinal level. It adjusts for ties and is most suitable for square tables. Values range from -1 (100% negative association, or perfect inversion) to +1 (100% positive association, or perfect agreement). A value of zero indicates the absence of association.
 * Tau c - Measures the strength of association in a cross tabulation when both variables are measured at the ordinal level. It adjusts for ties and is most suitable for rectangular tables. Values range from -1 (100% negative association, or perfect inversion) to +1 (100% positive association, or perfect agreement). A value of zero indicates the absence of association.
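To make the difference between tau a and tau b concrete, here is a rough sketch. The text above does not give the formulas, so the code assumes the commonly stated ones: tau a divides the difference between concordant and discordant pairs by the total number of pairs $$n(n-1)/2$$, while tau b divides it by the geometric mean of the numbers of pairs not tied in each variable. All function names are illustrative, and the degenerate case where every value of one variable is tied (a zero denominator in tau b) is not handled.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def concordant_discordant(x, y):
    """Count concordant (C) and discordant (D) pairs; tied pairs count as neither."""
    c = d = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            c += 1
        elif s < 0:
            d += 1
    return c, d


def tied_pairs(values):
    """Number of pairs tied within one variable: sum of t(t-1)/2 over tie groups."""
    return sum(t * (t - 1) // 2 for t in Counter(values).values())


def tau_a(x, y):
    """Assumed formula: (C - D) divided by the total number of pairs."""
    c, d = concordant_discordant(x, y)
    n0 = len(x) * (len(x) - 1) // 2
    return (c - d) / n0


def tau_b(x, y):
    """Assumed formula: (C - D) / sqrt((n0 - ties in x) * (n0 - ties in y))."""
    c, d = concordant_discordant(x, y)
    n0 = len(x) * (len(x) - 1) // 2
    return (c - d) / sqrt((n0 - tied_pairs(x)) * (n0 - tied_pairs(y)))


x = [1, 2, 2, 3]  # one tied pair in x
y = [1, 2, 3, 3]  # one tied pair in y
print(tau_a(x, y), tau_b(x, y))
```

In this small example the four untied pairs are all concordant, yet tau a stays below 1 because the tied pairs remain in its denominator; tau b removes them and comes out higher. Without ties the two functions return the same value.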

Example
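Suppose we rank a group of eight people by height and by weight, where person A is tallest and third-heaviest, and so on:

| Person | A | B | C | D | E | F | G | H |
| Rank by height | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| Rank by weight | 3 | 4 | 1 | 2 | 5 | 7 | 8 | 6 |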

We see that there is some correlation between the two rankings but the correlation is far from perfect. We can use the Kendall tau coefficient to objectively measure the degree of correspondence.

Notice in the Weight ranking above that the first entry, 3, has five higher ranks to the right of it; the contribution to P of this entry is 5. Moving to the second entry, 4, we see that there are four higher ranks to the right of it and the contribution to P is 4. Continuing this way, we find that


 * P = 5 + 4 + 5 + 4 + 3 + 1 + 0 + 0 = 22.

Thus, with $$n = 8$$, $$\tau = \frac{4 \cdot 22}{8 \cdot 7} - 1 = \frac{44}{28} - 1 \approx 0.57$$. This result indicates a strong agreement between the rankings, as expected.
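The arithmetic of this example can be checked with a short script. It assumes the weight ranks, listed in order of height rank, are 3, 4, 1, 2, 5, 7, 8, 6, which is the only ordering of the ranks 1 to 8 that produces the contributions listed above; the variable names are illustrative.

```python
# Weight ranks of the eight people, listed from tallest to shortest (assumed ordering).
weight = [3, 4, 1, 2, 5, 7, 8, 6]

# Contribution of each entry to P: how many higher ranks lie to its right.
contributions = [
    sum(1 for later in weight[i + 1:] if later > w)
    for i, w in enumerate(weight)
]
p = sum(contributions)
n = len(weight)
tau = 4 * p / (n * (n - 1)) - 1

print(contributions)  # [5, 4, 5, 4, 3, 1, 0, 0]
print(p)              # 22
print(round(tau, 2))  # 0.57
```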