F-test

An F-test is any statistical test in which the test statistic has an F-distribution if the null hypothesis is true. The name was coined by George W. Snedecor in honour of Sir Ronald A. Fisher, who initially developed the statistic as the variance ratio in the 1920s. Examples include:


 * The hypothesis that the means of multiple normally distributed populations, all having the same standard deviation, are equal. This is perhaps the best-known hypothesis tested by means of an F-test, and the simplest problem in the analysis of variance (ANOVA).


 * The hypothesis that two normally distributed populations have equal variances (equivalently, equal standard deviations).

Note that if it is equality of variances (or standard deviations) that is being tested, the F-test is extremely non-robust to non-normality. That is, even if the data display only modest departures from the normal distribution, the test is unreliable and should not be used.
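The two-sample variance test above amounts to taking the ratio of the two sample variances and comparing it with an F-distribution. A minimal sketch using only the Python standard library, with hypothetical sample data chosen for illustration:

```python
import statistics

# Hypothetical samples from two populations (illustration only).
sample_a = [2, 4, 6, 8, 10]
sample_b = [1, 2, 3, 4, 5]

# Unbiased sample variances (n - 1 in the denominator).
var_a = statistics.variance(sample_a)  # 10.0
var_b = statistics.variance(sample_b)  # 2.5

# By convention the larger variance goes in the numerator, so F >= 1 and
# is compared against the upper tail of the F-distribution with
# (n_a - 1, n_b - 1) degrees of freedom.
F = max(var_a, var_b) / min(var_a, var_b)  # 4.0 here
```

With (4, 4) degrees of freedom the 5% critical value is roughly 6.39, so this particular F would not reject equality of variances at that level; and, per the caveat above, the conclusion is only trustworthy if both populations are close to normal.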

In many cases, the F-test statistic can be calculated through a straightforward process. Two regression models are required, one of which constrains one or more of the regression coefficients according to the null hypothesis. The test statistic is then based on a modified ratio of the sum of squares of residuals of the two models as follows:

Consider two models, 1 and 2, where model 1 is nested within model 2. That is, model 1 has p1 parameters, and model 2 has p2 parameters, where p2 > p1. (Any constant parameter in the model is included when counting the parameters. For instance, the simple linear model y = mx + b has p = 2 under this convention.) If there are n data points to estimate parameters of both models from, then calculate F as


$$F=\frac{\left(\dfrac{\mathrm{RSS}_1 - \mathrm{RSS}_2}{p_2 - p_1}\right)}{\left(\dfrac{\mathrm{RSS}_2}{n - p_2}\right)}$$

where RSSi is the residual sum of squares of model i. If the regression model has been calculated with weights, then replace RSSi with χ², the weighted sum of squared residuals. Under the null hypothesis, F follows an F-distribution with (p2 − p1, n − p2) degrees of freedom; the probability that the decrease in χ² associated with the addition of the p2 − p1 parameters is due solely to chance is given by the tail probability of that F-distribution at the computed value. The null hypothesis, that the additional p2 − p1 parameters do not improve the fit, is rejected if the calculated F exceeds the critical value of the F-distribution for some desired rejection probability (e.g. 0.05).
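The procedure above can be sketched end to end for the smallest nontrivial case: a constant-only model (model 1, p1 = 1) nested within a simple linear model (model 2, p2 = 2), fitted by ordinary least squares. The data are hypothetical, chosen only to illustrate the computation:

```python
# Hypothetical data (illustration only).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
n = len(x)

# Model 1: y = b (p1 = 1 parameter, the constant).
# Least squares gives b = mean(y), so RSS1 is the total sum of squares.
mean_y = sum(y) / n
rss1 = sum((yi - mean_y) ** 2 for yi in y)

# Model 2: y = m*x + b (p2 = 2 parameters), fitted by least squares.
mean_x = sum(x) / n
m = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
b = mean_y - m * mean_x
rss2 = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))

# F statistic with (p2 - p1, n - p2) = (1, 3) degrees of freedom.
p1, p2 = 1, 2
F = ((rss1 - rss2) / (p2 - p1)) / (rss2 / (n - p2))
```

For these data the fitted line tracks y closely, so RSS2 is far smaller than RSS1 and F is very large, indicating that the extra slope parameter improves the fit far more than chance alone would.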

Tables of F-test critical values are included in most statistical texts.