Lilliefors test

In statistics, the Lilliefors test, named after Hubert Lilliefors, professor of statistics at George Washington University, is an adaptation of the Kolmogorov-Smirnov test. It is used to test the null hypothesis that data come from a normally distributed population, when the null hypothesis does not specify which normal distribution, i.e. does not specify the expected value and variance.

The test proceeds as follows:

1. First estimate the population mean and population variance based on the data.

2. Then find the maximum discrepancy between the empirical distribution function and the cumulative distribution function (CDF) of the normal distribution with the estimated mean and estimated variance. Just as in the Kolmogorov-Smirnov test, this will be the test statistic.

3. Finally, assess whether the maximum discrepancy is large enough to be statistically significant, thus requiring rejection of the null hypothesis. This is where the test becomes more complicated than the Kolmogorov-Smirnov test. Because the hypothesized CDF has been moved closer to the data by estimating its parameters from those same data, the maximum discrepancy tends to be smaller than it would have been if the null hypothesis had singled out just one normal distribution. The critical values of the Kolmogorov-Smirnov test are therefore too large, and what is needed instead is the "null distribution" of the test statistic, i.e. its probability distribution assuming the null hypothesis is true. This is the Lilliefors distribution. To date, tables for this distribution have been computed only by Monte Carlo methods.

The test is relatively weak, and a large amount of data is typically required to reject the normality hypothesis. The Jarque-Bera test, which is based on a combination of the estimates of skewness and kurtosis, is often more sensitive. Because it depends on these moment estimates, the Jarque-Bera test is highly sensitive to outliers, whereas the Lilliefors test is not.

There is an extensive literature on normality testing, but as a practical matter many experienced data analysts sidestep formal testing and assess the plausibility of a normal model with a graphical tool such as a Q-Q plot.
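
The three-step procedure described above can be illustrated with a minimal Python sketch. It is not a reference implementation: the function names `lilliefors_statistic` and `lilliefors_pvalue`, the use of the sample standard deviation with an n - 1 divisor, and the number of simulations are all assumptions made for illustration. The null distribution is approximated by Monte Carlo simulation rather than read from published tables.

```python
import numpy as np
from scipy.stats import norm


def lilliefors_statistic(x):
    """Maximum discrepancy between the empirical CDF and the fitted normal CDF."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    # Step 1: estimate the mean and standard deviation from the data
    # (assumption: n - 1 divisor for the standard deviation).
    mu = x.mean()
    sigma = x.std(ddof=1)
    # Step 2: normal CDF with the estimated parameters, at the sorted data.
    cdf = norm.cdf((x - mu) / sigma)
    # The empirical CDF jumps from (i - 1)/n to i/n at the i-th order
    # statistic, so the discrepancy is checked on both sides of each jump.
    d_plus = np.max(np.arange(1, n + 1) / n - cdf)
    d_minus = np.max(cdf - np.arange(0, n) / n)
    return max(d_plus, d_minus)


def lilliefors_pvalue(x, n_sim=2000, seed=0):
    """Step 3: approximate the null distribution by Monte Carlo simulation."""
    rng = np.random.default_rng(seed)
    n = len(x)
    d_obs = lilliefors_statistic(x)
    # The statistic does not change under shifts and rescalings of the data,
    # so simulating from the standard normal covers every normal distribution.
    sims = np.array(
        [lilliefors_statistic(rng.standard_normal(n)) for _ in range(n_sim)]
    )
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (1 + np.sum(sims >= d_obs)) / (n_sim + 1)
    return d_obs, p_value
```

Calling `lilliefors_pvalue(data)` on a one-dimensional sample returns the observed statistic together with a simulated p-value. The simulation is possible only because the statistic is unaffected by the unknown mean and variance, which is also why a single table of critical values per sample size suffices.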