Variable rules analysis

In linguistics, variable rules analysis is a set of statistical analysis methods commonly used in sociolinguistics and historical linguistics to describe patterns of variation between alternative forms in language use. It is also sometimes known as Varbrul analysis, after the name of a software package dedicated to carrying out the relevant statistical computations (Varbrul, from "variable rule"). The method goes back to a theoretical approach developed by the sociolinguist William Labov in the late 1960s and early 1970s, and its mathematical implementation was developed by Henrietta Cedergren and David Sankoff in 1974.

A variable rules analysis is designed to provide a quantitative model of a situation where speakers alternate between different forms that have the same meaning and stand in free variation, but where the probability of choosing one form or the other is conditioned by a variety of contextual factors or social characteristics. Such a situation, where variation is not entirely random but rule-governed, is also known as "structured variation". On the basis of observed token counts, a variable rules analysis computes a multivariate statistical model in which each determining factor is assigned a numerical factor weight describing how it influences the probability of choosing either form. This is done by means of stepwise logistic regression, using a maximum likelihood algorithm.
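The idea of estimating factor weights from token counts can be illustrated with a minimal Python sketch. It is not the Varbrul program itself: the token counts, the factor names (`pre_consonant`, `casual`, and so on) and the function names are all invented for illustration, the fit uses plain gradient ascent rather than any particular package's algorithm, and the stepwise selection of factor groups is left out. The sketch assumes the convention usually associated with Varbrul-style output, in which weights lie between 0 and 1, values above 0.5 favour the variant, and values below 0.5 disfavour it.

```python
import math

# Hypothetical token counts, invented for illustration only:
# (following context, speech style, tokens of variant A, total tokens)
CELLS = [
    ("pre_consonant", "casual", 80, 100),
    ("pre_consonant", "formal", 50, 100),
    ("pre_vowel", "casual", 50, 100),
    ("pre_vowel", "formal", 20, 100),
]


def sigmoid(z):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit(cells, steps=5000, rate=0.5):
    """Maximum-likelihood logistic regression by gradient ascent.

    Each two-level factor group is sum-coded (+1 / -1), so the two
    factors in a group receive equal and opposite log-odds effects.
    """
    b0 = b_ctx = b_sty = 0.0
    total = sum(n for *_, n in cells)
    for _ in range(steps):
        g0 = g_ctx = g_sty = 0.0
        for ctx, sty, k, n in cells:
            x_ctx = 1.0 if ctx == "pre_consonant" else -1.0
            x_sty = 1.0 if sty == "casual" else -1.0
            p = sigmoid(b0 + b_ctx * x_ctx + b_sty * x_sty)
            resid = k - n * p  # observed minus expected tokens of variant A
            g0 += resid
            g_ctx += resid * x_ctx
            g_sty += resid * x_sty
        # Step along the gradient of the average binomial log-likelihood.
        b0 += rate * g0 / total
        b_ctx += rate * g_ctx / total
        b_sty += rate * g_sty / total
    return b0, b_ctx, b_sty


def factor_weights(cells):
    """Convert fitted log-odds effects to weights on the 0-1 scale."""
    b0, b_ctx, b_sty = fit(cells)
    return {
        "input": sigmoid(b0),  # overall tendency toward variant A
        "pre_consonant": sigmoid(b_ctx),
        "pre_vowel": sigmoid(-b_ctx),
        "casual": sigmoid(b_sty),
        "formal": sigmoid(-b_sty),
    }


if __name__ == "__main__":
    for name, weight in factor_weights(CELLS).items():
        print(f"{name:14s} {weight:.3f}")
```

With these toy counts the input probability comes out at about 0.5, the weights for `pre_consonant` and `casual` at about 0.667, and those for `pre_vowel` and `formal` at about 0.333, meaning that the first two contexts favour variant A and the latter two disfavour it.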

Although the computations required for a variable rules analysis can be carried out with mainstream general-purpose statistics packages such as SPSS, they are more often done with Varbrul, a specialised program dedicated to the needs of linguists. It was originally written by David Sankoff and currently exists in freeware implementations for Mac OS and Microsoft Windows, under the title of Goldvarb. There is also a version known as R-Varb implemented in the statistical language R and therefore available on most platforms.

Variable rules approaches are commonly employed for the analysis of corpus data in sociolinguistic research, especially in studies that investigate how linguistic change over time is reflected in structured variation patterns within a speech community.