Interaction (statistics)

In statistics, an interaction is a term in a statistical model added when the effect of two or more variables is not simply additive. Such a term reflects that the effect of one variable depends on the values of one or more other variables.

Thus, for a response Y and two variables x1 and x2, an additive model would be:


 * $$Y = ax_1 + bx_2 + \mbox{error}$$

In contrast to this,


 * $$Y = ax_1 + bx_2 + c(x_1\times x_2) + \mbox{error},$$

is an example of a model with an interaction between variables x1 and x2 ("error" refers to the random variable whose value is the amount by which Y differs from the expected value of Y).
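As a minimal sketch of the difference between the two models, the following fits both by ordinary least squares to a small synthetic data set. The data, the coefficient values (a = 2, b = -1, c = 0.5) and the helper name `fit_least_squares` are assumptions made for illustration, and the error term is set to zero so that the result is easy to check.

```python
import numpy as np

def fit_least_squares(columns, y):
    """Fit y = X @ coef with no intercept; return coefficients and residuals."""
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

# Hypothetical, noise-free data generated with a = 2, b = -1, c = 0.5.
x1 = np.array([0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0])
x2 = np.array([0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0])
y = 2.0 * x1 - 1.0 * x2 + 0.5 * x1 * x2

# Additive model: Y = a*x1 + b*x2 + error
additive_coef, additive_resid = fit_least_squares([x1, x2], y)

# Interaction model: Y = a*x1 + b*x2 + c*(x1*x2) + error
interaction_coef, interaction_resid = fit_least_squares([x1, x2, x1 * x2], y)

print("additive coefficients:   ", additive_coef)
print("additive residual SS:    ", np.sum(additive_resid ** 2))
print("interaction coefficients:", interaction_coef)
print("interaction residual SS: ", np.sum(interaction_resid ** 2))
```

Under these assumptions the interaction model recovers the three generating coefficients, while the additive model leaves systematic residuals because no choice of a and b can reproduce the product term.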

Very often the interacting variables are categorical variables rather than real numbers. For example, members of a population may be classified by religion and by occupation. If one wishes to predict a person's height based only on the person's religion and occupation, a simple additive model, i.e., a model without interaction, would add to an overall average height an adjustment for a particular religion and another for a particular occupation. A model with interaction, unlike an additive model, could add a further adjustment for the "interaction" between that religion and that occupation. This example may cause one to suspect that the word interaction is something of a misnomer.
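For two categorical variables, the adjustments described above can be sketched by decomposing a table of cell means into an overall average, one adjustment per level of each factor, and a leftover interaction adjustment per cell. The numbers below are invented for illustration (two levels of one factor, three of the other) and are not real measurements.

```python
import numpy as np

# Hypothetical mean heights in cm: rows are the 2 levels of the first factor,
# columns are the 3 levels of the second factor. Invented values.
cell_means = np.array([[170.0, 172.0, 174.0],
                       [171.0, 173.0, 178.0]])

grand_mean = cell_means.mean()                      # overall average
row_effect = cell_means.mean(axis=1) - grand_mean   # adjustment per row level
col_effect = cell_means.mean(axis=0) - grand_mean   # adjustment per column level

# What a purely additive model would predict for each cell.
additive_fit = grand_mean + row_effect[:, None] + col_effect[None, :]

# Further adjustment needed in each cell: the interaction.
interaction = cell_means - additive_fit

print("grand mean:", grand_mean)
print("row effects:", row_effect)
print("column effects:", col_effect)
print("interaction adjustments:\n", interaction)
```

If every interaction adjustment were zero, the additive model would describe the table exactly. With this decomposition the interaction adjustments sum to zero across each row and each column.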

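The consequence of an interaction is that the effect of one variable depends on the value of another. This has implications for the design of experiments: when interactions are present, varying one factor at a time can be misleading, because the effect measured for one factor holds only at the levels at which the other factors were fixed.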
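The following sketch illustrates why one-factor-at-a-time experiments can mislead. It assumes a noise-free response of the form a·x1 + b·x2 + c·x1·x2 with invented coefficients (a = 1, b = 1, c = 5) and two levels, 0 and 1, for each factor. The function name `response` is hypothetical.

```python
def response(x1, x2, a=1.0, b=1.0, c=5.0):
    """Hypothetical noise-free response with an interaction term."""
    return a * x1 + b * x2 + c * x1 * x2

# One factor at a time: change each factor from the baseline (0, 0) separately.
baseline = response(0, 0)
effect_x1_alone = response(1, 0) - baseline
effect_x2_alone = response(0, 1) - baseline

# Adding the two separate effects predicts the response at (1, 1).
ofat_prediction = baseline + effect_x1_alone + effect_x2_alone

# A factorial design also runs the (1, 1) combination.
actual = response(1, 1)

# Effect of x1 measured at the high level of x2, and the difference of
# differences, which isolates the interaction coefficient c.
effect_x1_at_high_x2 = response(1, 1) - response(0, 1)
interaction_contrast = effect_x1_at_high_x2 - effect_x1_alone

print("one-factor-at-a-time prediction:", ofat_prediction)
print("actual response at (1, 1):      ", actual)
print("interaction contrast:           ", interaction_contrast)
```

Under these assumptions the one-factor-at-a-time prediction falls short of the actual response by exactly c, and only the run with both factors changed together reveals it.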

Real-world examples of systems that manifest interactions include:


 * Interaction between adding sugar to coffee and stirring the coffee. Neither action alone has much effect on sweetness, but the combination of the two does.


 * Interaction between adding carbon to steel and quenching. Neither of the two alone has much effect on strength, but the combination of the two has a dramatic effect.

Genichi Taguchi contended that interactions could be eliminated from a system by an appropriate choice of response variable and transformation. However, George Box and others have argued that this is not the case in general.