Interaction analysis (statistics)

In statistics, an interaction is a term in a statistical model in which the effect of two or more variables is not simply additive.

Thus, for a response y and two variables x1 and x2, an additive model would be:


 * $$y = ax_1 + bx_2 + \mbox{error},$$

- while,


 * $$y = ax_1 + bx_2 + c(x_1\times x_2) + \mbox{error},$$

- is an example of a model with an interaction between variables x1 and x2. The word "error" is not to be construed literally; it refers to a random variable by which y differs from the expected value of y. See errors and residuals in statistics, and note that the two are easily confused although they are different: an error is the deviation of y from its true expected value, whereas a residual is the deviation of y from its fitted value.
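The two models above can be compared numerically. The following is a minimal sketch, assuming synthetic data and NumPy's ordinary least-squares solver; the coefficient values, sample size, and noise level are invented for illustration and do not come from any real data set.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)

# Assumed "true" coefficients, chosen only for illustration.
a, b, c = 2.0, 3.0, 1.5
y = a * x1 + b * x2 + c * x1 * x2 + rng.normal(scale=0.1, size=n)


def fit(columns, y):
    """Least-squares fit; returns the coefficients and the residual sum of squares."""
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    return coef, rss


# Additive model: y = a*x1 + b*x2 + error
coef_add, rss_add = fit([x1, x2], y)

# Interaction model: y = a*x1 + b*x2 + c*(x1*x2) + error
coef_int, rss_int = fit([x1, x2, x1 * x2], y)

print("additive coefficients:   ", coef_add, "RSS:", rss_add)
print("interaction coefficients:", coef_int, "RSS:", rss_int)
```

Because the data were generated with a non-zero product term, the interaction model should recover a coefficient close to the assumed c and leave a much smaller residual sum of squares than the additive model.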

Very often the interacting variables are categorical variables rather than real numbers. For example, members of a population may be classified by religion and by occupation. If one wishes to predict a person's height based only on the person's religion and occupation, a simple additive model, i.e., a model without interaction, would add to an overall average height an adjustment for a particular religion and another for a particular occupation. A model with interaction, unlike an additive model, could add a further adjustment for the "interaction" between that religion and that occupation. This example may cause one to suspect that the word "interaction" is something of a misnomer, since the term denotes only a departure from additivity and not any literal interplay between the two variables.
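For two categorical variables, the adjustments described above can be read off a table of cell means. The sketch below assumes a hypothetical 2-by-3 table (rows for the levels of one factor, columns for the levels of the other); the numbers are invented and carry no empirical meaning.

```python
import numpy as np

# Hypothetical cell means: rows = levels of factor A, columns = levels of factor B.
cell_means = np.array([[170.0, 172.0, 171.0],
                       [174.0, 180.0, 175.0]])

grand_mean = cell_means.mean()
row_effect = cell_means.mean(axis=1) - grand_mean   # adjustment per level of A
col_effect = cell_means.mean(axis=0) - grand_mean   # adjustment per level of B

# What a purely additive model would predict for every cell.
additive_fit = grand_mean + row_effect[:, None] + col_effect[None, :]

# Whatever the additive model cannot explain is the interaction adjustment.
interaction = cell_means - additive_fit

print("additive fit:\n", additive_fit)
print("interaction adjustments:\n", interaction)
```

If every entry of `interaction` were zero, the additive model would describe the table exactly. With this decomposition the interaction adjustments sum to zero along every row and every column.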

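A consequence of an interaction is that the effect of one variable depends on the value of another. This has implications for the design of experiments: when interactions are present, varying one factor at a time can be misleading, because the effect measured for a factor holds only at the particular levels at which the other factors were fixed.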
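The problem with varying one factor at a time can be illustrated with a toy calculation. The sketch below assumes a hypothetical, noise-free response with two factors coded 0 (low) and 1 (high); the function and its coefficients are invented for illustration.

```python
def response(x1, x2):
    """Hypothetical noise-free response with a strong x1*x2 interaction."""
    return 1 * x1 + 0 * x2 + 8 * x1 * x2


baseline = response(0, 0)

# One factor at a time: change each factor alone, starting from the baseline.
effect_x1_alone = response(1, 0) - baseline   # effect of x1 while x2 stays low
effect_x2_alone = response(0, 1) - baseline   # effect of x2 while x1 stays low

# Adding the two separate effects is what an additive reading would predict.
ofat_prediction = baseline + effect_x1_alone + effect_x2_alone

# A full 2x2 factorial design also runs the (high, high) combination.
actual_both = response(1, 1)

# The effect of x1 depends on the level of x2.
effect_x1_when_x2_low = response(1, 0) - response(0, 0)
effect_x1_when_x2_high = response(1, 1) - response(0, 1)

print("predicted from one-factor-at-a-time runs:", ofat_prediction)
print("observed with both factors high:", actual_both)
print("effect of x1 at low x2:", effect_x1_when_x2_low)
print("effect of x1 at high x2:", effect_x1_when_x2_high)
```

The one-factor-at-a-time runs never observe the combination in which both factors are high, so they cannot detect the product term; a factorial design that includes that run can.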

Real-world examples of systems that manifest interactions include:


 * Interaction between adding sugar to coffee and stirring the coffee. Neither of the two variables alone has much effect on sweetness, but the combination of the two does.


 * Interaction between adding carbon to steel and quenching. Neither of the two alone has much effect on strength, but the combination of the two has a dramatic effect.

Genichi Taguchi contended that interactions could be eliminated from a system by an appropriate choice of response variable and transformation. However, George Box and others have argued that this is not the case in general.