Free variables and bound variables


 * For the variables called "dummies" in statistics, see indicator variable.

In computer programming, a free variable is a variable referred to in a function that is not a local variable or an argument of that function.
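The programming sense can be illustrated with a minimal Python sketch. The names `make_adder`, `add` and `add_five` are hypothetical and chosen only for this illustration. Inside `add`, the name `n` is neither a local variable nor an argument, so it is a free variable of `add`; CPython records such names in the function's `__code__.co_freevars` attribute.

```python
def make_adder(n):
    """Return a function that adds n to its argument."""
    def add(x):
        # x is an argument of add; n is neither local to add
        # nor an argument of it, so n is free in add.
        return x + n
    return add


add_five = make_adder(5)
result = add_five(10)

# Names that the inner function refers to but does not itself define.
free_names = add_five.__code__.co_freevars
```

Here `result` is 15 and `free_names` contains only `n`.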

In mathematics, and in other disciplines involving formal languages, including mathematical logic and computer science, a free variable is a notation that specifies the places in an expression where substitution may take place. The idea is related to a placeholder (a symbol that will later be replaced by some literal string), or a wildcard character that stands for an unspecified symbol.

The variable x becomes a bound variable, for example, when we write


 * 'For all x, (x + 1)² = x² + 2x + 1.'

or


 * 'There exists x such that x² = 2.'

In either of these propositions, it does not matter logically whether we use x or some other letter. However, it could be confusing to use the same letter again elsewhere in some compound proposition. That is, free variables become bound, and then in a sense retire from further work supporting the formation of formulae.

Examples
Before stating a precise definition of free variable and bound variable (or dummy variable), we present some examples that may make these two concepts clearer than the definition would. (Many statisticians use the term dummy variable to mean an indicator variable or some variant thereof; the name is not apt for that purpose, but it conveys well the intuition behind the concept defined here.)

In the expression


 * $$\sum_{k=1}^{10} f(k,n),$$

n is a free variable and k is a bound variable (or dummy variable); consequently the value of this expression depends on the value of n, but there is nothing called k on which it could depend.
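The same point can be checked numerically. The following sketch assumes a hypothetical two-argument function `f` (here `f(k, n) = k * n`, chosen only for illustration) and computes the sum with two different names for the bound index. The result depends on `n` but not on the name of the index.

```python
def total(f, n):
    # k is bound by the generator expression; n is a parameter
    # that plays the role of the free variable in the sum.
    return sum(f(k, n) for k in range(1, 11))


def total_renamed(f, n):
    # Same sum with the bound index renamed from k to j.
    return sum(f(j, n) for j in range(1, 11))


# Hypothetical f, used only to make the example concrete.
f = lambda k, n: k * n

value = total(f, 2)
value_renamed = total_renamed(f, 2)
```

Both `value` and `value_renamed` equal 110, while changing the argument `n` changes the result.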

In the expression


 * $$\int_0^\infty x^{y-1} e^{-x}\,dx,$$

y is a free variable and x is a bound variable; consequently the value of this expression depends on the value of y, but there is nothing called x on which it could depend.

In the expression


 * $$\lim_{h\rightarrow 0}\frac{f(x+h)-f(x)}{h},$$

x is a free variable and h is a bound variable; consequently the value of this expression depends on the value of x, but there is nothing called h on which it could depend.

In the expression


 * $$\forall x\ \exists y\ \varphi(x,y,z),$$

z is a free variable and x and y are bound variables; consequently the logical value of this expression depends on the value of z, but there is nothing called x or y on which it could depend.

Variable-binding operators
The following


 * $$\sum_{x\in S}\qquad\qquad \int_0^\infty\cdots\,dx\qquad\qquad \lim_{x\to 0}\qquad\qquad \forall x \qquad\qquad \exists x$$

are variable-binding operators. Each of them binds the variable x.

Formal explanation
Variable-binding mechanisms occur in different contexts in mathematics, logic and computer science, but in all cases they are purely syntactic properties of expressions and of the variables in them. For this section we can summarize syntax by identifying an expression with a tree whose leaf nodes are variables, constants, function constants or predicate constants and whose non-leaf nodes are logical operators.

Variable-binding operators are logical operators that occur in almost every formal language. Indeed, languages which do not have them are either extremely inexpressive or extremely difficult to use. A binding operator Q takes two arguments: a variable v and an expression P, and when applied to its arguments produces a new expression Q(v, P). The meaning of binding operators is supplied by the semantics of the language and does not concern us here.

As an example, consider the expression

 * $$\forall x\, (\exists y\, A(x) \vee B(z)) $$

Variable binding relates three things: a variable v, a location a for that variable in an expression and a non-leaf node n of the form Q(v, P). Note: we define a location in an expression as a leaf node in the syntax tree. Variable binding occurs when that location is below the node n. In the expression above, the occurrence of x in A(x) lies below the node for ∀x and is therefore bound, while z lies below no node that binds z and is therefore free.
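The tree view of binding can be sketched in Python as follows. The class names `Var`, `Op` and `Bind` and the function `free_vars` are hypothetical, introduced only for this illustration, and the formula is parsed under the assumption that ∃y scopes over A(x) alone. A variable is free in an expression when some leaf carrying it lies below no binding node Q(v, P) for that variable.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    """Leaf node: an occurrence of a variable."""
    name: str


@dataclass(frozen=True)
class Op:
    """Non-binding node: an operator or predicate applied to arguments."""
    symbol: str
    args: tuple


@dataclass(frozen=True)
class Bind:
    """Binding node Q(v, P): binder Q, variable v, body P."""
    binder: str
    var: str
    body: "Expr"


Expr = Union[Var, Op, Bind]


def free_vars(e):
    """Return the set of variables with a free occurrence in e."""
    if isinstance(e, Var):
        return frozenset({e.name})
    if isinstance(e, Op):
        result = frozenset()
        for arg in e.args:
            result |= free_vars(arg)
        return result
    if isinstance(e, Bind):
        # Occurrences of e.var below this node are bound by it.
        return free_vars(e.body) - {e.var}
    raise TypeError(f"not an expression: {e!r}")


# forall x ((exists y A(x)) or B(z)), under the scoping assumption above.
formula = Bind(
    "forall", "x",
    Op("or", (
        Bind("exists", "y", Op("A", (Var("x"),))),
        Op("B", (Var("z"),)),
    )),
)

formula_free = free_vars(formula)
# Dropping the outer quantifier leaves x free as well.
body_free = free_vars(formula.body)
```

With these definitions `formula_free` contains only `z`, and `body_free` contains `x` and `z`. An expression whose set of free variables is empty is closed.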

To give an example from mathematics, consider an expression which defines a function


 * $$ (x_1, \ldots, x_n) \mapsto \operatorname{t}$$

where t is an expression. t may contain some, all or none of the x₁, ..., xₙ and it may contain other variables. In this case we say that the function definition binds the variables x₁, ..., xₙ.

In the lambda calculus, x is a bound variable in the term M = λx. T, and a free variable of T. We say x is bound in M and free in T. If T contains a subterm λx. U then x is rebound in this term. This nested, inner binding of x is said to "shadow" the outer binding. Occurrences of x in U are free occurrences of the new x.
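Shadowing can be observed with Python's `lambda` expressions, which are used here only as a rough analogue of lambda-calculus terms. The names `shadowing` and `renamed` are hypothetical. In the first function the inner `lambda x` rebinds x, so occurrences of x in its body refer to the inner binding; renaming the inner variable to u gives a function with the same behaviour.

```python
# The outer lambda binds x; the inner lambda rebinds x and
# shadows the outer binding within its own body (x * 10).
shadowing = lambda x: (x, (lambda x: x * 10)(x + 1))

# Same function with the inner bound variable renamed to u.
renamed = lambda x: (x, (lambda u: u * 10)(x + 1))

pair = shadowing(2)
pair_renamed = renamed(2)
```

Both calls yield the pair (2, 30): the first component uses the outer x, and the second is computed by the inner function from its own argument.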

Variables bound at the top level of a program are technically free variables within the terms to which they are bound but are often treated specially because they can be compiled as fixed addresses. Similarly, an identifier bound to a recursive function is also technically a free variable within its own body but is treated specially.

A closed term is one containing no free variables.