Indicator function

In mathematics, an indicator function or a characteristic function is a function defined on a set $$X$$ that indicates membership of an element in a subset $$A$$ of $$X$$.

The indicator function of a subset $$A$$ of a set $$X$$ is a function


 * $$\mathbf{1}_A : X \to \lbrace 0,1 \rbrace \,$$

defined as


 * $$\mathbf{1}_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases}$$

The Iverson bracket allows the notation $$[x \in A]$$.
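As an illustration of the definition, the following minimal Python sketch builds $$\mathbf{1}_A$$ for a finite set. The helper name `indicator` and the example sets are hypothetical choices made here, not part of any standard library. The conversion `int(x in members)` plays the role of the Iverson bracket $$[x \in A]$$.

```python
from typing import Callable, Hashable, Iterable


def indicator(subset: Iterable[Hashable]) -> Callable[[Hashable], int]:
    """Return the indicator function 1_A of the given subset A."""
    members = frozenset(subset)

    def one_a(x: Hashable) -> int:
        # int(True) == 1 and int(False) == 0, mirroring the Iverson bracket [x in A].
        return int(x in members)

    return one_a


# Hypothetical example: X = {0, ..., 9}, A = the even elements of X.
X = range(10)
A = {x for x in X if x % 2 == 0}
one_A = indicator(A)

values = [one_A(x) for x in X]
print(values)
```

Here `values` lists $$\mathbf{1}_A(x)$$ for each $$x$$ in $$X$$ in increasing order, so it alternates between 1 and 0.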

The indicator function of $$A$$ is sometimes denoted
 * $$\chi_A(x)$$ or $$\mathbf{I}_A(x)$$ or even $$A(x).$$

(The Greek letter χ is used because it is the initial letter of the Greek etymon of the word characteristic.)

Remark on notation and terminology

 * The notation $$\mathbf{1}_A $$ may signify the identity function.
 * The notation $$\chi_{A}$$ may signify the characteristic function in convex analysis.

A related concept in statistics is that of a dummy variable. (This must not be confused with the term "dummy variable" as it is usually used in mathematics, where it is another name for a bound variable.)

The term "characteristic function" has an unrelated meaning in probability theory. For this reason, probabilists use the term indicator function for the function defined here almost exclusively, while mathematicians in other fields are more likely to use the term characteristic function to describe the function which indicates membership in a set.

Basic properties
Boolos, Burgess, and Jeffrey (2002) define the characteristic function as follows:
 * "The characteristic function of a k-place relation is the k-argument function that takes the value 1 for a k-tuple if the relation holds of the k-tuple, and the value 0 if it does not; and a relation is effectively decidable if its characteristic function is effectively computable, and is (primitive) recursive if its characteristic function is (primitive) recursive." (italics in original, p.73–74)

The mapping which associates a subset $$A$$ of $$X$$ to its indicator function $$\mathbf{1}_A$$ is injective; its range is the set of functions $$f : X \to \{0,1\}$$.
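For a finite set this correspondence can be checked by brute force. The sketch below assumes a hypothetical three-element set and encodes each function $$f : X \to \{0,1\}$$ as a tuple of 0s and 1s indexed like $$X$$. The helper names `to_indicator` and `to_subset` are introduced only for this illustration.

```python
from itertools import product

X = ["a", "b", "c"]  # hypothetical three-element set


def to_indicator(subset):
    """Encode a subset of X as the tuple (1_A(x) for x in X)."""
    return tuple(int(x in subset) for x in X)


def to_subset(bits):
    """Recover the subset from a 0/1 tuple indexed like X."""
    return frozenset(x for x, b in zip(X, bits) if b == 1)


all_functions = list(product((0, 1), repeat=len(X)))  # every f : X -> {0, 1}
all_subsets = {to_subset(bits) for bits in all_functions}

# Round trip: subset -> indicator -> subset returns the original subset.
round_trip_ok = all(to_subset(to_indicator(s)) == s for s in all_subsets)
print(len(all_functions), len(all_subsets), round_trip_ok)
```

The eight 0/1 tuples yield eight distinct subsets, and each subset is recovered from its indicator, which is what injectivity together with the stated range amounts to in this small case.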

In the following, the dot represents algebraic multiplication, so that $$1 \cdot 1 = 1$$, $$1 \cdot 0 = 0$$, and so on; likewise "+" and "−" represent algebraic addition and subtraction. If $$A$$ and $$B$$ are two subsets of $$X$$, then
 * $$\mathbf{1}_{A\cap B} = \min\{\mathbf{1}_A,\mathbf{1}_B\} = \mathbf{1}_A \cdot\mathbf{1}_B,\,$$
 * $$\mathbf{1}_{A\cup B} = \max\{{\mathbf{1}_A,\mathbf{1}_B}\} = \mathbf{1}_A + \mathbf{1}_B - \mathbf{1}_A \cdot\mathbf{1}_B,$$
 * $$\mathbf{1}_{A\triangle B} = \mathbf{1}_A + \mathbf{1}_B - 2\cdot\mathbf{1}_A \cdot\mathbf{1}_B,$$

and the indicator function of the complement of $$A$$, i.e. of $$A^\complement$$, is:
 * $$\mathbf{1}_{A^\complement} = 1-\mathbf{1}_A. $$

If membership in $$A$$ and $$B$$ is read as a Boolean value, with 1 for true and 0 for false, then the indicator functions take only the values { 0, 1 }, and the above four formulas represent the logical AND, inclusive OR, exclusive OR, and NOT (i.e. logical inverse), respectively.
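The four identities can be verified pointwise on a small example. The following sketch assumes a hypothetical universe of twelve integers and two arbitrary subsets. It is a numerical check on one example, not a proof.

```python
X = set(range(12))                      # hypothetical universe
A = {x for x in X if x % 2 == 0}        # multiples of 2
B = {x for x in X if x % 3 == 0}        # multiples of 3


def ind(S, x):
    """Indicator 1_S(x) as an integer 0 or 1."""
    return int(x in S)


def identities_hold(x):
    a, b = ind(A, x), ind(B, x)
    return (
        ind(A & B, x) == min(a, b) == a * b              # intersection / AND
        and ind(A | B, x) == max(a, b) == a + b - a * b  # union / inclusive OR
        and ind(A ^ B, x) == a + b - 2 * a * b           # symmetric difference / XOR
        and ind(X - A, x) == 1 - a                       # complement / NOT
    )


all_hold = all(identities_hold(x) for x in X)
print(all_hold)
```

Python's set operators `&`, `|`, `^` and `-` compute the intersection, union, symmetric difference and relative complement, so each line compares a set-level operation with its arithmetic counterpart.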

More generally, suppose $$A_1, \ldots, A_n$$ is a collection of subsets of $$X$$, and write $$I = \{1, 2, \ldots, n\}$$ for the index set. For any $$x \in X$$,


 * $$ \prod_{k \in I} ( 1 - \mathbf{1}_{A_k}(x))$$

is clearly a product of $$0$$s and $$1$$s. This product has the value 1 at precisely those $$x \in X$$ which belong to none of the sets $$A_k$$ and is $$0$$ otherwise. That is


 * $$ \prod_{k \in I} ( 1 - \mathbf{1}_{A_k}) = \mathbf{1}_{X - \bigcup_{k} A_k} = 1 - \mathbf{1}_{\bigcup_{k} A_k}.$$

Expanding the product on the left hand side,


 * $$ \mathbf{1}_{\bigcup_{k} A_k}= 1 - \sum_{F \subseteq \{1, 2, \ldots, n\}} (-1)^{|F|} \mathbf{1}_{\bigcap_F A_k} = \sum_{\emptyset \neq F \subseteq \{1, 2, \ldots, n\}} (-1)^{|F|+1} \mathbf{1}_{\bigcap_F A_k} $$

where $$|F|$$ is the cardinality of $$F$$. This is one form of the principle of inclusion-exclusion.
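The inclusion-exclusion identity can be checked numerically on a small example. The sketch below assumes a hypothetical universe $$\{1, \ldots, 30\}$$ and three sets of multiples; it evaluates the signed sum over non-empty index sets $$F$$ and compares it with the indicator of the union at every point.

```python
from itertools import combinations

X = set(range(1, 31))  # hypothetical universe {1, ..., 30}
sets = [
    {x for x in X if x % 2 == 0},
    {x for x in X if x % 3 == 0},
    {x for x in X if x % 5 == 0},
]


def ind(S, x):
    """Indicator 1_S(x) as an integer 0 or 1."""
    return int(x in S)


def union_indicator_by_inclusion_exclusion(x):
    """Sum over non-empty F of (-1)^(|F|+1) times the indicator of the intersection over F."""
    total = 0
    for size in range(1, len(sets) + 1):
        for F in combinations(sets, size):
            intersection = set.intersection(*F)
            total += (-1) ** (size + 1) * ind(intersection, x)
    return total


union = set().union(*sets)
pointwise_ok = all(
    union_indicator_by_inclusion_exclusion(x) == ind(union, x) for x in X
)

# Summing the identity over X gives the counting form of inclusion-exclusion.
union_size = sum(union_indicator_by_inclusion_exclusion(x) for x in X)
print(pointwise_ok, union_size)
```

Summing the pointwise identity over all of $$X$$ turns each indicator into a cardinality, which gives 15 + 10 + 6 − 5 − 3 − 2 + 1 = 22 elements in the union for this example.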

As suggested by the previous example, the indicator function is a useful notational device in combinatorics. The notation is used in other places as well, for instance in probability theory: if $$X$$ is a probability space with probability measure $$\mathbb{P}$$ and $$A$$ is a measurable set, then $$\mathbf{1}_A$$ becomes a random variable whose expected value is equal to the probability of $$A:$$


 * $$\operatorname{E}(\mathbf{1}_A)= \int_{X} \mathbf{1}_A(x)\,d\mathbb{P} = \int_{A} d\mathbb{P} = \mathbb{P}(A).$$
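On a finite probability space the integral reduces to a weighted sum, which makes the identity easy to check. The sketch below assumes a hypothetical fair six-sided die and uses exact fractions to avoid rounding.

```python
from fractions import Fraction

# Hypothetical finite probability space: one roll of a fair six-sided die.
X = range(1, 7)
prob = {x: Fraction(1, 6) for x in X}

A = {x for x in X if x >= 5}  # the event "roll is 5 or 6"


def ind(S, x):
    """Indicator 1_S(x) as an integer 0 or 1."""
    return int(x in S)


# On a finite space the integral of 1_A against the measure is a weighted sum.
expectation = sum(ind(A, x) * prob[x] for x in X)
probability = sum(prob[x] for x in A)
print(expectation, probability)
```

Both quantities equal 1/3 here: the terms with $$x \notin A$$ contribute nothing to the expectation, so only the probabilities of the outcomes in $$A$$ remain.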

This identity is used in a simple proof of Markov's inequality.
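
The way the identity enters that proof can be sketched as follows, assuming a non-negative random variable $$Y$$ on $$X$$ and a constant $$a > 0$$ (both symbols are introduced here for illustration). Since $$Y \ge 0$$, checking the event $$\{Y \ge a\}$$ and its complement separately gives the pointwise bound

 * $$a \cdot \mathbf{1}_{\{Y \ge a\}} \le Y.$$

Taking expected values of both sides, which preserves the inequality, and applying the identity above to the set $$\{Y \ge a\}$$ yields

 * $$a \, \mathbb{P}(Y \ge a) = \operatorname{E}\left(a \cdot \mathbf{1}_{\{Y \ge a\}}\right) \le \operatorname{E}(Y),$$

which is Markov's inequality after dividing by $$a$$.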

In many cases, such as order theory, the inverse of the indicator function may be defined. This is commonly called the generalized Möbius function, as a generalization of the inverse of the indicator function in elementary number theory, the Möbius function. (See paragraph below about the use of the inverse in classical recursion theory.)

Characteristic function in recursion theory, Gödel's and Kleene's representing function
Kurt Gödel described the representing function in his 1934 paper "On Undecidable Propositions of Formal Mathematical Systems". (The paper appears on pp. 41–74 in Martin Davis (ed.), The Undecidable):
 * "There shall correspond to each class or relation R a representing function φ(x1, . . ., xn) = 0 if R(x1, . . ., xn) and φ(x1, . . ., xn)=1 if ~R(x1, . . ., xn)." (p. 42; the "~" indicates logical inversion i.e. "NOT")

Stephen Kleene (1952, p. 227) offers the same definition in the context of the primitive recursive functions: the representing function φ of a predicate P takes the value 0 if the predicate is true and 1 if the predicate is false.

For example, because the product of representing functions φ1*φ2* . . . *φn equals 0 whenever any one of the functions equals 0, the product plays the role of logical OR: IF φ1=0 OR φ2=0 OR . . . OR φn=0 THEN their product is 0. What appears to the modern reader as the representing function's logical inversion, i.e. the representing function is 0 when the relation R is "true" or "satisfied", plays a useful role in Kleene's definition of the logical functions OR, AND, and IMPLY (p. 228), the bounded (p. 228) and unbounded (p. 279ff) mu operators (Kleene (1952)), and the CASE function (p. 229).
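The inverted convention can be illustrated with a short Python sketch. The helper `representing` and the two example predicates are hypothetical, chosen here only to show how a product behaves as logical OR when 0 stands for "true"; this is not a rendering of Kleene's formal definitions.

```python
def representing(predicate):
    """Representing function in the Gödel/Kleene convention: 0 if true, 1 if false."""
    def phi(*args):
        return 0 if predicate(*args) else 1
    return phi


# Hypothetical predicates on the natural numbers.
phi_even = representing(lambda n: n % 2 == 0)
phi_small = representing(lambda n: n < 3)


def phi_or(n):
    # The product is 0 exactly when at least one factor is 0, i.e. when
    # at least one predicate holds: logical OR under this convention.
    return phi_even(n) * phi_small(n)


or_values = [phi_or(n) for n in range(6)]
print(or_values)
```

For n = 0, ..., 5 the product is 1 only at n = 3 and n = 5, the two arguments that are neither even nor less than 3.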

Characteristic function in fuzzy set theory
In classical mathematics, characteristic functions of sets only take the values 1 (members) or 0 (non-members). In fuzzy set theory, characteristic functions are generalized to take values in the real unit interval [0, 1], or more generally, in some algebra or structure (usually required to be at least a poset or lattice). Such generalized characteristic functions are usually called membership functions, and the corresponding "sets" are called fuzzy sets. Fuzzy sets model the gradual change in the membership degree seen in many real-world predicates like "tall", "warm", etc.
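
The contrast between a classical characteristic function and a membership function can be sketched in Python. The linear ramp and the thresholds of 160, 175 and 190 cm below are arbitrary illustrative assumptions, not values taken from any standard.

```python
def tall_membership(height_cm: float) -> float:
    """Degree of membership in the fuzzy set "tall", as a value in [0, 1].

    The thresholds 160 and 190 are arbitrary illustrative choices.
    """
    low, high = 160.0, 190.0
    if height_cm <= low:
        return 0.0
    if height_cm >= high:
        return 1.0
    return (height_cm - low) / (high - low)  # linear ramp between the thresholds


def crisp_tall(height_cm: float) -> int:
    """Classical indicator of the set of heights >= 175, for comparison."""
    return int(height_cm >= 175.0)


degrees = [tall_membership(h) for h in (150, 160, 175, 190, 200)]
print(degrees)
```

The classical indicator jumps from 0 to 1 at a single cut-off, whereas the membership function rises gradually between the two thresholds.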