Variable

In computer science and mathematics, a variable (sometimes called an object or identifier in computer science) is a symbolic representation used to denote a quantity or expression. In mathematics, a variable often represents an "unknown" quantity that has the potential to change; in computer science, it represents a place where a quantity can be stored. Variables are often contrasted with constants, which are known and unchanging.

The term has a similar meaning in the physical sciences and engineering: a variable is a quantity whose value may vary over the course of an experiment (including simulations), across samples, or during the operation of a system. Variables are generally distinct from parameters, although what is a variable in one context may be a parameter in another. For more on this distinction, see the article on "parameter".

In applied statistics, a variable is a measurable factor, characteristic, or attribute of an individual or a system; in other words, something that might be expected to vary over time or between individuals. Random variables are an idealization of this in mathematical statistics, where they are defined as measurable functions from a probability space to a measurable space.

History
$$\mathit{x}$$ commonly represents an unknown variable. Although any letter can be used, $$\mathit{x}$$ is the most common choice. This usage can be traced back to the Arabic word šay' (شيء, “thing”), which in translated algebra texts was taken into Old Spanish with the pronunciation “šei”, written xei, and soon habitually abbreviated to $$\mathit{x}$$. (The Spanish pronunciation of “x” has changed since.) Some sources, however, say that this $$\mathit{x}$$ is an abbreviation of Latin causa, which was a translation of Arabic شيء. This started the habit of using letters to represent quantities in algebra. In mathematics, an “italicized x” ($$x\!$$) is often used to avoid potential confusion with the multiplication symbol. By extension beyond mathematics, “X” has come to represent a generic placeholder variable whose value is unknown or secret, as in project X or mister X.

General overview
Variables are used in open sentences. For instance, in the formula x + 1 = 5, x is a variable that represents an "unknown" number. Variables are often represented by letters of the Roman alphabet, but are also represented by letters of other alphabets, such as the Greek alphabet, as well as various other symbols. In this sense, variables are used as a "fill-in-the-blank" within many fields (mathematics, linguistics, etc.).

Variable Naming Conventions
The names of variables used within a discipline often follow some naming convention.

In mathematics, very common letters for variables are "x", "y", "n", "a" and "b". "x" and "y" are often used because they correspond to the two axes on a graph, while "a" and "b" are used as the coefficients of x and y in the general form of a linear equation. "n" is often used in statistical analysis, e.g. "n" being the number of subjects in a study.

In mathematics
Variables are useful in mathematics because they allow instructions to be specified in a general way. If one were forced to use actual values, the instructions would apply only to a narrower set of situations. For example:
 * Specify a mathematical definition for finding twice ANY number: double(x) = x + x.
 * Now, to find the double of a number, all we need to do is replace x with that number:
 * double(1) = 1 + 1 = 2
 * double(3) = 3 + 3 = 6
 * double(55) = 55 + 55 = 110
 * etc.

In the above example, the variable x is a "placeholder" for any number. One important thing we are assuming is that the value of each occurrence of x is the same; that is, x does not get a new value between the first x and the second x.
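The definition above can be written as a small Python function. This is only an illustrative sketch: the function name double follows the text, and the list name results is an assumption made for the example.

```python
def double(x):
    """Return twice x; both occurrences of x denote the same value."""
    return x + x

# Replace x with concrete numbers, as in the list above.
results = [double(1), double(3), double(55)]
print(results)  # [2, 6, 110]
```

Within a single call, x keeps one value, which matches the assumption stated above.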

(Note that in computer programming languages without referential transparency, changes such as this can occur. Variables in computer programming are also useful for this reason. The term "variable", as used by programmers, is different from the meaning of "variable" as used by mathematicians.)

In applied statistics
In statistics, variables refer to measurable attributes, as these typically vary over time or between individuals. Variables can be discrete (taking values from a finite or countable set), continuous (having a continuous distribution function), or neither. This classification is related to the level of measurement. Temperature is a continuous variable, while the number of legs of an animal is a discrete variable. This concept of a variable is widely used in the natural, medical and social sciences.

In causal models, a distinction is made between "independent variables" and "dependent variables", the latter being expected to vary in value in response to changes in the former. In other words, an independent variable is presumed to potentially affect a dependent one. In experiments, independent variables include factors that can be altered or chosen by the researcher independent of other factors.

For example, in an experiment to test whether or not the boiling point of water changes with altitude, the altitude is under direct control and is the independent variable, and the boiling point is presumed to depend upon it and is therefore the dependent variable. The collection of results from an experiment, or information to be used to draw conclusions, is known as data. It is often important to consider which variables to allow for, or to directly control or eliminate, in the design of experiments.

There are also quasi-independent variables, which are variables that a researcher uses as a grouping mechanism without manipulating them. An example of this would be separating people into groups by their gender. Gender cannot be manipulated, but it is used as a way to group. Another example would be separating people by the amount of coffee they drank before beginning an experiment. The researcher cannot change the past, but can use it to differentiate the groups.

While independent variables can refer to quantities and qualities that are under experimental control, they can also include extraneous factors that influence results in a confusing or undesired manner.

In general, if strongly confounding variables exist that can substantially affect the result, then this makes it more difficult to interpret the results. For example, a study into the incidence of cancer with age will also have to take into account variables such as income (poorer people may have less healthy lives), location (some cancers vary depending on diet and sunlight), stress and lifestyle issues (cancer may be related to these more than age), and so on. Failure to at least consider these factors can lead to grossly inaccurate deductions. For this reason, controlling unwanted variables is important in research.

In computer programming
Variables in computer programming are very different from variables in mathematics, and the apparent similarity is a source of much confusion. Variables in most of mathematics (those that are extensional and referentially transparent) are time-independent unknowns, while in programming a variable can be associated with different values at different times (as programming variables are intensional).

In computer programming a variable is a special value (also often called a reference) that has the property of being able to be associated with another value (or not). What is variable across time is the association. Obtaining the value associated with a variable is often called dereferencing, and creating or changing the association is called assignment.
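A minimal Python sketch of assignment and of an association that changes over time. The variable names count, first and second are hypothetical and chosen only for illustration.

```python
count = 1          # assignment: associate the name `count` with the value 1
first = count      # reading `count` yields its current value

count = count + 1  # re-assignment: the same name is now associated with 2
second = count     # reading `count` again yields a different value

print(first, second)  # 1 2
```

The name count stays the same throughout; only its association with a value changes.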

Variables are usually named by an identifier, but they can be anonymous, and variables can be associated with other variables.

In the computing context, variable identifiers often consist of alphanumeric strings. These identifiers are then used to refer to values in computer memory. This convention of matching identifiers to values is but one of several alternative programmatic conventions for accessing values in computer memory (see also: reflection (computer science)).

Variable naming conventions
In some programming languages, specific characters (known as sigils) are prefixed or appended to variable identifiers to indicate the variable's type. For example:
 * in BASIC, the suffix $ on a variable name indicates that its value is a string;
 * in Perl, the sigils $, @, %, and & indicate scalar, array, hash, and subroutine variables, respectively;
 * in spreadsheets, variables can refer to cells (e.g. $A$2), named ranges, or values in associated source code or functions.

Variables in source code
In computer source code, a variable name is one way to bind a variable to a memory location; the corresponding value is stored as a data object in that location so that the object can be accessed and manipulated later via the variable's name.

Variables in spreadsheets
In a spreadsheet, a cell may contain a formula with references to other cells. Such a cell reference is a kind of variable; its value is the value of the referenced cell (see also: reference (computer science)).

Scope and extent
The scope of a variable describes where in a program's text the variable may be used, while the extent (or lifetime) describes when in a program's execution a variable has a value. The scope of a variable is actually a property of the name of the variable, and the extent is a property of the variable itself.

A variable name's scope affects its extent.

Scope is a lexical aspect of a variable. Most languages define a specific scope for each variable (as well as any other named entity), which may differ within a given program. The scope of a variable is the portion of the program code for which the variable's name has meaning and for which the variable is said to be "visible". Entrance into that scope typically begins a variable's lifetime and exit from that scope typically ends its lifetime. For instance, a variable with "lexical scope" is meaningful only within a certain block of statements or subroutine. A "global variable", or one with indefinite scope, may be referred to anywhere in the program. It is erroneous to refer to a variable where it is out of scope. Lexical analysis of a program can determine whether variables are used out of scope. In compiled languages, such analysis can be performed statically at compile time.
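The difference between a global name and a locally scoped name can be sketched in Python. All names here (total, add_to_total, bonus) are assumptions made for the example. Note that Python reports an out-of-scope reference at run time, as a NameError, rather than at compile time.

```python
total = 10  # global variable: visible throughout this module

def add_to_total(amount):
    bonus = 5  # local variable: its name has meaning only inside this function
    return total + amount + bonus

result = add_to_total(1)

# Outside the function, the name `bonus` is out of scope.
try:
    bonus
    bonus_visible = True
except NameError:
    bonus_visible = False

print(result, bonus_visible)  # 16 False
```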

Extent, on the other hand, is a runtime (dynamic) aspect of a variable. Each binding of a variable to a value can have its own extent at runtime. The extent of the binding is the portion of the program's execution time during which the variable continues to refer to the same value or memory location. A running program may enter and leave a given extent many times, as in the case of a closure.

In portions of code, a variable in scope may never have been given a value, or its value may have been destroyed. Such variables are described as "out of extent" or "unbound". In many languages, it is an error to try to use the value of a variable when it is out of extent. In other languages, doing so may yield unpredictable results. Such a variable may, however, be assigned a new value, which gives it a new extent. By contrast, it is permissible for a variable binding to extend beyond its scope, as occurs in Lisp closures and C static variables. When execution passes back into the variable's scope, the variable may once again be used.
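A binding that outlives the scope of its name can be sketched with a Python closure. The text mentions Lisp closures and C static variables; Python is used here only for illustration, and the names make_counter, increment and count are hypothetical.

```python
def make_counter():
    count = 0  # local to make_counter; the name is out of scope elsewhere

    def increment():
        nonlocal count  # refer to the binding created by the enclosing call
        count += 1
        return count

    return increment

counter = make_counter()   # make_counter has returned, yet its `count` lives on
values = [counter(), counter(), counter()]

other = make_counter()     # a second call creates a separate binding
print(values, other())     # [1, 2, 3] 1
```

Each call to make_counter creates a new binding of count with its own extent, which is why the second counter starts again from 1.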

For space efficiency, the memory space needed for a variable may be allocated only when the variable is first used and freed when it is no longer needed. A variable is needed only when it is in scope, but beginning each variable's lifetime when it enters scope may give space to unused variables. To avoid wasting such space, compilers often warn programmers if a variable is declared but not used.

It is considered good programming practice to make the scope of variables as narrow as feasible so that different parts of a program do not accidentally interact with each other by modifying each other's variables. Doing so also prevents action at a distance. Common techniques for doing so are to have different sections of a program use different namespaces, or to make individual variables "private" through either dynamic variable scoping or lexical variable scoping.

Many programming languages employ a reserved value (often named null or nil) to indicate an invalid or uninitialized variable.

Typed and untyped variables
In statically-typed languages such as Java or ML, a variable also has a type, meaning that only values of a given class (or set of classes) can be stored in it. A variable of a primitive type holds a value of that exact primitive type. A variable of a class type can hold a null reference or a reference to an object whose type is that class type or any subclass of that class type. A variable of an interface type can hold a null reference or a reference to an instance of any class that implements the interface. A variable of an array type can hold a null reference or a reference to an array.

In dynamically typed languages such as Python, it is values, not variables, which carry type. In Common Lisp, both situations exist simultaneously: a variable is given a type (if undeclared, it is assumed to be T, the universal supertype) which exists at compile time. Values also have types, which can be checked and queried at runtime. See type system.
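That values rather than variables carry the type in Python can be seen in a short sketch. The variable names are hypothetical.

```python
item = 42
type_before = type(item).__name__   # the type belongs to the value 42

item = "forty-two"                  # the same variable now refers to a string
type_after = type(item).__name__

print(type_before, type_after)  # int str
```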

Typing of variables also allows polymorphisms to be resolved at compile time. However, this is different from the polymorphism used in object-oriented function calls (referred to as virtual functions in C++) which resolves the call based on the value type as opposed to the supertypes the variable is allowed to have.

Variables often store simple data like integers and literal strings, but some programming languages allow a variable to store values of other datatypes as well. Such languages may also enable functions to be parametric polymorphic. These functions operate like variables to represent data of multiple types. For example, a function may determine the length of a list. Such a function may be parametric polymorphic by including a type variable in its type signature, since the number of elements in the list is independent of the elements' types.
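A list-length function with a type variable in its signature might look like the following Python sketch. The names length and T are assumptions made for the example, and the type hints are checked only by external static type checkers, not by Python itself at run time.

```python
from typing import List, TypeVar

T = TypeVar("T")  # type variable standing for any element type

def length(items: List[T]) -> int:
    """Count the elements without inspecting their type."""
    count = 0
    for _ in items:
        count += 1
    return count

print(length([1, 2, 3]), length(["a", "b"]))  # 3 2
```

The body never looks at an element, which is why the same function serves lists of any element type.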

Parameters
The formal parameters of functions are also referred to as variables. A variable is a parameter if it is given a value when the function is called; the value supplied in the call, such as the integer 5, is the argument that gives the parameter its value. In most languages, function parameters have local scope: a parameter can be referred to only within its own function (though of course other functions can also have variables with the same name).
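The relationship between a parameter and an argument can be sketched in Python. The function names add_two and negate and the parameter name x are hypothetical choices for illustration; only the argument 5 comes from the text.

```python
def add_two(x):
    # `x` is the parameter: it receives a value each time add_two is called.
    return x + 2

result = add_two(5)  # 5 is the argument that gives x its value

def negate(x):
    # A different function may have its own, unrelated variable named x.
    return -x

print(result, negate(3))  # 7 -3
```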

Memory allocation
The specifics of variable allocation and the representation of their values vary widely, both among programming languages and among implementations of a given language. Many language implementations allocate space for local variables, whose extent lasts for a single function call, on the call stack; this memory is automatically reclaimed when the function returns. More generally, in name binding, the name of a variable is bound to the address of some particular block (contiguous sequence) of bytes in memory, and operations on the variable manipulate that block. Referencing is more common for variables whose values have large or unknown sizes when the code is compiled. Such variables reference the location of the value instead of storing the value itself, which is allocated from a pool of memory called the heap.

Bound variables have values. A value, however, is an abstraction, an idea; in implementation, a value is represented by some data object, which is stored somewhere in computer memory. The program, or the runtime environment, must set aside memory for each data object and, since memory is finite, ensure that this memory is yielded for reuse when the object is no longer needed to represent some variable's value.

Objects allocated from the heap must be reclaimed specially when the objects are no longer needed. In a garbage-collected language (such as C#, Java, and Lisp), the runtime environment automatically reclaims objects when extant variables can no longer refer to them. In non-garbage-collected languages, such as C, the program (and thus the programmer) must explicitly allocate memory, and then later free it, to reclaim its memory. Failure to do so leads to memory leaks, in which the heap is depleted as the program runs, risking eventual failure from exhausting available memory.
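Automatic reclamation can be observed in Python with a weak reference, which watches an object without keeping it alive. This is a sketch that assumes the CPython implementation, where an object is typically freed as soon as no variable refers to it; other implementations may reclaim it later. The class name Record and the variable names are hypothetical.

```python
import gc
import weakref

class Record:
    """Placeholder object type; weak references need a class instance."""

record = Record()
probe = weakref.ref(record)   # observes the object without keeping it alive
alias = record                # a second variable referring to the same object

del record
alive_with_alias = probe() is not None   # still reachable through `alias`

del alias
gc.collect()                  # ask the collector to run
reclaimed = probe() is None   # no variable can reach the object any more

print(alive_with_alias, reclaimed)
```

The object survives the first del because alias still refers to it; it becomes eligible for reclamation only when the last reference is gone.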

When a variable refers to a data structure created dynamically, some of its components may be only indirectly accessed through the variable. In such circumstances, garbage collectors (or analogous program features in languages that lack garbage collectors) must deal with a case where only a portion of the memory reachable from the variable needs to be reclaimed.

Constants
A constant is a datum whose value cannot be changed once it is initially bound to a value. In other words, constants cannot be assigned to. In purely functional programming, all data are constant, because there is no assignment.

Although a constant value is specified only once, the constant can be referenced multiple times in a program. Using a constant instead of specifying a value multiple times in the program can not only simplify code maintenance, but can also supply a meaningful name for the value and consolidate such constant assignments in a standard code location (for example, at the beginning).

Programming languages provide two kinds of constant variables:
 * Static constant or manifest constant: Languages such as Visual Basic allow assigning a fixed value to a static constant, which is known at compile time. Such a constant has the same value each time its program runs. Changing the value is accomplished by changing (and possibly recompiling) the code.
 * Dynamic constant: Languages such as C++ and Java allow initializing a dynamic constant with a value that is computed at runtime. Thus, unlike static constants, the values of dynamic constants cannot be determined at compile time.

For variables which are references, do not confuse constant references with immutable objects. For example, when a non-constant reference references an immutable object, that reference can be changed so that it references a different object, but the object it originally pointed to cannot be changed (i.e. other references that reference it still see the same information).
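A non-constant reference to an immutable object can be sketched in Python, where strings are immutable. The variable names are hypothetical.

```python
greeting = "hello"         # str objects are immutable in Python
other = greeting           # a second reference to the same object

greeting = greeting + "!"  # rebinds `greeting` to a new object

print(greeting, other)     # hello! hello
```

The reference greeting changed, but the original object did not, so other still sees the same information.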

Conversely, a constant reference may reference a mutable object. In this case, the reference will always reference the same object (the reference cannot be changed); however, the object that the reference references can still be changed (and other references that also reference that object will see the change). For example, appending text to a mutable string object held through a constant reference changes the object's contents while the reference stays fixed, and printing the object afterwards produces output such as:

InitialValueOfDynamicConstant_AppendedText
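The output above can be reproduced with a Python sketch, with an important caveat: Python has no enforced constant references, and typing.Final is only a hint for static type checkers. The names DYNAMIC_CONSTANT, alias and output are assumptions chosen to match the output text.

```python
from typing import Final, List

# Final tells static type checkers that this name must not be re-assigned.
DYNAMIC_CONSTANT: Final[List[str]] = ["InitialValueOfDynamicConstant"]
alias = DYNAMIC_CONSTANT                  # another reference to the same list

DYNAMIC_CONSTANT.append("_AppendedText")  # mutates the object, not the name

output = "".join(alias)                   # the change is visible through `alias`
print(output)  # InitialValueOfDynamicConstant_AppendedText
```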

In languages where a variable can be an object (e.g. C++), such a variable being constant is equivalent to the immutability of that object.

Variable interpolation
Variable interpolation (also variable substitution, variable expansion) is the process of evaluating an expression or string literal containing one or more variables, yielding a result in which the variables are replaced with their corresponding values in memory. It is a specialized instance of concatenation.

Languages that support variable interpolation include Perl, PHP, Ruby, and most Unix shells. In these languages, variable interpolation occurs only when the string literal is double-quoted, not when it is single-quoted. Variables are recognized because they start with a sigil (typically "$") in these languages. Ruby uses the "#{...}" form for interpolation, and allows any expression to be interpolated, not just variables.

For example, Perl code that interpolates a name variable and a greeting variable into a double-quoted string can produce the output: Nancy said Hello World to the crowd of people.
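A comparable effect can be sketched in Python using a formatted string literal (f-string). This is not the Perl example referred to above. The variable names name, greeting and sentence are assumptions, and Python marks interpolated expressions with braces rather than a sigil.

```python
name = "Nancy"
greeting = "Hello World"

# An f-string replaces each {expression} with its value.
sentence = f"{name} said {greeting} to the crowd of people."
print(sentence)  # Nancy said Hello World to the crowd of people.
```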