Randomized algorithm

A randomized algorithm or probabilistic algorithm is an algorithm which employs a degree of randomness as part of its logic. In common practice, this means that the machine implementing the algorithm has access to a pseudorandom number generator. The algorithm typically uses the random bits as an auxiliary input to guide its behavior, in the hope of achieving good performance in the "average case". Formally, the algorithm's performance will be a random variable determined by the random bits, with (hopefully) good expected value; this expected value is called the expected running time. The "worst case" is typically so unlikely to occur that it can be ignored.

Motivation
As a motivating example, consider the problem of finding an 'a' in an array of n elements, given that half are 'a's and the other half are 'b's. The obvious approach is to look at each element of the array in turn, but this could take a long time (about n/2 lookups) if the array were ordered with all the 'b's first, followed by the 'a's. There is a similar drawback to checking in the reverse order, or to checking every second element. In fact, with any strategy in which the order of element checks is fixed, i.e., a deterministic algorithm, we cannot guarantee that the algorithm will complete quickly on all possible inputs. On the other hand, if we check array elements at random, then we quickly find an 'a' with high probability, whatever the input.
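
To make this concrete, here is a minimal Python sketch of the random-probing strategy, assuming an array laid out as described (the function name find_a is illustrative):

    import random

    def find_a(arr):
        # Probe random positions until an 'a' is found. With half the
        # entries equal to 'a', each probe succeeds with probability 1/2,
        # so the expected number of probes is 2, regardless of how the
        # input is ordered.
        while True:
            i = random.randrange(len(arr))
            if arr[i] == 'a':
                return i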

Randomized algorithms are particularly useful when faced with a malicious "adversary" or attacker who deliberately tries to feed a bad input to the algorithm (see competitive analysis (online algorithm)). It is for this reason that randomness is ubiquitous in cryptography. In cryptographic applications, pseudorandom numbers cannot be used, since the adversary can predict them, making the algorithm effectively deterministic. Therefore, either a source of truly random numbers or a cryptographically secure pseudorandom number generator is required. Another area in which randomness is inherent is quantum computing.

In the example above, the randomized algorithm always outputs the correct answer; there is merely a small probability that it takes a long time to execute. Sometimes we instead want the algorithm to always complete quickly, but allow a small probability of error. The former type are called Las Vegas algorithms, and the latter Monte Carlo algorithms (related to the Monte Carlo method for simulation). Observe that any Las Vegas algorithm can be converted into a Monte Carlo algorithm by having it output an arbitrary, possibly incorrect answer if it fails to complete within a specified time.
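
Reusing the find_a sketch above, this conversion can be illustrated by imposing a probe budget; the budget value is an arbitrary illustrative choice:

    import random

    def find_a_monte_carlo(arr, budget=50):
        # Monte Carlo version: always fast, but may give up. On the
        # inputs described above, it fails to find an 'a' with
        # probability at most (1/2)**budget, in which case it returns
        # None, a possibly incorrect answer.
        for _ in range(budget):
            i = random.randrange(len(arr))
            if arr[i] == 'a':
                return i
        return None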

Computational complexity
Computational complexity theory models randomized algorithms as probabilistic Turing machines. Both Las Vegas and Monte Carlo algorithms are considered, and several complexity classes are studied. The most basic randomized complexity class is RP, which is the class of decision problems for which there is an efficient (polynomial-time) randomized algorithm (or probabilistic Turing machine) that recognizes NO-instances with absolute certainty and recognizes YES-instances with probability at least 1/2. The complement class is Co-RP, and the class of problems for which both YES and NO answers are allowed to be probabilistic is BPP. Problem classes having (possibly nonterminating) algorithms with polynomial average-case running time whose output is always correct are said to be in ZPP. For problems that (are believed to) lie outside these classes, such as NP-hard problems, even randomized algorithms do not suffice, and it becomes necessary to resort to approximation algorithms.
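
The success probability of 1/2 in the definition of RP is not essential, because independent repetitions drive the error down exponentially. A minimal sketch, where rp_algorithm stands for any hypothetical one-sided-error decision procedure:

    def amplified(rp_algorithm, x, k=50):
        # rp_algorithm(x) never accepts a NO-instance and accepts a
        # YES-instance with probability >= 1/2. Running it k times
        # accepts a YES-instance with probability >= 1 - (1/2)**k,
        # while NO-instances are still never accepted.
        return any(rp_algorithm(x) for _ in range(k))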

Historically, the first randomized algorithm was a method developed by Michael O. Rabin for the closest pair problem in computational geometry. The study of randomized algorithms was spurred by the 1977 discovery of a randomized primality test (i.e., an algorithm for determining the primality of a number) by Robert M. Solovay and Volker Strassen. Soon afterwards Michael O. Rabin demonstrated that Miller's 1976 primality test could be turned into a randomized algorithm. At that time, no practical deterministic algorithm for primality was known.

The Miller-Rabin primality test relies on a binary relation between two positive integers k and n that can be expressed by saying that k "is a witness to the compositeness of" n. It can be shown that
 * if there is a witness to the compositeness of n, then n is composite (i.e., n is not prime),
 * if n is composite then at least three-fourths of the natural numbers less than n are witnesses to its compositeness, and
 * there is a fast algorithm that, given k and n, ascertains whether k is a witness to the compositeness of n.
Observe that this implies that the primality problem is in Co-RP.
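
A compact Python sketch of the witness check, assuming the standard Miller-Rabin witness relation (Python's built-in pow performs modular exponentiation):

    def is_witness(k, n):
        # Decide whether k is a witness to the compositeness of the
        # odd number n > 2. Write n - 1 = 2^s * d with d odd.
        d, s = n - 1, 0
        while d % 2 == 0:
            d //= 2
            s += 1
        x = pow(k, d, n)
        if x == 1 or x == n - 1:
            return False          # k gives no evidence of compositeness
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                return False
        return True               # k proves that n is composite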

If one randomly chooses 100 numbers less than a composite number n, then the probability of failing to find such a "witness" is (1/4)^100, so for most practical purposes this is a good primality test. If n is big, there may be no other test that is practical. The probability of error can be reduced to an arbitrary degree by performing enough independent tests.
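
The repeated test can be sketched as follows, building on the is_witness function above; the trial count of 100 matches the discussion:

    import random

    def probably_prime(n, trials=100):
        # Monte Carlo primality test: never wrong on primes; wrong on
        # a composite n with probability at most (1/4)**trials.
        if n < 4:
            return n in (2, 3)
        if n % 2 == 0:
            return False
        for _ in range(trials):
            k = random.randrange(2, n - 1)
            if is_witness(k, n):
                return False      # definitely composite
        return True               # probably prime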

Therefore, in practice, there is no penalty associated with accepting a small probability of error, since with a little care the probability of error can be made astronomically small. Indeed, even though a deterministic polynomial-time primality test has since been found, it has not replaced the older probabilistic tests in cryptographic software nor is it expected to do so for the foreseeable future.

If, using a randomized method, the probability of error is 2^−1000, the philosophical question arises: is this a mathematical proof? After all, the probability of the algorithm's error is distinctly smaller than the probability of an error in the computer hardware executing it, or of the reader making an error in reading a proof. What does it mean in real terms to consider this small a probability?

Applications
Quicksort is a familiar, commonly used algorithm in which randomness can be useful. The deterministic version of this algorithm requires O(n^2) time to sort n numbers for some degenerate inputs, such as an array whose elements are already in sorted order. However, if the algorithm selects pivot elements uniformly at random, it has a very high probability of finishing in O(n log n) time.
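
A minimal sketch of the randomized variant (this list-building version favors clarity over the usual in-place partitioning):

    import random

    def quicksort(a):
        # Expected O(n log n) comparisons on every input, because the
        # pivot is chosen uniformly at random rather than at a fixed
        # position.
        if len(a) <= 1:
            return a
        pivot = random.choice(a)
        less = [x for x in a if x < pivot]
        equal = [x for x in a if x == pivot]
        greater = [x for x in a if x > pivot]
        return quicksort(less) + equal + quicksort(greater)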

Randomized algorithms can also be used to solve graph theory problems, as demonstrated by this randomized minimum cut algorithm:

find_min_cut(undirected graph G) {
    while there are more than 2 nodes in G do {
        pick an edge (u,v) at random in G
        contract the edge, while preserving multi-edges
        remove all loops
    }
    output the remaining edges
}

Here, contracting an edge (u,v) means adding a new vertex w, replacing any edge (u,x) or (v,x) with (w,x), and then deleting u and v from G.

Let n = |V[G]|. It can be shown that this algorithm outputs a minimum cut with probability at least n^−2; thus, by running it n^2 log(n) times and taking the smallest cut output, we find a minimum cut with high probability.
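
The contraction algorithm with repetition can be sketched in Python as follows; the edge-list input format and the union-find bookkeeping are implementation choices, not part of the algorithm's statement:

    import math
    import random

    def karger_min_cut(edges):
        # edges: list of (u, v) pairs of a connected undirected multigraph.
        nodes = {u for e in edges for u in e}
        n = len(nodes)
        trials = int(n * n * math.log(n)) + 1   # n^2 log(n) repetitions
        best = None
        for _ in range(trials):
            # Each node starts as its own "super-node"; contracting an
            # edge merges the super-nodes of its endpoints.
            parent = {v: v for v in nodes}
            def find(v):
                while parent[v] != v:
                    parent[v] = parent[parent[v]]
                    v = parent[v]
                return v
            remaining = n
            while remaining > 2:
                u, v = random.choice(edges)
                ru, rv = find(u), find(v)
                if ru != rv:          # ignore edges that became loops
                    parent[ru] = rv   # contract (u, v)
                    remaining -= 1
            # The cut consists of the edges still crossing the two
            # remaining super-nodes.
            cut = [e for e in edges if find(e[0]) != find(e[1])]
            if best is None or len(cut) < len(best):
                best = cut
        return best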

Derandomization
It is usually the case that randomized algorithms either are more elegant or require fewer resources than the corresponding deterministic algorithms. Removing randomization from randomized algorithms to build equivalently powerful deterministic algorithms, known as derandomization, is an active field of research in computer science. Derandomization is central to resolving the relationships between the complexity classes describing randomized computation, such as whether BPP equals P.