Chinese restaurant process

In probability theory, the Chinese restaurant process is a discrete-time stochastic process, whose value at any positive-integer time n is a partition Bn of the set {1, 2, 3, ..., n} whose probability distribution is determined as follows. At time n + 1 the element n + 1 is either: 1) added to one of the blocks of the partition Bn, where each block is chosen with probability nk/(n + 1) where nk is the size of the block, or 2) added to the partition Bn as a new singleton block, with probability 1/(n + 1). The random partition so generated is exchangeable in the sense that relabeling {1, ..., n} does not change the distribution of the partition, and it is consistent in the sense that the law of the partition of n &minus; 1 obtained by removing the element n from the random partition at time n is the same as the law of the random partition at time n &minus; 1.

One fancifully imagines a Chinese restaurant with an infinite number of circular tables, each with infinite capacity. At time n + 1, a new customer chooses uniformly at random to sit at one of the following n + 1 places: directly to the left of one of the n customers already sitting at an occupied table, or at a new, unoccupied circular table. Clearly, each table corresponds to a block of a random partition. (For an account of the custom in actual Chinese restaurants, see table sharing). It is then immediate that the probability assigned to any particular partition (ignoring the order in which customers sit around any particular table) is



\Pr(B_n = B) = { \prod_{b\in B} (|b| -1)! \over n!} $$

where b is a block in the partition B and |b| is the size (i.e. number of elements) of b. The restaurant analogy is due to Pitman and Dubins, and first appears in print in.

This setup can be generalized to a model with two parameters, &alpha; and &theta;. At time n + 1, the next customer to arrive decides to sit at an occupied table with probability



{n_k - \alpha  \over n + \theta} $$

if nk is the number of people currently sitting at the table, and sits at an empty table with probability



{\theta + |B| \alpha \over n + \theta} $$

if |B|n is the number of occupied tables at time n. To satisfy the rules of probability it is necessary to suppose that either &alpha; < 0 and &theta; = L&alpha; for some L &isin; {1, 2, ...}; or that 0 &le; &alpha; &le; 1  and &theta; > &minus;&alpha;.

Under this model the probability assigned to any particular partition B of n is



\Pr(B_n = B) = {(\theta + \alpha)_{k+1, \alpha} \prod_{b\in B}(1-\alpha)_{|b|-1, 1} \over (\theta+1)_{n-1, 1} } $$

where



(x)_{y,z} = \prod_{i=0}^{y-1}(x+iz). $$

As before, the probability assigned to any particular partition depends only on the block sizes, so as before the random partition is exchangeable in the sense described above. The consistency property still holds, as before, by construction.

If $$ \alpha =0 $$, the probability distribution of the random partition of the integer n thus generated is the Ewens distribution with parameter &theta;, used in population genetics and the unified neutral theory of biodiversity.

The Chinese restaurant process is closely connected to Dirichlet processes and Polya's urn scheme. They are nonparametric Bayesian methods. They have been used in many applications, including modeling text, clustering biological microarray data, and detecting objects in images.

Derivation
Here is one way to derive this partition probability. Let Ci be the random block into which the number i is added, for i = 1, 2, 3, ... . Then



\begin{align} \Pr(C_i = c|C_1,\ldots,C_{i-1}) & {} = \begin{cases} \dfrac{\theta + |B| \alpha }{\theta + i -1} & \text{if }c \in \text{new block}, \\ \\ \dfrac{|b| - \alpha }{\theta + i - 1} & \text{if }c\in b, \end{cases} \end{align} $$

where &delta; is the indicator function, i.e. &delta;(A) = 1 or 0 according as the event A occurs or fails.

The probability that Bn is any particular partition of the set { 1, ..., n } is the product of these probabilities as i runs from 1 to n. Now consider the size of block b: it increases by 1 each time we add one element into it. When the last element in block b is to be added in, the block size is (|b| &minus; 1). For example, consider this sequence of choices: (generate a new block b)(join b)(join b)(join b). In the end, block b has 4 elements and the product of the numerators in the above equation gets $$\scriptstyle\theta\cdot 1\cdot 2\cdot 3\,$$. Following this logic, we obtain Pr(Bn = B) as above.