Quasispecies model

The quasispecies model is a description of the process of the Darwinian evolution of self-replicating entities within the framework of physical chemistry. It is useful mainly for providing a qualitative understanding of the evolutionary processes of self-replicating macromolecules such as RNA or DNA, or of simple asexual organisms such as bacteria or viruses (see also viral quasispecies), and it helps explain some aspects of the early stages of the origin of life. Quantitative predictions based on this model are difficult because the parameters that serve as its input are hard to obtain from actual biological systems. The quasispecies model was put forward by Manfred Eigen and Peter Schuster (see note 1), based on initial work done by Eigen (see note 2).

General description
The model rests on four assumptions:
 * 1) The self-replicating entities can be represented as sequences composed of a small number of building blocks--for example, sequences of RNA consisting of the four bases adenine, guanine, cytosine, and uracil.
 * 2) New sequences enter the system solely as the result of a copy process, either correct or erroneous, of other sequences that are already present.
 * 3) The substrates, or raw materials, necessary for ongoing replication are always present in sufficient quantity. Excess sequences are washed away in an outgoing flux.
 * 4) Sequences may decay into their building blocks. The probability of decay does not depend on the sequences' age; old sequences are just as likely to decay as young sequences.

In the quasispecies model, mutations occur through errors made in the process of copying already existing sequences. Further, selection arises because different types of sequences tend to replicate at different rates, which leads to the suppression of sequences that replicate more slowly in favor of sequences that replicate faster. However, the quasispecies model does not predict the ultimate extinction of all but the fastest replicating sequence. Although the sequences that replicate more slowly cannot sustain their abundance level by themselves, they are constantly replenished as sequences that replicate faster mutate into them. At equilibrium, removal of slowly replicating sequences due to decay or outflow is balanced by replenishing, so that even relatively slowly replicating sequences can remain present in finite abundance.

Due to the ongoing production of mutant sequences, selection does not act on single sequences, but on mutational "clouds" of closely related sequences, referred to as quasispecies. In other words, the evolutionary success of a particular sequence depends not only on its own replication rate, but also on the replication rates of the mutant sequences it produces, and on the replication rates of the sequences of which it is a mutant. As a consequence, the sequence that replicates fastest may even disappear completely in selection-mutation equilibrium, in favor of more slowly replicating sequences that are part of a quasispecies with a higher average growth rate (see note 3). Mutational clouds as predicted by the quasispecies model have been observed in RNA viruses and in in vitro RNA replication (see note 4).

The mutation rate and the general fitness of the molecular sequences and their neighbors are crucial to the formation of a quasispecies. If the mutation rate is zero, there is no exchange by mutation, and each sequence is its own species. If the mutation rate is too high, exceeding what is known as the error threshold, the quasispecies will break down and be dispersed over the entire range of available sequences.

Mathematical description
A simple mathematical model for a quasispecies is as follows: let there be $$S$$ possible sequences and let there be $$n_i$$ organisms with sequence i. Suppose that each of these organisms asexually gives rise to $$A_i$$ offspring. Some are duplicates of their parent, having sequence i, but some are mutants and have some other sequence. Let the mutation rate $$q_{ij}$$ be the probability that a type-j parent will produce a type-i organism. Then the expected number of type-i organisms produced by any one type-j parent is $$w_{ij}=A_j q_{ij}$$, and the total number of type-i organisms after the first round of reproduction, written $$n'_i$$, is


 * $$n'_i=\sum_j w_{ij}n_j\,$$

where $$\sum_i q_{ij}=1\,$$. Sometimes a death rate term $$D_i$$ is included so that:


 * $$w_{ij}=A_j q_{ij}-D_i\delta_{ij}\,$$

where $$\delta_{ij}$$ is equal to 1 when i=j and is zero otherwise. Note that the population after n generations can be found by replacing w in the above formula with the n-th matrix power of w.
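The construction of w and the n-th generation rule can be sketched in a few lines of Python with NumPy. This is only an illustration: the three-type system, the numerical values of `A`, `q` and `D`, and the helper names `fitness_matrix` and `generation` are assumptions made for the example and do not come from the model's literature.

```python
import numpy as np

# Hypothetical illustration values (not from the text): 3 sequence types.
A = np.array([2.0, 1.5, 1.0])   # A[j]: offspring per parent of type j
# q[i, j]: probability that a type-j parent yields a type-i offspring.
q = np.array([
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
])
D = np.array([0.1, 0.1, 0.1])   # optional death rates D[i]

# Each column of q must sum to 1 (sum over i of q_ij = 1).
assert np.allclose(q.sum(axis=0), 1.0)


def fitness_matrix(A, q, D=None):
    """Return w with w[i, j] = A[j] * q[i, j] - D[i] * delta_ij."""
    w = q * A[np.newaxis, :]    # scales column j by A[j]
    if D is not None:
        w = w - np.diag(D)      # death term acts only on the diagonal
    return w


def generation(w, n0, t):
    """Population vector after t rounds of reproduction: (w^t) n0."""
    return np.linalg.matrix_power(w, t) @ n0


w = fitness_matrix(A, q, D)
n0 = np.array([100.0, 0.0, 0.0])   # start with type 0 only
n1 = generation(w, n0, 1)          # one round: n' = w n
n5 = generation(w, n0, 5)          # five rounds via the matrix power
```

Here `n1` equals `w @ n0`, and `n5` agrees with applying the one-generation update five times in a row.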

The update $$n'_i=\sum_j w_{ij}n_j$$ is just a system of linear equations. The usual way to solve such a system is to first diagonalize the w matrix. The diagonal entries of the diagonalized matrix are the eigenvalues of w, and each corresponds to an eigenvector of w, that is, to a certain linear combination of a certain subset of sequences. These subsets of sequences are the quasispecies. After very many generations, only the eigenvector with the largest eigenvalue will prevail, and it is this quasispecies that will eventually dominate. The components of this eigenvector give the relative abundance of each sequence at equilibrium.
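As a rough numerical sketch of this procedure, the dominant eigenvector can be obtained with `numpy.linalg.eig` and cross-checked by repeated application of w. The two-type matrix below and the helper name `dominant_quasispecies` are hypothetical, chosen only for illustration, and the normalization assumes a matrix with non-negative entries whose dominant eigenvector has components of a single sign.

```python
import numpy as np


def dominant_quasispecies(w):
    """Return (largest eigenvalue, its eigenvector scaled to sum to 1)."""
    eigenvalues, eigenvectors = np.linalg.eig(w)
    k = np.argmax(eigenvalues.real)    # index of the largest eigenvalue
    v = eigenvectors[:, k].real        # eigenvectors are stored as columns
    return eigenvalues[k].real, v / v.sum()


# Hypothetical example: type 0 replicates faster but leaks mutants to type 1.
w = np.array([
    [1.8, 0.1],
    [0.2, 0.9],
])
growth, composition = dominant_quasispecies(w)

# Cross-check: iterate n' = w n, renormalizing each round to avoid overflow.
n = np.array([1.0, 1.0])
for _ in range(200):
    n = w @ n
    n = n / n.sum()
```

After enough rounds the iterated vector `n` matches `composition`, the relative abundances at equilibrium, while `growth` is the per-generation growth factor of the whole quasispecies.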

Alternative formulations
The quasispecies formulae may be expressed as a set of linear differential equations. Setting $$\dot{n}_i\approx n'_i-n_i$$ we can write:


 * $$\dot{n}_i=\sum_j w_{ij}n_j-n_i\,$$

The quasispecies equations are usually expressed in terms of concentrations $$x_i$$ where


 * $$x_i\ \stackrel{\mathrm{def}}{=}\ \frac{n_i}{\sum_j n_j}$$.
 * $$x'_i\ \stackrel{\mathrm{def}}{=}\ \frac{n'_i}{\sum_j n'_j}$$.

The above equations for the quasispecies then become for the discrete version:


 * $$x'_i = \frac{\sum_j w_{ij}x_j}{\sum_{ij} w_{ij}x_j}$$

or, for the continuum version:


 * $$\dot{x}_i =\sum_j w_{ij}x_j-x_i\sum_{ij}w_{ij}x_j.$$
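The continuum form can be checked with a short derivation. As a sketch, write $$N=\sum_j n_j$$ for the total population (a symbol introduced here for convenience, not used elsewhere in this article). Summing $$\dot{n}_i=\sum_j w_{ij}n_j-n_i$$ over i and dividing by $$N$$ gives

 * $$\frac{\dot{N}}{N}=\sum_{ij}w_{ij}x_j-1$$

and differentiating $$x_i=n_i/N$$ then yields

 * $$\dot{x}_i=\frac{\dot{n}_i}{N}-x_i\frac{\dot{N}}{N}=\left(\sum_j w_{ij}x_j-x_i\right)-x_i\left(\sum_{ij}w_{ij}x_j-1\right)=\sum_j w_{ij}x_j-x_i\sum_{ij}w_{ij}x_j$$

The two $$x_i$$ terms cancel, which is why the linear decay term of the equation for $$\dot{n}_i$$ does not appear in the equation for the concentrations.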

A simple example
The quasispecies concept can be illustrated by a simple system consisting of 4 sequences. Sequence 1 is [0,0], and sequences [0,1], [1,0] and [1,1] are numbered 2, 3 and 4 respectively. Suppose the [0,0] sequence never mutates and always produces a single offspring, while each of the other 3 sequences produces, on average, $$1-k$$ replicas of itself and $$k$$ of each of the other two types, where $$0\le k\le 1$$. The w matrix is then:


 * $$\mathbf{w}=\begin{bmatrix} 1&0&0&0\\ 0&1-k&k&k\\ 0&k&1-k&k\\ 0&k&k&1-k \end{bmatrix}$$

The diagonalized matrix is


 * $$\mathbf{W}=\begin{bmatrix} 1-2k&0&0&0\\ 0&1-2k&0&0\\ 0&0&1&0\\ 0&0&0&1+k \end{bmatrix}$$

and the eigenvectors corresponding to these eigenvalues are:


{| class="wikitable"
! Eigenvalue !! Eigenvector
|-
| 1-2k || [0,-1,0,1]
|-
| 1-2k || [0,-1,1,0]
|-
| 1 || [1,0,0,0]
|-
| 1+k || [0,1,1,1]
|}

For $$k>0$$, only the eigenvalue $$1+k$$ is larger than unity. For the n-th generation, the corresponding eigenvalue will be $$(1+k)^n$$ and so will increase without bound as time goes by. This eigenvalue corresponds to the eigenvector [0,1,1,1], which represents the quasispecies consisting of sequences 2, 3, and 4, which will be present in equal numbers after a very long time. Since population numbers cannot be negative, the first two quasispecies are not legitimate. The third quasispecies consists of only the non-mutating sequence 1. Thus, even though sequence 1 is the most fit in the sense that it reproduces more of itself than any other sequence, the quasispecies consisting of the other three sequences will eventually dominate.
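The example can be checked numerically. The sketch below assumes the illustrative value k = 0.1 and an arbitrary starting population dominated by sequence 1; the helper name `example_matrix` is hypothetical.

```python
import numpy as np


def example_matrix(k):
    """The 4x4 w matrix of the example for a given k."""
    return np.array([
        [1.0, 0.0,   0.0,   0.0],
        [0.0, 1 - k, k,     k],
        [0.0, k,     1 - k, k],
        [0.0, k,     k,     1 - k],
    ])


k = 0.1                      # illustrative value, any 0 < k <= 1 would do
w = example_matrix(k)

# w is symmetric here, so eigvalsh applies; it returns ascending eigenvalues.
eigenvalues = np.linalg.eigvalsh(w)   # expected: 1-2k, 1-2k, 1, 1+k

# Start with sequence 1 in the majority and iterate n' = w n.
n = np.array([10.0, 1.0, 0.0, 0.0])
for _ in range(200):
    n = w @ n
x = n / n.sum()              # relative abundances after 200 generations
```

With these assumptions the share of sequence 1 in `x` becomes negligible, while sequences 2, 3 and 4 each approach one third, in line with the eigenvector [0,1,1,1].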