Molecular clock

The molecular clock, based on the molecular clock hypothesis (MCH), is a technique in genetics that researchers use to estimate when two species diverged. It deduces elapsed time from the number of minor differences between their DNA sequences. It is sometimes called a gene clock.
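
As a minimal illustration of what "counting differences" can mean in practice, the sketch below computes the proportion of differing sites between two aligned sequences (often called the p-distance). The function name and the toy sequences are hypothetical, and the code assumes the sequences have already been aligned to equal length.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Return the proportion of comparable aligned sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only sites where neither sequence has an alignment gap.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs)


# Toy, made-up sequences used purely for illustration.
species_1 = "ACGTACGTAC"
species_2 = "ACGTTCGTAA"
distance = p_distance(species_1, species_2)
print(distance)
```

A raw proportion like this ignores repeated changes at the same site, so it tends to underestimate the true amount of change between distantly related sequences.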

History
The notion of a "molecular clock" is attributed to Emile Zuckerkandl and Linus Pauling, who in 1962 noticed that the number of amino acid differences in hemoglobin between lineages scales roughly with divergence times as estimated from fossil evidence. They generalized this observation to assert that the rate of evolutionary change of any specified protein is approximately constant over time and across lineages. The idea has also been applied to DNA sequence evolution, particularly neutral evolution. The molecular clock hypothesis is also commonly used to explain the remarkable molecular equidistance phenomenon.

Later, Allan Wilson and Vincent Sarich built upon this work and the work of Motoo Kimura, and observed and formalized that rare spontaneous errors in DNA replication cause the mutations that drive molecular evolution. They also formalized that the accumulation of evolutionarily "neutral" differences between two sequences could be used to measure time, provided the error rate of DNA replication could be calibrated. One method of calibrating the error rate was to use, as references, pairs of groups of living species whose date of speciation was already known from the fossil record.
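
The calibration idea described above can be sketched as simple arithmetic under a strict (constant-rate) clock. This is only an illustration: the function names and all numbers are hypothetical, distances are taken to be substitutions per site, and the rate is assumed to be identical in both lineages.

```python
def calibrate_rate(distance: float, divergence_time_myr: float) -> float:
    """Estimate substitutions per site per million years along one lineage.

    `distance` is the genetic distance between a reference pair whose
    divergence time is known, for example from the fossil record.
    """
    if divergence_time_myr <= 0:
        raise ValueError("divergence time must be positive")
    # Both lineages accumulate changes after the split, hence the factor 2.
    return distance / (2.0 * divergence_time_myr)


def estimate_divergence_time(distance: float, rate: float) -> float:
    """Estimate the divergence time (in million years) of an undated pair."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate)


# Hypothetical reference pair: distance 0.10, known to have split 25 Myr ago.
rate = calibrate_rate(distance=0.10, divergence_time_myr=25.0)

# Hypothetical undated pair with distance 0.04, dated with the same rate.
time_myr = estimate_divergence_time(distance=0.04, rate=rate)
print(rate, time_myr)
```

The estimate is only as good as the assumption that the calibrated rate also applies to the undated pair.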

Calibration
Originally, it was assumed that the DNA replication error rate was constant, not just over time but across all species and every part of a genome that might be compared. Because the enzymes that replicate DNA differ only very slightly between species, the assumption might have seemed reasonable a priori. It is logically flawed, however, because the strength of natural selection is not uniform across time, space, or taxa. Had either Pauling or Zuckerkandl been evolutionary biologists rather than in vitro biologists (or, in Pauling's case, not a biologist at all), this error would probably have caught their attention, but there was insufficient overlap between evolutionary and molecular biology in their day to bring the problem to widespread notice. Only as molecular evidence accumulated was the constant-rate assumption shown to be false. Without the constant-rate assumption, the long-held molecular clock explanation of the molecular equidistance phenomenon becomes untenable.

While the MCH cannot be blindly assumed to be true, individual molecular clocks can be tested for accuracy and used in many cases. In general, they need to be calibrated against material evidence such as fossils before firm conclusions can be based on them (see also Lovette).

Since at least the early 1990s, examples of non-uniform rates of molecular evolution have been described. For many taxa it is known that there is no uniform rate of molecular evolution, not even over comparatively short periods of evolutionary time (for example, mockingbirds). Tube-nosed seabirds apparently have a molecular clock that on average runs at half the speed of that of many other birds, possibly due to long generation times, whereas many turtles have a molecular clock running at one-eighth of the speed seen in small mammals, or even slower. Effects of small population size are also likely to confound molecular clock analyses; cheetahs, for example, having gone through at least two population bottlenecks, could not be adequately studied with a molecular clock model alone. Researchers such as Ayala and, in 2006, the anthropologist Jeffrey H. Schwartz have challenged the molecular clock hypothesis more fundamentally. According to Ayala's 1999 study, five factors combine to invalidate the standard molecular clock model:


 * Changing generation times (A mutation generally becomes fixed only from one generation to the next. The shorter the generation time, the more mutations can become fixed in a given period)
 * Population size (Apart from the effects of small population size, genetic diversity will "bottom out" as populations become larger, because the fitness advantage of any one mutation becomes smaller)
 * Species-specific differences (due to differing metabolism, ecology, evolutionary history, and so on)
 * Evolving functions of the encoded protein (can be ameliorated by utilizing non-coding DNA sequences or emphasizing silent mutations)
 * Changes in the intensity of natural selection
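
The generation-time effect in the first item above can be made concrete with a small calculation. The sketch assumes, purely for illustration, that the rate per generation is the same in both lineages; the function name and all numbers are hypothetical.

```python
def per_year_rate(rate_per_generation: float,
                  generation_time_years: float) -> float:
    """Convert a per-generation rate into a per-year rate."""
    if generation_time_years <= 0:
        raise ValueError("generation time must be positive")
    return rate_per_generation / generation_time_years


# Hypothetical per-generation rate shared by both lineages.
mu = 1e-8

short_generation = per_year_rate(mu, generation_time_years=1.0)
long_generation = per_year_rate(mu, generation_time_years=2.0)

# Doubling the generation time halves the clock speed measured in years.
ratio = long_generation / short_generation
print(ratio)
```

Under this simplification, a lineage whose generations are twice as long would appear to have a clock running at half speed when time is measured in years.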

Molecular clock users have developed workarounds based on a number of statistical approaches, including maximum likelihood techniques and, later, Bayesian modeling. In particular, models that take rate variation across lineages into account have been proposed in order to obtain better estimates of divergence times (and of other parameters that may be estimated from substitution rates, such as effective population size). These models are called relaxed molecular clocks because they represent an intermediate position between the 'strict' molecular clock hypothesis and Felsenstein's many-rates model. They are made possible by Markov chain Monte Carlo (MCMC) techniques that explore a weighted range of tree topologies and simultaneously estimate the parameters of the chosen substitution model. These methods are still based on statistical inference rather than direct evidence, so, strictly speaking, even a relaxed molecular clock can only support, never prove, a scientific hypothesis. This problem is addressed by using the fossil record, which is quite often good and well documented enough to provide hard evidence, to calibrate the molecular clock. Alternatively, in viral phylogenetics and ancient DNA studies, two areas of evolutionary biology where it is possible to sample sequences over an evolutionary timescale, the dates of the sequences themselves can be used to calibrate the molecular clock.
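
Calibration from dated sequences can be illustrated with a deliberately simple technique: an ordinary least-squares regression of root-to-tip genetic distance against sampling date. This is not the Bayesian or MCMC approach described above, and it assumes a strict clock. The function name and the data are hypothetical, and the distances are assumed to have been measured beforehand on a rooted tree.

```python
def fit_clock_from_dated_tips(sampling_years, root_to_tip_distances):
    """Fit distance = rate * year + intercept by ordinary least squares.

    Returns (rate, root_year): the slope in substitutions per site per
    year, and the year at which the fitted line crosses zero distance.
    """
    n = len(sampling_years)
    if n != len(root_to_tip_distances) or n < 2:
        raise ValueError("need at least two (year, distance) pairs")
    mean_t = sum(sampling_years) / n
    mean_d = sum(root_to_tip_distances) / n
    sxx = sum((t - mean_t) ** 2 for t in sampling_years)
    if sxx == 0:
        raise ValueError("sampling dates must not all be identical")
    sxy = sum(
        (t - mean_t) * (d - mean_d)
        for t, d in zip(sampling_years, root_to_tip_distances)
    )
    rate = sxy / sxx
    if rate <= 0:
        raise ValueError("no positive temporal signal in the data")
    intercept = mean_d - rate * mean_t
    # The x-intercept is a rough estimate of the age of the root.
    root_year = -intercept / rate
    return rate, root_year


# Made-up sampling years and root-to-tip distances for illustration only.
years = [1990, 2000, 2010, 2020]
distances = [0.010, 0.020, 0.030, 0.040]
rate, root_year = fit_clock_from_dated_tips(years, distances)
print(rate, root_year)
```

Real data rarely fall on a straight line; scatter around the fitted line is one informal indication that rates vary among lineages and that a relaxed clock model may be more appropriate.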

Uses
The molecular clock technique is an important tool in molecular systematics, the use of molecular genetics information to determine the correct scientific classification of organisms.

Knowledge of an approximately constant rate of molecular evolution in particular sets of lineages also helps to date phylogenetic events that are not documented by fossils, such as the divergence of living taxa and the formation of the phylogenetic tree. In these cases, however, and especially over long stretches of time, the MCH can be considered null and void for practical purposes; such estimates are inevitably very crude and may be off by 50% or more.