Human genetic variation

Human genetic variation is the natural variation in gene frequencies observed between the genomes of individuals or groups of humans. Variation occurs at both the individual level (differences between individual people) and at the geographic level, i.e. differences between groups of people living in different parts of the world (ethnic groups, races).

In genetics there may be multiple variants of any given gene (polymorphism); these variants are called alleles. Any individual human has only two copies of any given gene, one inherited from their mother and the other from their father, but many more versions of the gene may exist in the population. Genetic variation is also distributed geographically: the frequency of any given allele may be greater in humans from one geographic region than in humans from another.

The study of geographically distributed human genetic variation has both evolutionary significance and medical applications. It can help scientists understand ancient human population migrations as well as how different human groups are biologically related to one another. From a medical perspective, the study of human genetic variation may be important because some disease-causing alleles occur at a greater frequency in people from specific geographic regions.

There are at least two reasons why genetic variation is geographically distributed:
 * Natural selection may favour alleles that confer an adaptive advantage in a specific environment. For example, dark skin pigmentation protects against high levels of ultraviolet (UV) radiation, whereas a low level of melanin in the skin may confer an advantage in regions with low levels of UV light. Alleles under selection are likely to be common only in those geographic regions where they confer an advantage.
 * The second main cause of geographically distributed genetic variation is non-uniform sampling of a population, chiefly through the founder effect. When a small group of individuals migrates away from a larger group and founds a new population, the migrants carry only a subset of the parental population's variation and so are not genetically representative of it (sampling error). Small founding populations are also subject to genetic drift, which may further alter allele frequencies. An example is the human migration out of Africa: it has been theorised that the migrants carried only a small fraction of the genetic variation present in East Africa, and that this is the cause of the lower levels of diversity observed in all indigenous non-African humans.

More recent neutral polymorphisms caused by mutation are likely to be relatively geographically localised, while older polymorphisms are more likely to be shared by all human groups. A large majority of the observed genetic variation is nevertheless distributed within any geographic region rather than between regions, though it is usually possible to accurately identify the geographic origins of any individual's ancestors by genetic means.
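The sampling error behind the founder effect can be illustrated with a small simulation. This is only a sketch under stated assumptions: the allele names and frequencies of the source population are hypothetical, and founders are drawn as independent gene copies at a single locus.

```python
import random
from collections import Counter

def heterozygosity(freqs):
    """Expected heterozygosity: chance that two random gene copies differ."""
    return 1.0 - sum(f * f for f in freqs)

def sample_founders(source_freqs, n_founders, rng):
    """Draw 2 * n_founders gene copies from the source allele frequencies."""
    alleles = list(source_freqs)
    weights = [source_freqs[a] for a in alleles]
    copies = rng.choices(alleles, weights=weights, k=2 * n_founders)
    counts = Counter(copies)
    # Alleles that were not drawn at all are simply absent from the result.
    return {a: c / (2 * n_founders) for a, c in counts.items()}

# Hypothetical source population: one common allele and several rare ones.
source = {"A": 0.70, "B": 0.15, "C": 0.08, "D": 0.04, "E": 0.03}
rng = random.Random(1)  # fixed seed so the run is repeatable
founders = sample_founders(source, n_founders=10, rng=rng)

print("alleles in source:  ", len(source))
print("alleles in founders:", len(founders))
print("H source:  ", round(heterozygosity(source.values()), 3))
print("H founders:", round(heterozygosity(founders.values()), 3))
```

With few founders, rare alleles are often missing from the sample and the remaining frequencies can differ noticeably from those of the source; increasing `n_founders` makes the sample more representative.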

Distribution of variation
Explaining the differences in many patterns of genetic variation between humans and other species awaits additional genetic studies of human populations and nonhuman species. But the data gathered to date suggest that human variation exhibits several distinctive characteristics. First, compared with many other mammalian species, humans are genetically less diverse, a counterintuitive finding given our large population and worldwide distribution (Li and Sadler 1991; Kaessmann et al. 2001). For example, the chimpanzee subspecies living just in central and western Africa have higher levels of diversity than do humans (Ebersberger et al. 2002; Yu et al. 2003; Fischer et al. 2004).

Two random humans are expected to differ at approximately 1 in 1000 nucleotides, whereas two random chimpanzees differ at approximately 1 in 500. Because the human genome contains approximately 3 billion nucleotides, two humans differ on average at approximately 3 million nucleotides. Most of these single nucleotide polymorphisms (SNPs) are neutral, but some are functional and influence the phenotypic differences between humans. It is estimated that about 10 million SNPs in which the rarer allele has a frequency of at least 1% exist in human populations (see International HapMap Project).
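The arithmetic behind the figure of 3 million differences can be checked directly. The sketch below uses only the approximate values quoted above; the function and constant names are illustrative.

```python
GENOME_SIZE = 3_000_000_000  # approximate nucleotides in the human genome (from the text)

def expected_differences(genome_size, one_in):
    """Expected differing sites when two genomes differ at 1 in `one_in` nucleotides."""
    return genome_size / one_in

human = expected_differences(GENOME_SIZE, 1000)  # two random humans
print(f"Expected differences between two humans: {human:,.0f}")

# Chimpanzees differ at 1 in 500 sites, so their per-site rate relative to humans is:
print("Chimpanzee / human per-site rate:", 1000 / 500)
```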

The distribution of variants within and among human populations also differs from that of many other species. The details of this distribution are impossible to describe succinctly because of the difficulty of defining a "population," the clinal nature of variation, and heterogeneity across the genome (Long and Kittles 2003). In general, however, 5%–15% of genetic variation occurs between large groups living on different continents, with the remaining majority of the variation occurring within such groups (Lewontin 1972; Jorde et al. 2000a; Hinds et al. 2005). This distribution of genetic variation differs from the pattern seen in many other mammalian species, for which existing data suggest greater differentiation between groups (Templeton 1998; Kittles and Weiss 2003).
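The share of variation that lies between groups is commonly summarised by a fixation index (F_ST). As a minimal sketch, assuming a single biallelic locus, equally weighted groups, and hypothetical allele frequencies, it can be computed from expected heterozygosities as follows; this is not necessarily the method used in the studies cited above.

```python
def biallelic_heterozygosity(p):
    """Expected heterozygosity at a two-allele locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)

def fst(subpop_freqs):
    """F_ST = (H_T - H_S) / H_T for equally weighted subpopulations."""
    p_bar = sum(subpop_freqs) / len(subpop_freqs)
    h_total = biallelic_heterozygosity(p_bar)       # H_T: pooled population
    if h_total == 0.0:
        return 0.0                                  # locus is not variable at all
    h_within = sum(biallelic_heterozygosity(p) for p in subpop_freqs) / len(subpop_freqs)
    return (h_total - h_within) / h_total           # share of variation between groups

# Hypothetical allele frequencies in three continental groups.
print(round(fst([0.3, 0.5, 0.7]), 3))
```

A value near 0 means almost all variation is found within groups; a value of 1 means the groups are fixed for different alleles.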

Our history as a species also has left genetic signals in regional populations. For example, in addition to having higher levels of genetic diversity, populations in Africa tend to have lower amounts of linkage disequilibrium than do populations outside Africa, partly because of the larger size of human populations in Africa over the course of human history and partly because the number of modern humans who left Africa to colonize the rest of the world appears to have been relatively low (Gabriel et al. 2002). In contrast, populations that have undergone dramatic size reductions or rapid expansions in the past and populations formed by the mixture of previously separate ancestral groups can have unusually high levels of linkage disequilibrium (Nordborg and Tavare 2002).
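Linkage disequilibrium between two loci is often quantified by the coefficient D and the squared correlation r². The sketch below assumes two biallelic loci and hypothetical haplotype frequencies; the haplotype labels are illustrative.

```python
def linkage_disequilibrium(hap_freqs):
    """Return (D, r^2) from two-locus haplotype frequencies.

    hap_freqs maps the haplotypes 'AB', 'Ab', 'aB', 'ab' to their frequencies.
    """
    p_hap_ab = hap_freqs.get("AB", 0.0)
    p_a = p_hap_ab + hap_freqs.get("Ab", 0.0)   # frequency of allele A
    p_b = p_hap_ab + hap_freqs.get("aB", 0.0)   # frequency of allele B
    d = p_hap_ab - p_a * p_b                    # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2

# Alleles always travel together: complete linkage disequilibrium.
print(linkage_disequilibrium({"AB": 0.5, "ab": 0.5}))
# All four haplotypes equally common: linkage equilibrium.
print(linkage_disequilibrium({"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25}))
```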

In the field of population genetics, it is believed that the distribution of neutral polymorphisms among contemporary humans reflects human demographic history. Humans are thought to have passed through a population bottleneck before a rapid expansion coinciding with migrations out of Africa, leading to an African-Eurasian divergence around 100,000 years ago (ca. 5,000 generations), followed by a European-Asian divergence about 40,000 years ago (ca. 2,000 generations). Richard G. Klein, Nicholas Wade and Spencer Wells, among others, have postulated that modern humans did not leave Africa and successfully colonize the rest of the world until as recently as 60,000 to 50,000 years B.P., which would make the subsequent population splits correspondingly more recent.

The rapid expansion of a previously small population has two important effects on the distribution of genetic variation. First, the so-called founder effect occurs when founder populations bring only a subset of the genetic variation from their ancestral population. Second, as founders become more geographically separated, the probability that two individuals from different founder populations will mate becomes smaller. The effect of this assortative mating is to reduce gene flow between geographical groups and to increase the genetic distance between them. The expansion of humans from Africa affected the distribution of genetic variation in two further ways. First, smaller (founder) populations experience greater genetic drift, because chance fluctuations in the frequencies of neutral polymorphisms are larger in small populations. Second, new polymorphisms that arose in one group were less likely to be transmitted to other groups, as gene flow was restricted.
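The dependence of drift on population size can be sketched with the standard Wright-Fisher model, which is an assumption introduced here for illustration rather than something the text specifies. In that model the per-generation sampling variance of a neutral allele frequency p in a diploid population of size N is p(1 - p)/(2N).

```python
import random

def drift_variance(p, n):
    """One-generation sampling variance of allele frequency p, diploid size n."""
    return p * (1.0 - p) / (2 * n)

def wright_fisher(p0, n, generations, rng):
    """Simulate neutral drift; returns the allele frequency in each generation."""
    freqs = [p0]
    p = p0
    for _ in range(generations):
        # Each of the 2n gene copies in the next generation is drawn at random.
        copies = sum(1 for _ in range(2 * n) if rng.random() < p)
        p = copies / (2 * n)
        freqs.append(p)
    return freqs

rng = random.Random(42)  # fixed seed so the run is repeatable
for n in (10, 1000):
    final = wright_fisher(0.5, n, generations=50, rng=rng)[-1]
    print(f"N={n}: variance per generation {drift_variance(0.5, n):.6f}, "
          f"frequency after 50 generations {final:.3f}")
```

In the small population the frequency typically wanders far from its starting value and may reach 0 or 1, whereas in the large population it stays close to 0.5.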

Many other geographic, climatic, and historical factors have contributed to the patterns of human genetic variation seen in the world today. For example, population processes associated with colonization, periods of geographic isolation, socially reinforced endogamy, and natural selection all have affected allele frequencies in certain populations (Jorde et al. 2000b; Bamshad and Wooding 2003). In general, however, the recency of our common ancestry and continual gene flow among human groups have limited genetic differentiation in our species.

Substructure in the human population


New data on human genetic variation has reignited the debate surrounding race. Most of the controversy surrounds the question of how to interpret this new data, and whether conclusions based on existing data are sound. A large majority of researchers endorse the view that continental groups do not constitute different subspecies. However, other researchers still debate whether evolutionary lineages should rightly be called "races". These questions are particularly pressing for biomedicine, where self-described race is often used as an indicator of ancestry (see race in biomedicine below).

Although the genetic differences among human groups are relatively small, differences in certain genes such as Duffy, ABCC11 and SLC24A5, which serve as ancestry-informative markers (AIMs), can nevertheless be used to reliably situate many individuals within broad, geographically based groupings or self-identified races. For example, computer analyses of hundreds of polymorphic loci sampled in globally distributed populations have revealed the existence of genetic clustering that roughly is associated with groups that historically have occupied large continental and subcontinental regions (Rosenberg et al. 2002; Bamshad et al. 2003).

Some commentators have argued that these patterns of variation provide a biological justification for the use of traditional racial categories. They argue that the continental clusterings correspond roughly with the division of human beings into sub-Saharan Africans; Europeans, Western Asians, Southern Asians and Northern Africans; Eastern Asians, Southeast Asians, Polynesians and Native Americans; and other inhabitants of Oceania (Melanesians, Micronesians and Australian Aborigines) (Risch et al. 2002). Other observers disagree, saying that the same data undercut traditional notions of racial groups (King and Motulsky 2002; Calafell 2003; Tishkoff and Kidd 2004). They point out, for example, that major populations considered races or subgroups within races do not necessarily form their own clusters. Thus, samples taken from India and Pakistan affiliate with Europeans or eastern Asians rather than separating into a distinct cluster.

Furthermore, because human genetic variation is clinal, many individuals affiliate with two or more continental groups. Thus, the genetically based "biogeographical ancestry" assigned to any given person generally will be broadly distributed and will be accompanied by sizable uncertainties (Pfaff et al. 2004).

In many parts of the world, groups have mixed in such a way that many individuals have relatively recent ancestors from widely separated regions. Although genetic analyses of large numbers of loci can produce estimates of the percentage of a person's ancestors coming from various continental populations (Shriver et al. 2003; Bamshad et al. 2004), these estimates may assume a false distinctiveness of the parental populations, since human groups have exchanged mates from local to continental scales throughout history (Cavalli-Sforza et al. 1994; Hoerder 2002). Even with large numbers of markers, information for estimating admixture proportions of individuals or groups is limited, and estimates typically will have wide confidence intervals (CIs) (Pfaff et al. 2004).
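Why such estimates carry wide confidence intervals can be seen in a deliberately simplified sketch. It assumes a single marker, a group-level (not individual) estimate, parental allele frequencies that are known without error, and a normal approximation; all numbers are hypothetical, and the cited studies use more elaborate multi-locus methods.

```python
import math

def admixture_estimate(p_mixed, p_source_a, p_source_b, n_individuals, z=1.96):
    """Estimate the share of ancestry from source A, with an approximate 95% CI.

    Uses m = (p_mixed - p_B) / (p_A - p_B) and propagates only the sampling
    error of p_mixed, measured in n_individuals diploid people.
    """
    delta = p_source_a - p_source_b
    if delta == 0:
        raise ValueError("locus is uninformative: source frequencies are equal")
    m = (p_mixed - p_source_b) / delta
    se_p = math.sqrt(p_mixed * (1.0 - p_mixed) / (2 * n_individuals))
    se_m = se_p / abs(delta)  # a small frequency difference inflates the error
    return m, (m - z * se_m, m + z * se_m)

# Informative marker: source frequencies differ strongly (0.8 vs 0.2).
print(admixture_estimate(0.4, 0.8, 0.2, n_individuals=50))
# Weakly informative marker: source frequencies are similar (0.5 vs 0.3).
print(admixture_estimate(0.4, 0.5, 0.3, n_individuals=50))
```

Because the interval width scales with the inverse of the frequency difference between the source populations, markers that differ little between groups contribute little information, and treating the parental frequencies as exact understates the true uncertainty.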

Variation in phenotype
The distribution of many physical traits resembles the distribution of genetic variation within and between human populations (American Association of Physical Anthropologists 1996; Keita and Kittles 1997). For example, ~90% of the variation in human head shapes occurs within continental groups, and ~10% separates groups, with a greater variability of head shape among individuals with recent African ancestors (Relethford 2002).

Variation in a trait under selection: skin color
A prominent exception to the common distribution of physical characteristics within and among groups is skin color. Approximately 10% of the variance in skin color occurs within groups, and ~90% occurs between groups (Relethford 2002). This distribution of skin color and its geographic patterning, with people whose ancestors lived predominantly near the equator having darker skin than those whose ancestors lived predominantly at higher latitudes, indicate that this attribute has been under strong selective pressure. Darker skin appears to be strongly selected for in equatorial regions to prevent sunburn, skin cancer, the photolysis of folate, and damage to sweat glands (Sturm et al. 2001; Rees 2003). A leading hypothesis for the selection of lighter skin in higher latitudes is that it enables the body to form greater amounts of vitamin D, which helps prevent rickets (Jablonski 2004). Evidence for this includes the finding that a substantial portion of the difference in skin color between Europeans and Africans resides in a single gene, SLC24A5: its threonine-111 allele was found at a frequency of 98.7 to 100% in several European samples, while the alanine-111 form was found in 93 to 100% of samples of Africans, East Asians and Indigenous Americans (Lamason et al. 2005). However, the vitamin D hypothesis is not universally accepted (Aoki 2002), and lighter skin in high latitudes may correspond simply to an absence of selection for dark skin (Harding et al. 2000). Melanin, the pigment responsible for skin color, is located in the epidermis, and its production depends on inherited patterns of gene expression.

Because skin color has been under strong selective pressure, similar skin colors can result from convergent adaptation rather than from genetic relatedness. Sub-Saharan Africans, tribal populations from southern India, and Indigenous Australians have similar skin pigmentation, but genetically they are no more similar than are other widely separated groups. Furthermore, in some parts of the world in which people from different regions have mixed extensively, the connection between skin color and ancestry has been substantially weakened (Parra et al. 2004). In Brazil, for example, skin color is not closely associated with the percentage of recent African ancestors a person has, as estimated from an analysis of genetic variants differing in frequency among continental groups (Parra et al. 2003).

Considerable speculation has surrounded the possible adaptive value of other physical features characteristic of groups, such as the constellation of facial features observed in many eastern and northeastern Asians (Guthrie 1996). However, any given physical characteristic generally is found in multiple groups (Lahr 1996), and demonstrating that environmental selective pressures shaped specific physical features will be difficult, since such features may have resulted from sexual selection for individuals with certain appearances or from genetic drift (Roseman 2004).