Genetic history of Europe

This article describes demographic and genetic flows into and within European populations, as a product of human migrations.

Relation to other populations
A study by Luigi Luca Cavalli-Sforza of the Stanford University School of Medicine, using 120 blood polymorphisms, provides information on the genetic relatedness of the various continental populations. Genetic distance is a measure used to quantify the genetic differences between two populations. It is based on the principle that two populations that share similar frequencies of a trait are more closely related than populations that have more divergent frequencies of that trait. In its simplest form it is the difference in the frequency of a particular trait between two populations. For example, the frequency of Rh-negative individuals is 50.4% among Basques, 41.2% in France and 41.1% in England. Thus, for the Rh-negative trait, the genetic difference between the Basques and the French is 9.2 percentage points, and the genetic difference between the French and the English is 0.1 percentage points. Averaged over several traits, this can give the overall genetic relatedness of various populations.
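
The arithmetic in the Rh-negative example can be sketched in a few lines of Python. This is only an illustration of the "simplest form" described above, not the distance measure actually used in the study, which the text does not specify. The function and variable names are hypothetical, and the only data used are the three frequencies quoted above.

```python
# Rh-negative frequencies (percent) as quoted in the text above.
RH_NEGATIVE = {"Basques": 50.4, "French": 41.2, "English": 41.1}


def trait_distance(freq_a, freq_b):
    """Absolute difference between two trait frequencies, in percentage points."""
    return abs(freq_a - freq_b)


def mean_distance(traits_a, traits_b):
    """Average the per-trait differences over the traits both populations share.

    Each argument is a dict mapping a trait name to its frequency (percent).
    This plain average is an assumption made for illustration only.
    """
    shared = sorted(set(traits_a) & set(traits_b))
    if not shared:
        raise ValueError("the two populations have no traits in common")
    total = sum(trait_distance(traits_a[t], traits_b[t]) for t in shared)
    return total / len(shared)


basque_french = trait_distance(RH_NEGATIVE["Basques"], RH_NEGATIVE["French"])
french_english = trait_distance(RH_NEGATIVE["French"], RH_NEGATIVE["English"])
print(f"Basques vs French: {basque_french:.1f} percentage points")
print(f"French vs English: {french_english:.1f} percentage points")
```

With more traits available, `mean_distance` would take one dictionary of trait frequencies per population and return the averaged difference. With a single trait it reduces to `trait_distance`.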

According to the study, all non-African populations are more closely related to each other than to Africans, consistent with the hypothesis that all non-Africans are descended from a single African population. Europeans are most closely related to East Asians and least closely related to Africans. However, of all the non-African populations, Europeans are the closest to Africans: the genetic distance from Africa to Europe (16.6) is shorter than the genetic distance from Africa to East Asia (20.6), and much shorter than the genetic distance from Africa to Australia. Cavalli-Sforza proposes that the simplest explanation for this short genetic distance is that substantial gene exchange has taken place between the nearby continents. He also proposes that both Asian and African populations contributed to the settlement of Europe, which began 40,000 years ago. The overall contributions from Asia and Africa were estimated to be around two-thirds and one-third, respectively. Overall, genetic variation in Europe is about a third of that found on other continents.

European population substructure


In 2006, an autosomal analysis comparing samples from various European populations concluded that “there is a consistent and reproducible distinction between ‘northern’ and ‘southern’ European population groups”. Most individual participants with southern European ancestry (Italian, Greek, Portuguese, and Spanish) had >85% membership in the ‘southern’ population, and most northern, western, eastern, and central Europeans had >90% membership in the ‘northern’ population group. Participants of Ashkenazi as well as Sephardic Jewish origin also showed >85% membership in the ‘southern’ population, consistent with a later Mediterranean origin of these ethnic groups.

Somewhat contradicting these findings, a similar 2007 study found that the most important genetic differentiation in Europe occurs on a line from the north to the south-east (northern Europe to the Balkans), with another east-west axis of differentiation across Europe. Its findings were consistent with earlier mtDNA and Y-chromosomal results supporting the theory that modern Iberians hold the most ancient European genetic ancestry, and with results separating Basques and Sami from other European populations. It confirmed that the English and Irish cluster with other Northern and Eastern Europeans such as Germans and Poles, while some Basque and Italian individuals also clustered with Northern Europeans. Despite these stratifications, it noted the unusually high degree of European homogeneity: "there is low apparent diversity in Europe with the entire continent-wide samples only marginally more dispersed than single population samples elsewhere in the world."

Human Y-chromosome DNA haplogroups
Three major Y-chromosome DNA haplogroups account for most of Europe's present-day population.
 * Haplogroup R1b is common on the Atlantic coast of western Europe, from the Iberian Peninsula (comprising Spain and Portugal) to Ireland, Wales, England and Scotland.
 * Haplogroup I is common across Central Europe and up into Scandinavia.
 * Haplogroup R1a is common in Eastern Europe (and has also spread across into Central Asia and as far as the Indian subcontinent).

R1b is the most common of all haplogroups among western Europeans. Reported frequencies of haplogroup R1b are: Basques: 88.1%; Irish: 81.5%; Welsh: 89.0%; Scots: 77.1%; non-Basque Spaniards: 68.0% (Catalans: 79.2%; Andalusians: 65.5%); Portuguese (South): 56.0%; Portuguese (North): 62.0%; British: 68.8%; English (Central): 61.9%; Belgians: 63.0%; French: 52.2%; Danes: 41.7%; Norwegians: 25.9%; Swedes: 20.0%; Germans: 47.9%; Italians (Calabria): 32.4%; Italians (Sardinia): 22.1%; Italians (North-central): 62.0%. R1b is thus the most numerous Y-chromosome lineage in Western Europe.

Each haplogroup also has subclades. R1a and R1b are subclades of Haplogroup R (Y-DNA). The two main subgroups of Haplogroup I (Y-DNA) are I-M253/I-M307/I-P30/I-P40, which "has highest frequency in Scandinavia, Iceland, and northwest Europe", and I-S31, which "includes I-P37.2, which is the most common form in the Balkans and Sardinia, and I-S23/I-S30/I-S32/I-S33, which reaches its highest frequency along the northwest coast of continental Europe."

There is an ongoing debate regarding Neolithic Europe, with evidence both for and against a demic diffusion from the Near East: "genetic studies have failed to settle the controversy so far, because they have been interpreted in different ways... A rather heated debate followed, and is still continuing."

Also, around 4,500 years ago, Haplogroup N3 began spreading from west of the Ural Mountains, and its distribution seems to follow closely the spread of the Finno-Ugric languages.

Human mitochondrial DNA haplogroups
On mitochondrial DNA (mtDNA) haplogroups, according to the University of Oulu Library (Finland):

"Classical polymorphic markers (i.e. blood groups, protein electromorphs and HLA antigenes) have suggested that Europe is a genetically homogeneous continent with a few outliers such as the Saami, Sardinians, Icelanders and Basques (Cavalli-Sforza et al. 1993, Piazza 1993). The analysis of mtDNA sequences has also shown a high degree of homogeneity among European populations, and the genetic distances have been found to be much smaller than between populations on other continents, especially Africa (Comas et al. 1997)."

"The mtDNA haplogroups of Europeans are surveyed by using a combination of data from RFLP analysis of the coding region and sequencing of the hypervariable segment I. About 99% of European mtDNAs fall into one of ten haplogroups: H, I, J, K, M, T, U, V, W or X (Torroni et al. 1996a). Each of these is defined by certain relatively ancient and stable polymorphic sites located in the coding region (Torroni et al. 1996a)... Haplogroup H, which is defined by the absence of a AluI site at bp 7025, is the most prevalent, comprising half of all Europeans (Torroni et al. 1996a, Richards et al. 1998)... Six of the European haplogroups (H, I, J, K, T and W) are essentially confined to European populations (Torroni et al. 1994, 1996a), and probably originated after the ancestral Caucasoids became genetically separated from the ancestors of the modern Africans and Asians."

mtDNA haplogroup N1a, while presently rare (0.18%-0.3%), occurred in as many as 25% of Neolithic Europeans and has subsequently been absorbed into the current populations.

Paleolithic
The prehistory of the European peoples can be traced through the examination of archaeological sites, through linguistic studies, and through the examination of DNA, either from the people who live in Europe now or recovered from ancient remains. Much of this research is ongoing: discoveries are still being made, and theories rise and fall.



Modern humans (Cro-Magnon) began to colonize Europe about 40,000 years ago, as evidenced by the spread of the Aurignacian culture. They may have arrived along two major routes, one on either side of the Black Sea. Very quickly, by about 25,000 years ago, the prior inhabitants (our cousin species H. neanderthalensis) were either killed off or absorbed into the population, and ultimately became extinct. About 22,000 years ago the Last Glacial Maximum (LGM), the coldest phase of the last Ice Age, rendered much of Europe uninhabitable. Humans may have occupied only certain regions of Europe at this time. These regions are often called refuges (or refugia) and were located along the northern Mediterranean and Black Sea coasts, as well as in the Balkans. As the glaciers receded from about 16,000 years ago, the populations that had occupied the refuges are thought to have begun to spread and colonise northern Europe. The people of Europe were hunter-gatherers until the advent of agriculture about eight millennia ago.

Neolithic migration
The largest admixture to the European Paleolithic stock is due to the Neolithic Revolution of the 7th to 5th millennia BC. Some academics theorise that farming was introduced by people who migrated from the Near East, and that these farmers introduced Indo-European languages to Europe. This theory is typically associated with the Anatolian hypothesis of Indo-European origins, though it has also been argued that widespread migration is not necessary to support the theory.

Bronze Age migrations
Other theories about the origins of the Indo-European languages center on a hypothetical Proto-Indo-European people, who are traced, in the Kurgan hypothesis, to somewhere north of the Black Sea at about 4500 BC. According to this hypothesis, they domesticated the horse and spread their culture and genes across Europe. It has been difficult to identify what these "Kurgan" genes might be. Another approach, the Anatolian hypothesis, suggests an origin in Anatolia.

To what extent Indo-European migrations replaced the indigenous Mesolithic peoples is debated, but a consensus has been reached that technology and language transfer played a more important role in this process than actual gene-flow.

Uralic influence
According to a study by four scientists, including L. L. Cavalli-Sforza:

"Principal coordinate analysis shows that Lapps/Sami are almost exactly intermediate between people located geographically near the Ural mountains and speaking Uralic languages, and central and northern Europeans. Hungarians and Finns are definitely closer to Europeans. An analysis of genetic admixture between Uralic and European ancestors shows that Lapps/Sami are slightly more than 50% European, Hungarians are 87% European, and Finns are 90% European. There is basic agreement between these conclusions and historical data on Hungary. Less is known about Finns and very little about Lapps/Sami."

North and Northeast African influences
A number of genetic markers characteristic of Horn African and North African populations are found in European populations, signifying ancient and modern population movements across the Mediterranean. These markers are found particularly in Mediterranean Europe, but some are also present, at low levels, throughout the continent. The megalithic cultures seem to have been spread by, or to have kept maritime connections with, Mediterranean and North African peoples.

Y-chromosome DNA
The general parent Y-chromosome Haplogroup E3b, originating in modern-day Somalia, is by far the most common in North and Northeast Africa. It is also common throughout the majority of Europe, particularly in Mediterranean and South Eastern Europe, reaching its highest concentration in Greece and the Balkan region, but also with an important presence in other regions such as Hungary, Italy, Iberia and Austria.

Outside of East Africa, E3b's two most prevalent clades are E-M78 (E3b1a) and E-M81 (E3b1b, formerly E3b2, also known as the "Berber marker").

E-M78 is the most common subclade of E3b and is present throughout Europe. Its main route of entry into the European continent was through Anatolia during the Neolithic, which explains the high frequency of this haplotype in the Balkan region. It is also relatively frequent in the Mediterranean countries. Unlike E-M81, which is thought to have entered Europe in more recent times, this haplotype denotes a very ancient East African contribution to the European gene pool.

A study by Semino (published 2004) showed that Y-chromosome haplotype E-M81 (E3b1b) is specific to North African populations and almost absent in Europe, except in Iberia (Spain and Portugal) and Sicily. Another 2004 study showed that E-M81 is present, albeit at low levels, throughout Southern Europe: 1.5% in Northern Italians, 2.2% in Central Italians, 1.6% in southern Spaniards, 3.5% in the French, 4% in the Northern Portuguese, 12.2% in the southern Portuguese and 41.2% in the genetic isolate of the Pasiegos from Cantabria. The findings of this latter study contradict a more thorough Y-chromosome analysis of the Iberian peninsula, according to which haplogroup E-M81 surpasses frequencies of 10% in Southern Spain. The absence of microsatellite variation suggests a very recent arrival from North Africa, consistent with historical exchanges across the Mediterranean during the period of Islamic expansion, namely of Berber populations. However, a more thorough study of Y-chromosome lineages in Portugal revealed that "The mtDNA and Y data indicate that the Berber presence in that region dates prior to the Moorish expansion in 711 AD... Our data indicate that male Berbers, unlike sub-Saharan immigrants, constituted a long-lasting and continuous community in the country".

Haplotype V (p49/TaqI), a characteristic North African haplotype, may also be found in the Iberian peninsula, and a decreasing North-South cline of frequency clearly establishes a gene flow from North Africa towards Iberia, which is also consistent with the Moorish presence in the peninsula. This North-South cline of frequency of haplotype V is observed throughout the Mediterranean region, ranging from frequencies of close to 50% in southern Portugal to around 10% in southern France. Similarly, the highest frequency in Italy is found in the southern island of Sicily (28%).

A wide-ranging study (published 2007) using 6,501 Y-chromosome samples from 81 populations found that: “Considering both these E-M78 sub-haplogroups (E-V12, E-V22, E-V65) and the E-M81 haplogroup, the contribution of northern African lineages to the entire male gene pool of Iberia (barring Pasiegos), continental Italy and Sicily can be estimated as 5.6%, 3.6%, and 6.6%, respectively.”

Mitochondrial DNA
Genetic studies on Iberian populations also show that North African mitochondrial DNA sequences (haplogroup U6) and sub-Saharan sequences (haplogroup L) are present at values much higher than those generally observed in Europe, although very low levels of haplogroup U6 have also been detected in Sicily. It also happens to be a characteristic genetic marker of the Saami populations of Northern Scandinavia. It is difficult to establish that U6's presence is a consequence of Islam's expansion into Europe during the Middle Ages, particularly because it is more frequent in the north of the Iberian Peninsula than in the south. In smaller numbers it is also attested in the British Isles, again along their northern and western edges. It may be a trace of a prehistoric Neolithic/megalithic expansion along the Atlantic coasts from North Africa, perhaps in conjunction with seaborne trade. One subclade of U6 is particularly common among Canarian Spaniards as a result of native Guanche (proto-Berber) ancestry.

On the other hand, the distribution of mtDNA haplogroup L is consistent with modern historical data, being more frequent in Iberia than in the rest of Europe and more frequent in the south of the peninsula than in the north. Islamic domination, as well as the slave trade, is likely to have been a factor leading to its presence in some modern-day southern Iberian populations.

Sub-Saharan admixture
Finally, aside from E3b, sub-Saharan African DNA is scattered throughout the European continent. Not every population has been studied yet, but enough have been for a picture to start to emerge. The amount of sub-Saharan admixture in Europe today ranges from a few percent in Iberia to almost nil around the Baltic. It seems to show a decreasing cline from the southwest to the northeast, which corresponds with the areas most affected by the African slave trade.

According to a summary study by Pereira et al. (2005), sub-Saharan mtDNA L haplogroups were found at rates of 3.83% in Iberians (Portuguese and Spanish), 2.86% in Sardinians, 2.38% in Albanians, 1% in the British/Irish, 0.94% in Sicilians, and 0.62% in a German-Danish sample.

Sub-Saharan African Y-chromosomes are much less common in Europe, for the reasons discussed above. However, haplogroups E(xE3b) and A spread to Europe through migrations from Northeast Africa, rather than through the slave trade. Haplogroup A has been detected in Portugal (3%), France (2.5% in a very small sample), Germany (2%), Sardinia (1.6%), Austria (0.78%), Italy (0.45%), Spain (0.42%) and Greece (0.27%). By contrast, North Africans have about 5% paternal sub-Saharan West African admixture.