Haplogroup G5 (Y-DNA)

In human genetics, Haplogroup G5 is a Y-chromosome haplogroup and is defined by the presence of the M377 mutation. It is a branch of Haplogroup G, which in turn is defined by the presence of the M201 mutation.

G5 is a major Y-chromosome haplogroup, and yet a highly unusual one: it is extremely rare, almost completely specific to a single ethnic group in Europe, Ashkenazi Jews, and shows strong evidence of a very recent settlement in Europe. So far it has not been found in any other region, except in a single Pashtun from the area of Pakistan bordering Afghanistan in the Hindu Kush range and a single Burusho from the Hunza Valley in the Karakoram Range in Kashmir. These two sets of populations, Ashkenazi Jews on the one hand and Pashtuns and Burusho on the other, are very widely separated geographically and are not known to have had any contact in their respective histories.

G5 presents more mysteries regarding its origin and distribution than virtually any other major Y haplogroup. A haplogroup that is rare in one region is usually more common in another, and has a fairly clear origin in a place where it is more commonly found. G5 fits neither pattern. It is by far most common in a region where it arrived very recently, but exceedingly rare in other regions, including its likely area of origin. The distribution of G5 is extremely sparse and dispersed, with no G5 haplotypes found in very large intervening regions. This pattern is unlike that of almost any other haplogroup.

Phylogenetic position
Extensive SNP testing by the Haplogroup G SNP project found that G5 is an independent branch of haplogroup G characterized by only one SNP, M377. A forthcoming study by the Y Chromosome Consortium at the University of Arizona has found that, within haplogroup G, G5 and G2 share a new SNP, P387, which is not found in the other well-attested branch of G, haplogroup G1. As a result, it will be proposed that Haplogroup G5 be renamed "G2c".

A study of median-joining networks generated for a set of approximately 180 haplotypes of 67 Y-STR markers each in haplogroup G shows that G5 might cluster most closely with sub-haplogroup G2a.

Y-STR haplotype characteristics
All G5 samples tested so far have a null value for the DYS425 marker (a missing "T" allele of the DYS371 palindromic STR), the result of a RecLOH event. This change is extremely uncommon in the rest of haplogroup G, but apparently happened early in the history of G5.

* For DYS385, testing according to the Kittler Protocol, which determines the actual order of the alleles, showed that in G5 the smaller allele comes first and therefore that the order given above is correct.

Ashkenazi Jews
A cluster of closely related Ashkenazi Jews represents virtually all confirmed G5 persons worldwide, both from private testing and from academic studies. G5 makes up about 7% of all Ashkenazi Jewish Y-chromosome haplotypes, as found by Behar et al. (2004) (n=442, GxG2=33). A much smaller group of Ashkenazi Jews, however, is in haplogroup G1, so not all GxG2 Ashkenazi Jews in that study would be G5. The ratio of G5 to G1 among Ashkenazim is approximately 10:1.
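
The arithmetic behind these figures can be sketched as follows. The sample size and GxG2 count are those quoted from Behar et al. (2004); applying the approximate 10:1 G5-to-G1 ratio to that count is an assumption made here for illustration, not a result reported by the study, and the function name is hypothetical.

```python
def g5_share(n_total, n_gxg2, g5_to_g1_ratio):
    """Return (GxG2 share, estimated G5 share) of the sample, as fractions."""
    gxg2_share = n_gxg2 / n_total
    # ASSUMPTION: every GxG2 sample is either G5 or G1, in the stated ratio.
    g5_fraction_of_gxg2 = g5_to_g1_ratio / (g5_to_g1_ratio + 1)
    return gxg2_share, gxg2_share * g5_fraction_of_gxg2


gxg2, g5 = g5_share(n_total=442, n_gxg2=33, g5_to_g1_ratio=10)
print(f"GxG2: {gxg2:.1%}, estimated G5: {g5:.1%}")
# prints "GxG2: 7.5%, estimated G5: 6.8%"
```

Under these assumptions the estimated G5 share is a little under 7%, consistent with the "about 7%" figure.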

Eastern Europe
The distribution of G5 in Eastern Europe very closely reflects the 16th- and 17th-century settlement patterns of Ashkenazi Jews in the Polish-Lithuanian Commonwealth.

Western Germany
G5 is also found among Ashkenazi Jews from Western Germany. Jews were not allowed to reside in most parts of Germany in the 16th and 17th centuries, aside from the Frankfurt Jewish Ghetto. Jews were expelled in 1670 from Vienna and the Archduchy of Austria. After Khmelnytsky's Pogrom in Poland in 1648, a migration of Jews from Poland and Lithuania to Western Germany began, which accelerated and continued into the 19th century. It is not yet clear whether German Jewish G5s represent an independent settlement or the result of a migration from Eastern Europe, although there is some evidence for the latter.

Southern Italy and Sicily
Among Europeans, there are a few significant exceptions to this almost exclusively Ashkenazi Jewish distribution: a small handful of samples from Southern Italians, or from people with traditions of patrilineal descent from 15th-century Sicily.

Eastern Anatolia
A confirmed G5 Y-STR haplotype found in the literature is haplotype 54 from a study of Anatolian Y chromosomes (n=523) by Cinnioglu et al. (2004); it was found in eastern Turkey, in the city of Kars, very close to Armenia. Kars was the capital of the Armenian region of Vanand and for a short time the capital of the Armenian Bagratid Kingdom (928-961). From 963 to 1064 Kars was the capital of an independent Armenian kingdom. Kars was also settled by the Karapapak (formerly the Terekeme) Azeri Turkic minority, who also settled the adjacent areas of northwest Armenia.

Possible Armenian G5
In a study of Armenian Y chromosomes by Weale et al. (2001) (n=741), haplotype 108 in "Haplogroup 2", which consisted of Y haplogroups F*, G, and I, has a 6-marker Y-STR haplotype identical to the modal haplotype for Ashkenazi Jewish G5. The two samples with this haplotype were found in the "North" Armenian region and in Nagorno-Karabakh. The samples from the North Armenian region were taken from the cities of Gyumri and Vanadzor. Gyumri is only 68 km (about 42 mi) east of Kars.

The Hindu Kush and Kashmir (Pakistan)
Only two other confirmed G5 samples have been reported in the academic literature so far: one Pashtun in the North-West Frontier Province of Pakistan (the Hindu Kush Range), and one Burusho in the Hunza Valley in Kashmir. These two G5s are Y-STR haplotypes 731 and 794 from Table 3 in the study by Sengupta et al. (2006) of Indian (n=728), Pakistani (n=176), and East Asian (n=175) Y-chromosome lineages.

Other possible G5 Jewish haplotypes
Two possible G5 Y-STR haplotypes in the literature are from the study of Jewish and non-Jewish Near Eastern Y chromosomes by Nebel et al. (2001) (Appendix Table A1): haplotype 51, which was found in 1 Ashkenazi Jew (n=79) and 3 Kurdish Jews (n=99), and haplotype 47, which was found in 1 Iraqi Jew (n=23). These also belong to what was termed at the time "Haplogroup 2" (F*, G, and I), and within this set of haplogroups they display a Y-STR allele pattern unique to haplogroup G5.

Major regions where G5 is not found
There is much negative evidence as to where G5 is not found. The following studies failed to find any G5:
 * The Caucasus (aside from the above studies):
   * A study of Caucasian Y-STR haplotypes (n=364) by Nasidze et al. (2003), which did no Y haplogroup testing, found no 9-marker Y-STR haplotypes within the range of variation found in G5.
 * Iran:
   * A large study of Iranian Y chromosomes (n=150) by Regueiro et al. (2006)
 * Iberia and North Africa:
   * A study of hundreds of Iberian (n=860) and North African Berber (n=75) Y chromosomes by Alonso et al. (2005)
 * Jordan:
   * A study of samples from Amman (n=101) and the Dead Sea area (n=45) in Jordan by Flores et al. (2005)
 * India:
   * Sengupta et al. (2006) found no G5 haplotypes in India (n=728) and none in southern Pakistan.
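
A study that finds no G5 among n samples does not show that G5 is absent from a region, only that it is rare there. As a rough sketch: if the samples are treated as independent draws from a population in which G5 has frequency p (an idealising assumption), the chance of seeing none in n draws is (1 - p)^n, so the largest p still compatible with zero observations at the 95% level is 1 - 0.05^(1/n). The sample sizes below are those quoted in the list; the function name and the choice of a 95% level are made here for illustration and are not taken from the studies.

```python
def max_frequency_given_zero(n, confidence=0.95):
    """Largest frequency p for which P(no hits in n draws) >= 1 - confidence."""
    return 1 - (1 - confidence) ** (1 / n)


# Sample sizes as quoted in the list above.
# Note: the Caucasus study tested Y-STRs only, not haplogroups.
studies = {
    "Caucasus (Nasidze et al. 2003)": 364,
    "Iran (Regueiro et al. 2006)": 150,
    "Iberia (Alonso et al. 2005)": 860,
    "North African Berbers (Alonso et al. 2005)": 75,
    "Jordan, Amman (Flores et al. 2005)": 101,
    "Jordan, Dead Sea (Flores et al. 2005)": 45,
    "India (Sengupta et al. 2006)": 728,
}

bounds = {name: max_frequency_given_zero(n) for name, n in studies.items()}
for name, bound in bounds.items():
    print(f"{name}: n={studies[name]}, G5 frequency below about {bound:.2%}")
```

On this reading, the large Iberian and Indian samples constrain G5 to well under one percent, while the smallest samples would still be compatible with a frequency of several percent.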

Upcoming studies
An upcoming study of Sephardi and Near Eastern (Mizrahi) Jews has found only a few GxG2 samples (in Y-chromosome haplogroup G but not in G2). However, these include, among others, a couple of Turkish Jews and single Moroccan, Kurdish, Iraqi, and Yemenite Jewish samples. These are being tested for M377/G5.

The Kurdish and Iraqi Jewish samples from Nebel et al. (2001) are also being tested for M377/G5 by a different group for another study. Y-chromosome haplogroup G1 is also found among Jewish populations, but it is likely that some of these samples will turn out to be in haplogroup G5.

Time to the Most Recent Common Ancestor (tMRCA)
The time to the Most Recent Common Ancestor (tMRCA) for European G5, derived by generating a median-joining network of over 25 haplotypes with 67 Y-STRs, is 450 years before the average birth year of the testees, with a standard deviation of 91 years. The mutation rate used is based on that of family groups with known ancestors. The resulting date is close to the year 1492, a significant date in European Jewish history, when the Jews of Spain and Sicily (at that time ruled by Spain) were expelled or forced to convert. So far, this tMRCA includes all groups of European G5s, including, on preliminary evidence, the Italians.
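
The conversion from a tMRCA to a calendar date is simple subtraction. In the sketch below, the tMRCA (450 years) and its standard deviation (91 years) are the values given above; the average birth year of the testees is not stated in the text, so the value 1945 is purely a placeholder assumption, as are the function and variable names.

```python
def tmrca_calendar_range(avg_birth_year, tmrca_years, sd_years):
    """Return (earliest, central, latest) calendar years at one standard deviation."""
    central = avg_birth_year - tmrca_years
    return central - sd_years, central, central + sd_years


# ASSUMPTION: 1945 is a hypothetical average birth year, not a figure from the study.
earliest, central, latest = tmrca_calendar_range(1945, 450, 91)
print(f"MRCA born around {central} (roughly {earliest} to {latest})")
# prints "MRCA born around 1495 (roughly 1404 to 1586)"
```

The one-standard-deviation range spans nearly two centuries, which is worth keeping in mind when comparing the central date with 1492.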

This late tMRCA date for all of G5 in Europe raises the question of when G5 first entered Europe. If G5 had entered Europe at an earlier period, we would expect to see more divergent haplotypes than we currently do. The very unusual, highly ethnically specific distribution of G5 in Europe, combined with the very late tMRCA, also raises the question of where G5 entered Europe from. Further, did G5 spread within Europe from the Kingdom of Poland to Germany and Italy, from Germany to Italy and Poland, or from Italy northward to both other areas? No one region seems to be more divergent than any other, and in fact there do not seem to be any geographically correlated subclades within European G5: samples from each region match some samples from other regions more closely than ones from the same region.

Possible history
The most plausible scenario for the spread of G5 within Europe is an origin among Jews in Sicily, and a spread northward to Germany and the Polish-Lithuanian Commonwealth at the time of the expulsion of the Jews of Sicily in 1492.

It is estimated that Jews made up 6% or more of the population of Sicily in 1492. Historical evidence shows that most Sicilian Jews went eastward to the Ottoman Empire, where Sicilian Jewish congregations existed in Salonika and Constantinople until the late 19th century. However, it is known that many Sicilian Jews first went to Calabria; Jews were then expelled from Calabria in 1524, and later from the entire Kingdom of Naples in 1540. There was a gradual movement of Jews in Italy from south to north throughout the 16th century, with conditions worsening for Jews in Rome after 1556 and in Venice in the 1580s. Many Jews from Venice and the surrounding area migrated to Poland and Lithuania at this time.

In this scenario it may be that there was a direct migration from Sicily or Southern Italy separately to both Western Germany and Poland-Lithuania, but the presence of G5 in Germany may instead be due to a later migration from Eastern Europe to Germany, starting in the aftermath of Khmelnytsky's Pogrom in Poland in 1648.

Jews had lived in Sicily since Roman times. After the Byzantine reconquest of Sicily in 552 from the Arian Ostrogoths, who had been very tolerant of the Jews, few Jews lived in Sicily because of official persecution under the Byzantine Empire. Before 606 the bishop of Palermo ordered the synagogue to be converted into a church, and an edict issued by Leo III the Isaurian in 722 ordered the forced baptism of all Jews in the Empire. After the Muslim conquest of Sicily (831-902), large numbers of Jews settled on the island. In 972 the Arab merchant Ibn Hawqal mentioned a Jewish quarter in Palermo, and by 1170 Benjamin of Tudela reported 1,500 Jewish households in Palermo and 200 in Messina. In 1149 Roger II forcibly brought the Jewish brocade, damask, and silk weavers of Thebes in Greece to Sicily to establish a silk industry there. This is an example of a late entry into Sicily of non-Iberian, non-Provençal Jews from outside Western and Central Europe, from a region that has been poorly tested or has been devoid of Jews in modern times.

The preliminary conclusion from this evidence is that haplogroup G5 is not native to Europe. The very late tMRCA and the very high ethnic specificity indicate a rather brief presence in Europe, but one that participated in the exponential growth of the Ashkenazi Jewish population in Eastern and Central Europe after the Black Death. The complete lack of G5 in Iberia, and so far also among Spanish Jews, indicates that G5 did not come from Spain, or from France, since some Spanish Jewish families originated in southern France and migrated to Spain after France expelled the Jews in 1306. This, along with the other evidence, leaves Sicily as the European origin of G5. We know that Greek and Mizrahi Jews arrived in Sicily as late as 1149, and that most Sicilian Jews settled there during the Arab Emirate of Sicily. This is one way of explaining both the very late presence of G5 in Europe and the likely presence of G5 among at least Kurdish Jews, if not other Mizrahi Jews as well.

The presence of G5 in the Hindu Kush and Kashmir is a complete mystery. Medieval accounts of the Israelite origin of the Pashtuns are contradicted by ancient sources, and also by the linguistic affiliations of the Pashto language. These stories were disseminated in medieval times for religious reasons, and as part of the competition between the Mughals and the Pashtuns. The rarity of G5 in northern Pakistan could indicate that G5 in this area originates outside the region and was brought there in the historic period from further west (this area was part of the Achaemenid Persian Empire, was conquered by Alexander the Great, and then formed a part of the Greco-Bactrian Kingdom). These two reported G5 haplotypes seem to be quite divergent from both the Ashkenazi Jewish clade and the lone northeastern Anatolian G5, although this is based on only 10 Y-STRs, and therefore may not indicate a recent common origin. Another possible route by which G5 reached this region is trade, because Hunza is a fertile valley that was a major stopping point along the southern Silk Road just before the Khunjerab Pass into China.

A Northern Near Eastern / South Caucasian origin for G5 is much more likely. G5 is found in Kars in Turkey, and very likely in the nearby Armenian town of Gyumri as well as in Nagorno-Karabakh further south and east. Again, the Jewish areas of Kurdistan were not far from this same region. Haplogroup G has its greatest diversity in this same area, where all recorded sub-haplogroups of G have been found, so the evidence seems to point to this region of Eastern Anatolia or south of the Caucasus as the area of origin for all of haplogroup G as well. G5 could have spread from this region eastward toward the Hindu Kush and the Karakoram ranges, and southward among the Judeans, and then subsequently westward with the Jewish Diaspora to Italy and then Central and Eastern Europe. If the evidence shows that no Sephardi or Mizrahi Jews have G5, then an origin with the Khazars of the Caucasus, or with another associated people who converted to Judaism during the Khazar Empire (c. 670-1017), becomes a possibility, although the particular regions where G5 is found in northeastern Anatolia and Armenia were never part of the Khazar Empire, even at its greatest extent, but always a part of Armenia.

Further avenues for research
More samples from Italy are very important. Deriving a tMRCA for all of European G5, including Italy, is also crucial. One would expect the tMRCA including Italy to be somewhat earlier than circa 1492, but so far there is no evidence for this. Confirming whether there are other Jewish G5s that are not European, and then calculating the tMRCA with these samples included, would indicate the immediate source of Ashkenazi Jewish G5. Testing the single non-Jewish Turkish sample for M377 would then show whether an origin in the Near East is plausible. Finally, finding more G5s in the Kashmir region and calculating a tMRCA, especially together with the Turkish sample if it turns out to be G5, would indicate the direction of the gene flow.

Sites

 * Family Tree DNA Haplogroup G5 Project
 * Haplogroup G SNP project
 * Some Information and Theories on Haplogroup G

Maps

 * Map of G
 * Spread of Haplogroup G, from National Geographic
 * The 2006 ISOGG Y haplogroup tree

Mailing Lists

 * Haplogroup G5 discussion group
 * Haplogroup G Yahoo Group
 * Rootsweb Haplogroup G Mailing List
 * Family Tree DNA's Haplogroup G5 Project
 * FamilyTreeDNA's Haplogroup G Project