June 26, 2016

Ancient genomes from Neolithic West Asia

This week we got to know a lot more about the genetics of ancient West Asians, from the Mesolithic, Neolithic and later times. All in a single major study:

Iosif Lazaridis et al., The genetic structure of the world's first farmers. BioRxiv 2016. Freely accessible (pre-pub) [doi: http://dx.doi.org/10.1101/059311]


We report genome-wide ancient DNA from 44 ancient Near Easterners ranging in time between ~12,000-1,400 BCE, from Natufian hunter-gatherers to Bronze Age farmers. We show that the earliest populations of the Near East derived around half their ancestry from a 'Basal Eurasian' lineage that had little if any Neanderthal admixture and that separated from other non-African lineages prior to their separation from each other. The first farmers of the southern Levant (Israel and Jordan) and Zagros Mountains (Iran) were strongly genetically differentiated, and each descended from local hunter-gatherers. By the time of the Bronze Age, these two populations and Anatolian-related farmers had mixed with each other and with the hunter-gatherers of Europe to drastically reduce genetic differentiation. The impact of the Near Eastern farmers extended beyond the Near East: farmers related to those of Anatolia spread westward into Europe; farmers related to those of the Levant spread southward into East Africa; farmers related to those from Iran spread northward into the Eurasian steppe; and people related to both the early farmers of Iran and to the pastoralists of the Eurasian steppe spread eastward into South Asia.


  • There were (at least) two clearly distinct populations in West Asia in the Mesolithic and Early Neolithic times.
  • Both populations contributed to the West Anatolian farmers that are precursors of the settlers of Neolithic Europe.
  • It is not yet clear whether the so-called "Basal Eurasian" component is something local, admixture with Africans, or both. However it is clear that it is associated with reduced Neanderthal admixture.
  • West Eurasian genetic composition can now be understood quite well as a mixture of four sources: two West Asian ones, favored by the Neolithic revolution, and two Paleo-European ones.
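
To make the "mixture of four sources" idea concrete: in admixture models of this kind, a population is described by ancestry proportions that sum to one, and its expected allele frequency at any site is the proportion-weighted average of the source frequencies. The Python sketch below illustrates only that arithmetic; the proportions and allele frequencies are invented placeholders, not values from the study, and the function name is my own.

```python
# The four founder labels used in the study's graphic.
SOURCES = ("Iran_N", "Levant_N", "WHG", "EHG")

def expected_allele_frequency(proportions, source_freqs):
    """Expected allele frequency of a population modelled as a four-way mix.

    proportions:  ancestry fraction per source (must sum to 1).
    source_freqs: frequency of one allele in each source population.
    """
    total = sum(proportions[s] for s in SOURCES)
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"ancestry proportions sum to {total}, not 1")
    return sum(proportions[s] * source_freqs[s] for s in SOURCES)

# Hypothetical numbers, for illustration only.
example_mix = {"Iran_N": 0.3, "Levant_N": 0.3, "WHG": 0.3, "EHG": 0.1}
example_freqs = {"Iran_N": 0.2, "Levant_N": 0.4, "WHG": 0.6, "EHG": 0.8}
print(expected_allele_frequency(example_mix, example_freqs))
```

Estimating the proportions from real genotypes is the inverse problem and considerably harder; nothing here attempts it.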

This graphic shows pretty well how the ancient populations of West Eurasia are expressed as a mixture of those four founder populations:

That is if you can get through the nomenclature, which is inherited in many cases from a long array of recent studies. In many cases I'm not even sure myself exactly which samples, and from where, are thrown into each category. But the most important part is that Iran_N and Levant_N are the two Neolithic-specific founder populations of the Fertile Crescent (yeah, N stands for "Neolithic", not "North") and that the other two founder populations, from pre-Neolithic Europe, are WHG (Epi-Magdalenian peoples from Western and Central Europe) and EHG (Eastern European hunter-gatherers, of Epigravettian culture and maybe even proto-Uralic in one case).

Then we see in the case of Europe how:

1. Anatolia_N (precursors of mainline European Neolithic) are a mix of both West Asian farmer groups, plus a sizable fraction of Western Paleoeuropean ancestry already.

2. This fraction of Western Paleoeuropeanness increases as the farmers expanded into Europe (EN) and then as there was probably some backflow of Western origins in relation to Megalithism and Bell Beaker (MNChL). But in general the basic genetic composition remains the same and in no known case does it incorporate any Eastern Paleoeuropean component at all, not yet.

3. It is only with the Indoeuropean ("Kurgan") invasions, reflected in the category LNBA, that the EHG component begins to be very important in Europe. If I'm correct, all those samples are from Germany and other areas of Central and North Europe, with the Iberian and Italian ones of similar chronology placed in the MNChL tag instead. The LNBA/MNChL contrast is not a strictly chronological analysis but an analysis by categories of ancestry that do overlap in time.

4. In Armenia instead, we see a decrease of the minor EHG component but then an increase in the MLBA ("middle and late Bronze Age"), when Armenians arrive from the Balkans and Phrygia, conquering the pre-existing Hurro-Urartian peoples (whose language was probably related to Chechen and other NE Caucasian languages). This should correspond to the formation of Urartu and more specifically to the Hayasa-Azzi and Shupria stages, both considered Urartian (Hurrian). The WHG and Levant-N components we see since the Chalcolithic are similar to what we see in West Anatolia and probably reflect interactions corresponding to Central-Eastern Anatolia, Kurdistan and Syria, for which we have no direct ancient data yet.

Ancient samples (colored and labeled) projected on a PCA of modern West Eurasian populations (in gray):

To identify the modern populations in gray, a good reference is this older but fully labeled PCA by Olalde.

Briefly: Natufians fall on top of modern Palestinians; their slightly admixed Neolithic descendants fall between Palestinians and Jews; Middle Neolithic European Farmers fall on top of Sardinians; the so-called Europe-Steppe continuum (early Western Indoeuropeans) falls between Central Europe, France and the Balkans; most Western Europeans do not overlap with ancient samples but appear to have even greater Paleoeuropean admixture instead, etc.

Y-DNA Haplogroups

Iranian Mesolithic and Neolithic samples carried the following patrilineages:
  • Mesolithic: J(xJ2a1b3,J2b2a1a1)
  • Ganj Dareh Neolithic: P1(xQ,R1b1a2,R1a1a1b1a1b,R1a1a1b1a3a,R1a1a1b2a2a) and an undefined CT
  • Late Neolithic: G2a1(xG2a1a)

Meanwhile Palestinian Mesolithic and Neolithic samples carried: 

  • Natufian (Mesolithic): E1b1b1b2(xE1b1b1b2a,E1b1b1b2b), E1b1(xE1b1a1,E1b1b1b1), E1b1b1b2(xE1b1b1b2a,E1b1b1b2b), plus two undefined CT.
  • Pre-Pottery Neolithic B/C: H2, E(xE2,E1a,E1b1a1a1c2c3b1,E1b1b1b1a1,E1b1b1b2b), E1b1b1, T(xT1a1,T1a2a), E1b1b1(xE1b1b1b1a1,E1b1b1a1b1,E1b1b1a1b2,E1b1b1b2a1c), plus three ill-defined CT.

CT is the main pan-Eurasian macro-haplogroup and is not informative, except in Palestine because it implies exclusion of E.
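
A note on the notation used in these lists: a label such as J(xJ2a1b3,J2b2a1a1) reads as "haplogroup J, excluding the subclades listed after the x", that is, the sample belongs to J but those branches were ruled out. A minimal Python sketch of how such labels can be split apart; the function name `parse_haplogroup` is my own, for illustration only:

```python
import re

def parse_haplogroup(label):
    """Split a label such as 'J(xJ2a1b3,J2b2a1a1)' into (base, excluded).

    'base' is the haplogroup the sample belongs to; 'excluded' lists the
    subclades that were ruled out. Labels without '(x...)' have no exclusions.
    """
    match = re.fullmatch(r"\s*([^()\s]+)\s*(?:\(x([^()]*)\))?\s*", label)
    if match is None:
        raise ValueError(f"unrecognized haplogroup label: {label!r}")
    base, excluded = match.group(1), match.group(2)
    if excluded is None:
        return base, []
    return base, [name.strip() for name in excluded.split(",") if name.strip()]

print(parse_haplogroup("G2a1(xG2a1a)"))  # ('G2a1', ['G2a1a'])
print(parse_haplogroup("H2"))            # ('H2', [])
```

The longer the exclusion list, the more downstream branches were tested and rejected, so a bare label like E1b1b1 is less resolved than one with many exclusions.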

Otherwise we see an important presence of E (mostly E1b1b), a lineage we know was carried by early farmers into Europe and that has ultimately African origins. It probably indicates migration of NE Africans into Palestine in the Mesolithic, something also supported by archaeology. However these NE Africans were surely already mixed with Eurasian ancestry, which probably arrived in the Nile Basin in the early LSA, some 50-40 Ka ago. So it's a complex story of multiple admixture events in the continental crossroads that is Egypt and also Palestine and other nearby areas.

We also see G2a1 in Late Neolithic Iran, and this one is the main lineage brought to Europe by the early farmers if we are to judge by known ancient sequences (today it is not more important than E1b but it is maybe more evenly distributed). However we only see it in the Late Neolithic, so it may have originated further west.

We see too little J, only J(xJ1a,J2a1,J2b) in Chalcolithic Iran and in Bronze Age Jordan: J(xJ1,J2a,J2b2a) again and J1(xJ1a). I guess that a lot remains to be researched on this issue because J is nowadays by far the most common haplogroup of West Asia, and has also impacted Europe and South Asia (J2) and North and NE Africa (J1).

On the issue of "Basal Eurasian": African or West Asian?

The question remains unanswered, as I said before, but there are two clues. On one side, the presence of E1b in Mesolithic and Neolithic Palestine clearly supports a direct NE African influence, also backed by archaeological evidence. On the other, there is some nuance in the issue of FST distances that I want to highlight.

The distances are available in a very extensive supplementary table, so I took just a few to get a better understanding, not only of this issue but in general of the genetic distances of the four founder populations:

Quite ironically it is not the Natufians who are the closest to the African reference population (Yoruba) but the CHG, Iran-N and Levant-N groups. In fact the Natufians are the most distant ones after the WHG population. However this is tricky because the affinity to Yoruba may also be caused by the "ghost" Basal Eurasian population, claimed first of all by Lazaridis 2014, which would be a remnant of the Out of Africa Migration (not strictly African but close enough and impossible to discern from true African admixture in most analyses).
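
For readers unfamiliar with FST: at a single biallelic site it compares the expected heterozygosity within populations (HS) with that of the pooled population (HT), as FST = (HT - HS) / HT. The Python sketch below only illustrates that textbook definition with made-up allele frequencies; it is not necessarily the estimator behind the supplementary table, and the function name is mine.

```python
def fst_two_populations(p1, p2):
    """Wright's FST at one biallelic site for two equally weighted populations.

    p1, p2: frequency of the same allele in each population (between 0 and 1).
    """
    p_mean = (p1 + p2) / 2
    h_total = 2 * p_mean * (1 - p_mean)                      # pooled heterozygosity
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2   # mean within-population
    if h_total == 0:
        return 0.0  # same allele fixed in both populations: no differentiation
    return (h_total - h_within) / h_total

# Hypothetical frequencies: the further apart they are, the larger FST gets.
for p1, p2 in [(0.5, 0.5), (0.4, 0.6), (0.3, 0.7), (0.0, 1.0)]:
    print(p1, p2, round(fst_two_populations(p1, p2), 4))
```

Anything that pushes a population's allele frequencies away from everyone else's, admixture from an unsampled source or simply drift in isolation, raises its FST against every reference population, which is why these distances are ambiguous on their own.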

So we may imagine that the "Highlander" (CHG and Iran-N) populations were somehow influenced by that Basal Eurasian ghostly population, which might have survived in the Persian Gulf oasis, for example. Or whatever else.

The presence of the same or similar element in Levant-N reflects possibly admixture with Iran-N or a similar population, something that is implicit in the table above but I'll address below more explicitly.

If there is (and there must be, because of Y-DNA E1b) some African admixture in the Natufian population, it was very diluted already in the autosomal (general DNA) aspect before farming began.

Update (Jul 2): all four paragraphs above are possibly misleading to some extent because, as several commenters have rightfully pointed out, genetic drift alone causes the effect of increased distance to general reference populations like Yoruba and Han. This genetic drift is caused by relative isolation, so it seems that Magdalenian Europeans (WHG) and Natufian Palestinians (Natufian) were both more isolated populations in general terms than the Iran-Caucasus-Eastern Europe ones, whose sheer numbers apparently kept them more similar to the generic root of Humankind, less endogamous.

However, per archaeology, such "sheer numbers" are not to be expected in that area, rather the opposite (Western Europe and Palestine are much richer areas in archaeological terms, suggesting denser populations). So the question remains open as far as I can tell, but it should be discerned with more precise tools than mere FST.

A visual of the smallest genetic distances between populations (each "-" represents 0.01 in the table above):

a) Ancient West Asians:

Neolithic peoples of West Asia, even if different, are closer among them than their pre-Neolithic precursors.

b) Pre-Neolithic West Eurasians:

The distances between Natufians and everyone else are comparable to those with Han Chinese, however only in the case of the populations that appear to have extra affinity to East Asia (Iran, Caucasus and Eastern Europe), otherwise it is smaller.
All four populations were distant enough from each other to be considered clearly distinctive. Even EHG and WHG were quite dissimilar.

c) The four West Eurasian founders considered above:

There is much greater similitude between Iran and Levant Neolithic peoples than between their Mesolithic precursors. This implies some sort of intense admixture as agriculture and herding developed. Not enough to erase the differences but enough to blur them significantly.

Genetic influence from East Asia or a related population is also apparent in all Northeastern populations but even more so in Iran Neolithic. Why?
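
The dash notation used in these visuals is easy to reproduce from any table of pairwise distances. A minimal Python sketch, with placeholder population names and an invented FST value rather than figures from the supplementary table (the helper name `distance_bar` is mine):

```python
def distance_bar(name_a, name_b, fst, unit=0.01):
    """Draw one '-' per `unit` of FST between two population labels."""
    dashes = "-" * round(fst / unit)
    return f"{name_a} {dashes} {name_b}"

# Invented value, for illustration only.
print(distance_bar("PopA", "PopB", 0.05))  # PopA ----- PopB
```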

There is much more in the study and supp. materials but I can only review so much.

June 9, 2016

Neolithic DNA from Greece and NW Anatolia and their influence on Europe

This is a most interesting study that brings to us potentially key information on the expansion of European Neolithic and the formation of modern European peoples.

Zuzana Hofmanová, Susanne Kreutzer et al., Early farmers from across Europe directly descended from Neolithic Aegeans. PNAS 2016. Open access [doi:10.1073/pnas.1523951113]


Farming and sedentism first appeared in southwestern Asia during the early Holocene and later spread to neighboring regions, including Europe, along multiple dispersal routes. Conspicuous uncertainties remain about the relative roles of migration, cultural diffusion, and admixture with local foragers in the early Neolithization of Europe. Here we present paleogenomic data for five Neolithic individuals from northern Greece and northwestern Turkey spanning the time and region of the earliest spread of farming into Europe. We use a novel approach to recalibrate raw reads and call genotypes from ancient DNA and observe striking genetic similarity both among Aegean early farmers and with those from across Europe. Our study demonstrates a direct genetic link between Mediterranean and Central European early farmers and those of Greece and Anatolia, extending the European Neolithic migratory chain all the way back to southwestern Asia.

Uniparental DNA

One of the most important findings is that the two Epipaleolithic samples from Theopetra yielded mtDNA K1c, being the first time in which haplogroup K has been detected in pre-Neolithic Europe. Sadly enough these two individuals could not be sequenced for full genome. 

The other five individuals are all Neolithic (three early, two late) and did provide much more information.
  • Rev5 (c. 6300 BCE): mtDNA X2b
  • Bar31 (c. 6300 BCE): mtDNA X2m, Y-DNA G2a2b
  • Bar8 (c. 6100 BCE): mtDNA K1a2
  • Pal7 (c. 4400 BCE): mtDNA J1c1
  • Klei10 (c. 4100 BCE): mtDNA K1a2, Y-DNA G2a2a1b (same as Ötzi's)
I color coded their abbreviated names according to the usage in the study's many maps, for easier reference: green shades are for Greece (Western Macedonia), red shades for Turkey (Bursa district). It is also very convenient to get their real geography straight because many of the map-styled graphs are not precise at all about that:

Fig. 1.
North Aegean archaeological sites investigated in Turkey and Greece.

Autosomal DNA affinities

This is probably the most interesting part. There is a lot about it in the supplementary information appendix but I find that the really central issue is how they relate to each other (or not) and to other ancient and modern Europeans. I reorganized figs S21 and S22 to better visualize this:

Ancient samples compared to each other and other ancient samples ("inferred proportions of ancestry")
Ancient samples compared to modern Europeans ("inferred proportions of ancestry")

So what do we see here? First of all that the strongest contribution of known Aegean Neolithic peoples on mainline European Neolithic is from Bar31, which is from NW Anatolia, and not from Greece. Bar8 is a less important contributor but may have impacted particularly around the Alps (Stuttgart-LBK, modern North Italians).

This goes against most archaeology-based interpretations, which rather strongly suggest a Thessalian and West Macedonian origin of the Balkan and, therefore, other European branches of the mainline Neolithic of Aegean roots, and do instead support some sort of cultural barrier near the European reaches of the Marmara Sea. Of course we lack exhaustive sampling of Greek Neolithic so far, so it might still be possible that other populations from Thessaly or Epirus could have been more important. However the lack of Anatolian-like influence on the Western Macedonian Neolithic until c. 4100 BCE makes it quite unlikely.

So it seems that, once again, new archaeogenetic information forces us to rethink the interpretative theories based on other data.

However we do see a strong influence of Greek Neolithic and particularly the oldest sample, Rev5, in SW Europe, very especially among Basques, who seem to have only very minor Anatolian Neolithic ancestry, unlike everyone else relevant here. This impact is also apparent in Sardinia and to some extent North Italy (but overshadowed in these two cases by the one from Anatolia, particularly Bar31).

There are also similar analyses for four other ancient samples (Loschbour, Stuttgart, Hungary Neolithic and Hungary Bronze) but they don't provide truly new information, so I'm skipping them here. As I said before, there's a hoard of analyses in the SI appendix; enjoy yourselves browsing through them and feel free to note in the comments anything you believe important.

A synthesis of the various "inferred proportions of ancestry" analyses is anyhow shown in fig. 3:

Fig. 3.
Inferred mixture coefficients when forming each modern (small pies) and ancient (large pies, enclosed by borders matching key at left) group as a mixture of the modern-day Yoruba from Africa and the ancient samples shown in the key at left.

The fractions may be misleading however, especially for the ancients. For example: Loschbour (a total outlier among the ancients in this study) appears best correlated with Pal7 but in fig. S24 it is clear that it does not correlate with any Neolithic sample at any significant level. But in general terms it can give a good idea of where ancestry comes from, particularly for modern samples.

Note: elsewhere someone was being a crybaby about the Polish sample (may well be an error) or the Kalmyk sample (who are obviously most related to East Asians, not used here) but those are minor issues.

Of course there's a lot more to learn from the remains of the ancients. Let's keep up the good work.

June 6, 2016

MtDNA U6 in Aurignacian Europe

The U6 haplogroup of Pestera Muierii is officially confirmed. 

Unofficially, mtDNA H in Magdalenian El Mirón also seems confirmed; it is another of the haplogroups challenged (without any reasoning) by Fu et al. In this last case, my sources suggest that Fu surely tested a bone belonging to a different individual, because the heap of bones could well include several people and the bones tested by Hervella (a tooth) and Fu (a femur) were different.

Anyhow, to the matter at hand:

Montserrat Hervella et al. The mitogenome of a 35,000-year-old Homo sapiens from Europe supports a Palaeolithic back-migration to Africa. Scientific Reports 2016. Open access [doi:10.1038/srep25501]


After the dispersal of modern humans (Homo sapiens) Out of Africa, hominins with a similar morphology to that of present-day humans initiated the gradual demographic expansion into Eurasia. The mitogenome (33-fold coverage) of the Peştera Muierii 1 individual (PM1) from Romania (35 ky cal BP) we present in this article corresponds fully to Homo sapiens, whilst exhibiting a mosaic of morphological features related to both modern humans and Neandertals. We have identified the PM1 mitogenome as a basal haplogroup U6*, not previously found in any ancient or present-day humans. The derived U6 haplotypes are predominantly found in present-day North-Western African populations. Concomitantly, those found in Europe have been attributed to recent gene-flow from North Africa. The presence of the basal haplogroup U6* in South East Europe (Romania) at 35 ky BP confirms a Eurasian origin of the U6 mitochondrial lineage. Consequently, we propose that the PM1 lineage is an offshoot to South East Europe that can be traced to the Early Upper Paleolithic back migration from Western Asia to North Africa, during which the U6 lineage diversified, until the emergence of the present-day U6 African lineages.

The interesting part is that today U6 is pretty much constrained to Northwest Africa and parts of Iberia and it has usually been considered until now as a North African haplogroup, even if of Eurasian derivation. 

Fig. 2 - (A) Phylogenetic analysis and temporal estimates for lineages including the Peştera Muierii-1 (PM1) from the mitochondrial tree. (B) Location of the Peştera Muierii cave and surface map based on current frequencies of U6 lineages [30]; the European borders map was generated in ArcMap 10.1 (ESRI, http://www.esri.com) by modifying the World Borders Dataset (http://www.thematicmapping.org/downloads/world_borders.php), which is licensed under the Attribution-ShareAlike 3.0 Unported license. The license terms can be found on the following link: http://creativecommons.org/licenses/by-sa/3.0/ (This map was created by A.A.).

Another interesting bit is that U6(xU6a'b'd,U6c), U6* for short, is no longer known to exist today. So it is reasonable to speculate about the "ancestral" position of Muierii in the lineage, regardless of whether Muierii-1 was a true ancestor or just a more or less distant relative of the real ancestor of modern day U6 carriers.

Complementary information is to be found in Secher et al. (2014), which refined the knowledge of the U6 mitochondrial haplogroup, unveiling that the key basal (and rare) U6c sublineage is not only found in Morocco (as known earlier) but also in Europe. Specifically U6c, which hangs directly from the U6 root node, is found in: Hispanic America (5.7% of all U6 carriers), Spain (2.2%), Canada (12.5%), NW Europe (16.7%), Morocco (4.5%), Algeria (10%) and Tunisia (5.9%). It is missing in Brazil, Western, Central and East Africa, Romani ("Gypsies"), Jews, Azores, Madeira, Canary and Cape Verde Islands, Portugal, Central and Eastern Mediterranean, West Sahara, Mauritania and USA (African-Americans, European-Americans and Hispanics).

Figure 1
Surface maps, based on HVI frequencies (in o/oo), for total U6 (U6), total U6a (Tot U6a), U6a without 16189 (U6a), U6a with 16189 (U6a-189), U6b'd, U6c, U6b and U6d.

While the exact pattern of U6 expansion is not clear except for Africa (with a Moroccan origin surely), Secher et al. believe that at least this part is related to the Iberomaurusian (aka Oranian) culture, which seems primarily an offshoot of Iberian Solutrean, also with origin in North Morocco (Taforalt) and European-like human looks (Cromagnoid).

Another complementary reference is Carmela L. Hernández et al. (2015):

An inspection of the U6 phylogenetic tree (S1 Dataset) showed that it is not easy to infer whether Iberia or North Africa bear more basal lineages. (...) The U6c (9.9 ky [5.0–15.0]) and U6d (12.0 ky [6.9–17.3]) are present in Iberia, Europe and North Africa at low frequencies.

While she seems to support a North African origin, the data is in fact somewhat contradictory:

Fig 5. Founder analysis for mtDNA U6 haplogroup. The plots show probabilistic distributions of U6 founder clusters for HVS-I sequences (A) and complete genomes (B) across migration times scanned at 200-year intervals from 0 to 60 ky.

Fig 7. Bayesian Skyline Plot (BSP) analysis of entire mtDNA U6 sequences.
Temporal changes of the effective population size, Ne in sub-Saharan Africa (brown color), North Africa (green color), and Iberian Peninsula (red color) are depicted. Solid lines represent the median values for the log10 of Ne on the Y-axis within each analyzed geographic region. The 95% HPD (highest posterior density) interval is shown for the three distributions (dashed lines).
Notice that the "LGM" label is very wrong: it should be around 21,000 years ago!

Usually U6 genetic history is envisioned as a migration from southwest Asia through North Africa [50]. This hypothesis is based on the general origin of haplogroup U sub-clades in Southwest Asia, which is also the center of the geographical distribution of U sub-clades: Europe, India, Central Asia, East Africa and North Africa. Two possible scenarios for the first U6 haplotype (bearing mutations 3348 and 16172) can be advanced: i) these mutations aroused in the founder region but did not leave any genetic legacy in current human populations there; ii) they originated probably somewhere in North Africa, after the arrival of the U6 founder haplotype. Within North Africa U6 is only significantly frequent at its western edge (as well as in South-western Europe). More importantly, all the most basal branches are virtually restricted to that region (U6b, U6c and U6d), what could indicate its western origin. Nevertheless, it cannot be excluded the major sub-clade U6a, which shows a richness of sub-clades in Northwest Africa [29] although a few of derivative branches also include sequences from East African and the Middle Eastern populations (e.g. U6a2).

Her conclusions (insisting on an African origin and first arrival via Egypt) are not something I can share at this stage of the research but her data is clearly very interesting and, combined with the rest, useful in discerning the possible route of primeval U6 to the Gibraltar Strait area, where it found no doubt its niche for consolidated expansion. 

After the Muierii finding the question is open: did primeval U6 arrive in North Africa via Iberia, being pruned in Europe afterwards just because of genetic drift and the sizable impact of Paleolithic migrations in low density areas? I cannot be 100% sure but I would say it is a very likely conclusion based not just on Muierii but also on the rather high basal diversity of U6 in Iberia (and surprisingly NW Europe!) and also on the archaeological data that makes it almost necessary to root the first Upper Paleolithic of NW Africa (the Iberomaurusian) in the Iberian Solutrean.

(Special thanks to Jean Lohizun again).

Update (Jun 17):

The Hernández 2015 paper also mentions that U6a1 appears to be of European and specifically Portuguese origin:

Our U6 tree built from mitogenomes shows that U6a1 is predominantly European because it contains a significant number of sequences of Mediterranean individuals mainly from the northwestern shore with a leading Iberian contribution (21 of the 29 European samples) and has an ancestral node in Portugal (accession number HQ651694).

Thanks to Geog M. for highlighting this important detail.