Tuesday, February 5, 2008

Cruciani et al. 2007: Sorting Out a Complex Network of Clusters

E-M78 (E3b1) network:

Introduction
Both E3a and E3b are found outside of their African homeland, but '''E3b''' is relatively more frequent in Europe and western Asia than its sister clade E3a, thanks to gene flow from North Africa and, to a lesser degree, from the African Horn in sub-Saharan East Africa [to the other side of the Red Sea], where E3b-bearing chromosomes appear to be the most prevalent. The most widely distributed E3b sub-branch is '''E-M78'''. The flow of E3b can be summarized in about four main episodes, based on geographic and quantitative analysis of haplogroup and microsatellite diversity:
  • Sometime in the Upper Paleolithic, between 23.9 and 17.3ky ago, E3b (M215) bearing chromosomes were introduced to northeast Africa from sub-Saharan East Africa.
  • The M78 mutation ('''E3b1''') then occurred in the E3b chromosomes distributed in Northeast Africa, to be followed by a back-migration episode to sub-Saharan East Africa, sometime between 18ky and 5.9ky ago. Some chromosomes which had acquired the M78 mutation in Northeast Africa undoubtedly also made their way westward in North Africa.
  • Sometime around 13ky ago, these M78 bearing E3b chromosomes were introduced to Europe directly from northern Africa.
  • Between 20 and 6.8ky ago, M78 bearing E3b chromosomes were introduced into western Asia from Northeast Africa. [Cruciani et al. 2007]
Discussion
Sorting E-M78 chromosomes into well-defined clusters is anything but simple, even though there are certainly strong correspondences between binary markers and microsatellite clusters in certain cases.

In his effort to place microsatellite clusters more precisely into clearly defined sub-clades, identifying new sub-clades in the process, Cruciani ended up muddling what is presently known about the demographic and biohistoric specifics of the E-M78 macrohaplogroup, and he himself has taken note of this fact:

Thus, even though they represent only 5% of the total E-M78 chromosomes analyzed, their inclusion into the respective haplogroups/paragroups **heavily affects inferences** about time and place of origin of these haplogroups/paragroups. — Cruciani et al., 2006.

...taken from the following extract:

Discrepancies between microsatellite clusters and haplogroups/paragroups

Despite the major congruence between E-M78 microsatellite clusters and binary haplogroups, there are important discrepancies between the trees generated by the two types of markers. First, the large majority of the delta chromosomes belong to the clades E-V22 and E-V12*, but a few representatives of both clades are found outside the delta cluster.

Within delta, E-V22 and E-V12* chromosomes are intermingled and not clearly differentiated by their microsatellite haplotypes.

Second, all of the cluster beta chromosomes belong to paragroup E-M78*. However, E-M78* also includes some non-beta chromosomes which are highly differentiated in their microsatellite haplotypes.

Third, there is a striking correspondence between the microsatellite clusters alpha and gamma and the binary haplogroups E-V13 and E-V32, respectively (Fig. 1B). However, while all alpha cluster belong to E-V13, some E-V13 chromosomes are not contained in such a cluster. Conversely, all of the E-V32 chromosomes fall within cluster gamma (defined by the rare DYS19 11-repeat allele), but two gamma chromosomes are members of paragroups E-V12* and E-M78*.

Taking into account the above data, the previously described European cluster alpha and the northern African cluster beta are indeed confirmed as monophyletic groups of chromosomes, that, very likely, have their own binary markers yet to be discovered. Cluster alpha chromosomes constitute a major branch of the binary haplogroup V13, which, in turn, includes also a few, highly differentiated chromosomes - previously classified either in cluster delta or unclassified. All 29 chromosomes within cluster beta belong to the paragroup E-M78*, which is relatively rare and almost exclusively restricted to a single geographic region (i.e. northern Africa), thus a common origin for at least a large part of these is likely.

Different scenarios characterize clusters delta and gamma.


The presence of three E-V13 chromosomes within cluster delta and the exclusion of some E-V12 and E-V22 chromosomes demonstrate that cluster delta cannot be regarded as a monophyletic unit.

As for cluster gamma, we have established close phylogenetic relationships of its members - now classified as E-V32 chromosomes - with those belonging to E-V12* within the E-V12 clade (Fig. 2). These relationships go undetected through the microsatellite network (red and pink chromosomes in Fig. 1B), most likely due to recurrent mutations at microsatellite loci. An alternative explanation would be that V12 is within a terminal branch of the Y chromosomes tree; moreover, it was never found by sequencing 18 Y*(xM78) chromosomes representing deep branches of the Y chromosome phylogeny (data not shown). Thus, the new markers we have detected now offer the opportunity to explore in a better defined phylogenetic context the origin and distribution of the chromosomes belonging to haplogroup E-M78.

Also it is worth noting that twelve chromosomes that we were unable to assign to any cluster in the previous network analysis are now classified within four different haplogroups/paragroups (see Table 2 and Fig. 1B) as highly divergent microsatellite haplotypes. Thus, even though they represent only 5% of the total E-M78 chromosomes analyzed, their inclusion into the respective haplogroups/paragroups **heavily affects inferences** about time and place of origin of these haplogroups/paragroups.

Finally, although there is a strong correspondence between cluster gamma - defined by the rare DYS19 11-repeat allele - and haplogroup E-V32, the presence, in cluster gamma, of haplotypes belonging to the binary paragroups E-M78* and E-V12* can only be explained by admitting either a paraphyletic or a polyphyletic origin for the chromosomes in the cluster.

Overall these findings indicate that caution is needed when using the microsatellite alleles as surrogates of UEPs (e.g. Malaspina et al., 2001; Cruciani et al., 2004; Di Giacomo et al., 2004; Sanchez et al., 2005). —
Cruciani et al., 2006; Molecular Dissection of the Y chromosome Haplogroup E-M78 (E3b1a): A Posteriori Evaluation of a Microsatellite-Network-Based Approach Through Six New Biallelic Markers


He correctly notes [in the very last piece of this extract] the futility of treating clusters as lineages, for reasons made apparent already in the body of the extract.

Essentially, as noted above, all 29 chromosomes in cluster beta fall into the E-M78* paragroup found predominantly in Northwest Africa, along with a few other chromosomes [from cluster gamma and from unclassified haplotypes]. In his 2007 study, however, Cruciani had this to say:

It is also worth noting that the rare paragroup E-M78* has not been observed in eastern Africa; moreover, the two north-western African E-M78* chromosomes are well differentiated from the two north-eastern African E-M78* chromosomes (supplementary table 1) adding a new argument for a higher haplogroup diversity in northern Africa. — Cruciani et al., Tracing Past Human Male Movements in Northern/Eastern Africa and Western Eurasia: New Clues from Y-chromosomal Haplogroups E-M78 and J-M12, 2007.

Thus, the paragroup essentially consists of chromosomes that share a TMRCA but did not test positive for any known downstream mutation. So the differences Cruciani saw here were likely differences between microsatellite clusters. A paragroup's membership can change, as it has in the case noted above, when new binary markers are discovered within it. The general assumption is that whichever population has the highest concentration of a clade's paragroup, taken together with considerable frequency and diversity, is most probably the 'representative' group in which the clade first arose and from which it then spread and was transmitted through the generations. Clades differentiated by microsatellite clusters in a paraphyletic group could therefore be seen in the context of having undergone differentiation at the microsatellite level, but likely before potential downstream UEP mutations were acquired in the respective clusters [i.e. polyphyletic clusters], that is - if we rule out independent parallel microsatellite mutations in the case at hand. So, generally speaking, a paraphyletic cluster of a clade, i.e. one with no known downstream mutations of its sub-clades, is viewed as ancestral - no news here.
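The paragroup idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the marker tree is partial and limited to the markers named in this post, V13 and V22 are treated as direct children of M78, and the function name `assign` is hypothetical, not anything from Cruciani's papers.

```python
# Partial marker tree (assumption): only markers named in this post,
# with V13 and V22 treated as direct children of M78.
CHILDREN = {
    "M78": ["V12", "V13", "V22"],
    "V12": ["V32"],
    "V13": [],
    "V22": [],
    "V32": [],
}


def assign(derived, root="M78"):
    """Return the most downstream label for a set of derived markers.

    A trailing '*' marks a paragroup: the chromosome is derived at that
    node but ancestral at every known child marker of that node.
    """
    if root not in derived:
        return None  # not an E-M78 chromosome at all
    node = root
    while True:
        hits = [child for child in CHILDREN[node] if child in derived]
        if not hits:
            break
        if len(hits) > 1:
            raise ValueError("derived at sibling markers: " + ", ".join(hits))
        node = hits[0]
    # Nodes with known children that were all ancestral are paragroups.
    return f"E-{node}*" if CHILDREN[node] else f"E-{node}"


for markers in ({"M78"}, {"M78", "V12"}, {"M78", "V12", "V32"}, {"M78", "V13"}):
    print(sorted(markers), "->", assign(markers))
```

Note that the label depends entirely on which downstream markers are known: adding a new child marker to the tree can move chromosomes out of a paragroup without anything about the chromosomes themselves having changed.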

Another example would be the V12* paragroup, wherein clusters have yet to be differentiated into clearly identified V12 sub-clades:

Taking into account the above data, the previously described European cluster alpha and the northern African cluster beta are indeed confirmed as monophyletic groups of chromosomes, that, very likely, have their own binary markers yet to be discovered. — Cruciani et al. 2006.

Why? Because, as noted in the extract, for example...

there is a striking correspondence between the microsatellite clusters alpha and gamma and the binary haplogroups E-V13 and E-V32, respectively (Fig. 1B). However, while all alpha cluster belong to E-V13, some E-V13 chromosomes are not contained in such a cluster. Conversely, all of the E-V32 chromosomes fall within cluster gamma (defined by the rare DYS19 11-repeat allele), but two gamma chromosomes are members of paragroups E-V12* and E-M78*. — Cruciani et al., 2006

This strong correspondence between certain clusters and binary markers, but NOT total correspondence, goes back to the point made about the polyphyletic clusters that Cruciani alluded to and that the present author briefly illustrated in a scenario above, concerning the paragroup.
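The difference between strong and total correspondence can be made concrete with set comparisons. The sketch below uses made-up chromosome IDs (not data from Cruciani et al.) arranged to mimic the three situations described in the extract; the function name `relation` and all IDs are hypothetical.

```python
def relation(cluster, haplogroup):
    """Describe how a microsatellite cluster and a binary haplogroup overlap."""
    if cluster == haplogroup:
        return "identical"
    if cluster < haplogroup:
        return "cluster nested in haplogroup"
    if haplogroup < cluster:
        return "haplogroup nested in cluster"
    if cluster & haplogroup:
        return "partial overlap"
    return "disjoint"


# Hypothetical chromosome IDs, chosen only to mirror the extract's wording.
alpha = {"c01", "c02", "c03"}
e_v13 = alpha | {"c04"}                  # some E-V13 lie outside alpha
gamma = {"c10", "c11", "c12", "c13"}
e_v32 = {"c10", "c11"}                   # other gamma members are E-V12* / E-M78*
delta = {"c04", "c20", "c21", "c22"}     # delta holds an E-V13 chromosome too
e_v22 = {"c20", "c21", "c30"}            # and one E-V22 sits outside delta

for name, cluster, hg_name, hg in (
    ("alpha", alpha, "E-V13", e_v13),
    ("gamma", gamma, "E-V32", e_v32),
    ("delta", delta, "E-V22", e_v22),
):
    print(f"{name} vs {hg_name}: {relation(cluster, hg)}")
```

Only the "identical" outcome would justify using a cluster as a stand-in for a binary marker; the nested and partial cases are the ones the extract warns about.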

Wherever V12 may have arisen [in Africa], it seems to be the lineage ancestral to E-V32, which accounts for almost all of the gamma-cluster chromosomes identified and is found mainly in East Africa, particularly in the African Horn.

Earlier, the present author wrote:

"Clades differentiated by microsatellite clusters in a paraphyletic group could therefore be seen in the context of having undergone differentiation at the microsatellite level, but likely before potential downstream UEP mutations were acquired in the respective clusters [i.e. polyphyletic clusters], that is - if we rule out independent parallel microsatellite mutations in the case at hand. So, generally speaking, a paraphyletic cluster of a clade, i.e. one with no known downstream mutations of its sub-clades, is viewed as ancestral - no news here."

And so, when looking at this...

there is a striking correspondence between the microsatellite clusters alpha and gamma and the binary haplogroups E-V13 and E-V32, respectively (Fig. 1B). However, while all alpha cluster belong to E-V13, some E-V13 chromosomes are not contained in such a cluster. Conversely, all of the E-V32 chromosomes fall within cluster gamma (defined by the rare DYS19 11-repeat allele), but two gamma chromosomes are members of paragroups E-V12* and E-M78*. — Cruciani et al., 2006

...since it appears that the clusters above are both monophyletic groups of chromosomes, it would appear that V13 was a potentially unique, single mutational event that occurred prior to the mutational events [in tandem-repeat nucleotide sequences] characterizing microsatellite cluster 'alpha', so that it would appear in association with two different microsatellite clusters. Essentially, V13 assumes a role similar to that of the "clade" which identifies the paraphyletic group in the above-mentioned scenario. Conversely, it would appear that cluster gamma came into being sometime after the M78 mutation but before the V12 mutation. This would likely explain why an M78 chromosome that appears to be devoid of any known mutation downstream of M78 would carry this microsatellite cluster, as do V12* and its derivative V32. The V12 chromosomes that have the delta cluster may well be the product of multiple mutational events, that is, convergent "parallel mutations". See:

The presence of three E-V13 chromosomes within cluster delta and the exclusion of some E-V12 and E-V22 chromosomes demonstrate that cluster delta cannot be regarded as a monophyletic unit

Now of course, in the event that the present author may have overlooked something here or there, the information herein is subject to modification.