Wednesday, May 2, 2012

The Latest Y-DNA Phylogeny

A new methodology has produced the latest Y-DNA phylogeny for modern humans.  As Dienekes explains, the gist of the difference is that a much more complete, and hence more stable and robust, set of data about Y-DNA haplogroups is used to make mutation rate based age estimates for those haplogroups: 1000 Genomes data from 526 individuals (only men have Y-DNA, so only about half of the 1000 genomes can be used to look at Y-DNA).

Particularly notable is the fact that the first back-of-the-napkin estimates made using this method have Y-DNA haplogroup A (associated mostly with Paleoafrican relict populations) splitting from the rest of the modern human Y-DNA tree at about the same time as the mtDNA tree starts to split (about 140,000 years ago) and about 85,000 years before the rest of the modern human Y-DNA tree starts to split up (starting about 55,000 years ago).  In contrast, Y-DNA haplogroup B splits off less than about five thousand years before DE and CF break off into distinct phylogenetic branches.  All of the absolute dates are subject to uncertainties in a handful of parameters, like an assumed constant mutation rate and an assumed generation length, that haven't been very precisely calibrated.
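To make that dependence on the assumed parameters concrete, here is a minimal sketch in Python of how a strict molecular clock date is computed.  This is not the method used in the study; the function name and every number in it are hypothetical and chosen only for illustration.

```python
def split_age_years(snp_differences, sites_compared,
                    mutations_per_site_per_generation,
                    years_per_generation):
    """Rough age of a split between two lineages under a strict clock.

    Every argument is an assumed input; none comes from the study.
    """
    # Fraction of compared sites at which the two lineages differ.
    per_site_divergence = snp_differences / sites_compared
    # Mutations accumulate on both branches since the split,
    # hence the factor of 2 in the denominator.
    generations = per_site_divergence / (2 * mutations_per_site_per_generation)
    return generations * years_per_generation


# Made-up inputs: the same SNP counts under two assumed mutation rates.
slow_clock = split_age_years(100, 1_000_000, 1e-8, 25)
fast_clock = split_age_years(100, 1_000_000, 2e-8, 25)
print(round(slow_clock), round(fast_clock))
```

Doubling the assumed mutation rate halves the estimated age, and the estimate scales linearly with the assumed generation length, which is why poorly calibrated values for either one shift every absolute date on the tree while leaving the branching order untouched.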

I'm also a bit skeptical of the big conclusions reached in this first stab at absolute dating based on SNP mutations, because the data set contains only one Y-DNA haplogroup A individual and only two Y-DNA haplogroup B individuals.  Even with the very rich data provided by more than eighteen thousand SNP data points per individual in the 1000 genomes data, small sample sizes always leave a certain amount of room for doubt, and a fluke individual can distort the data.

It is also worth recalling that Y-DNA is subject to pretty powerful selective pressure, and that mutation rate dating in Y-DNA has already been cast deeply into doubt once for a variety of reasons.  Some of those reasons, like differential mutation rates between STR loci, have the potential to be relevant in SNP dating as well, even if the new measure should theoretically be more robust.

And, as I understand the matter, not all of the Y chromosome is non-recombining.  It isn't clear to me whether this data includes exclusively non-recombining parts of the Y chromosome.  Particularly in the case of Paleoafricans, where there have been some hints of cryptic admixture with one or more unknown archaic hominins, a purely tree-like phylogenetic mutation rate dating approach may be inappropriate to the extent that the Y-DNA data contains recombining parts of the Y chromosome.  This is less of a concern with non-African, non-Melanesian individuals, since Neanderthal admixture is roughly identical across, and at a fixation level within, this population.

It is reassuring, however, that the tree produced with this independent SNP phylogeny is almost precisely the same as the one produced by the old method in how it classifies individuals into particular branches.
