Monday, September 12, 2016

Modern Southeast Asian Population Genetics


Above is the money chart from a recent paper in the European Journal of Human Genetics on population genetics in Island Southeast Asia prepared using four existing datasets and a new one focused on Southeast Asia. Hat tip to Bernard's Blog.

In broad outline, these results are consistent with the current paradigm and prior research in the area, although perhaps indicating a somewhat more complex mixture of ancestries than is often recognized.

In the K=9 run with the data, K=2 (blue) corresponds to East Asian, K=3 (red) corresponds to Papuan, K=4 (green) corresponds to South Asian, K=5 (light tan) corresponds to Austroasiatic, K=6 (darker brown) corresponds to Austronesian, and K=1, 7, 8 and 9 correspond to other global populations, which contribute essentially no ancestry to the region and are not found in more than trace amounts outside the legend.

Conventional wisdom, supported by archaeological evidence and by much greater Denisovan admixture in Papuans than in Austroasiatic populations, suggests that the Papuans are the oldest layer, followed early on in the Upper Paleolithic era (i.e. tens of thousands of years ago) by the Austroasiatic component. 

The "Austroasiatic" component isn't very fine grained in this analysis. It ignores distinctions that might exist between Austroasiatic, Tai-Kadai and Southern Chinese populations, and the analysis also does not make meaningful distinctions between the Northern Chinese and the Southern Chinese, even though both kinds of distinctions might plausibly be relevant to piecing together a historical narrative to fit the genetic picture.

Historically, the Austronesian expansion began around 3,000-4,000 years ago with one of the many indigenous populations of Taiwan, before multiple later waves of mass migration from mainland Asia to the island of Formosa, and concluded around 1,000 years ago, having reached as far as Madagascar, Hawaii and New Zealand. But subsequent migration to Taiwan has left a relict indigenous population of the Northern Philippines as the closest modern genetic match to the original Austronesian mariners whose technologies made the settlement of Oceania possible. The distribution of the Austronesian language family (illustrated below) corroborates the genetic and archaeological story. The genetic data indicate that the cultural impact of the Austronesians was outsized relative to their significant, but not overwhelming, genetic impact on Island Southeast Asia, as the Austronesians and other subsequent migrants almost completely wiped out the previous languages of the indigenous people of Island Southeast Asia outside Papua New Guinea.


Austronesian Language Map Per Wikipedia

Linguistically, the Austroasiatic languages (whose range is illustrated on the map below) are most familiarly represented by Vietnamese, even though Vietnam itself has a larger Han Chinese genetic component than it does an Austroasiatic one, due to migrations from Chinese rice farming populations into mainland Southeast Asia in the Bronze Age or later, very roughly contemporaneous with the Austronesian expansions.

These Chinese mass migrations, however, had far less impact on Island Southeast Asia than they did on mainland Southeast Asia. Other Austroasiatic languages are Khmer (a.k.a. Cambodian), Munda (spoken by some populations in Northern India who have ancestors in Southeast Asia), and Nicobarese (spoken in the Southern Andamanese islands, but not believed to be the language of the oldest layer of the indigenous Andamanese people, whose language is widely classified as a language isolate).


Austroasiatic Language Map Per Wikipedia

South Asian migration to the region dates to about 2,500 years ago, but is very modest outside geographically adjacent Burma (a.k.a. Myanmar). Most of this migration was limited to South Asian elites, who left only a small genetic footprint in the long run.

The supplementary materials also include an interesting TreeMix figure that highlights the ancestry of some notable negrito populations of the region which don't appear to be described in the figure above, although regrettably, the TreeMix figure is without a legend and the color scheme is not the same in this chart as in the one above from the same paper. The genetics of some of these populations are remarkably distinct from each other for populations that are often lumped in a single bucket by anthropologists.



The (closed access) paper and its abstract are as follows:

Alexander Mörseburg, et al., "Multi-layered population structure in Island Southeast Asians," European Journal of Human Genetics (15 June 2016).
The history of human settlement in Southeast Asia has been complex and involved several distinct dispersal events. Here, we report the analyses of 1825 individuals from Southeast Asia including new genome-wide genotype data for 146 individuals from three Mainland Southeast Asian (Burmese, Malay and Vietnamese) and four Island Southeast Asian (Dusun, Filipino, Kankanaey and Murut) populations. While confirming the presence of previously recognised major ancestry components in the Southeast Asian population structure, we highlight the Kankanaey Igorots from the highlands of the Philippine Mountain Province as likely the closest living representatives of the source population that may have given rise to the Austronesian expansion. This conclusion rests on independent evidence from various analyses of autosomal data and uniparental markers. Given the extensive presence of trade goods, cultural and linguistic evidence of Indian influence in Southeast Asia starting from 2.5 kya, we also detect traces of a South Asian signature in different populations in the region dating to the last couple of thousand years.

UPDATE September 14, 2016, an abstract from the ISBA7 PaleoBarn Conference via Eurogenes:
Origins and genetic legacy of the first people in Remote Oceania, Skoglund et al.
The appearance of people associated with the Lapita culture in the South Pacific ~3,000 years ago marked the beginning of the last major human dispersal to unpopulated lands, culminating in the settlement of eastern Polynesia ~1,000-700 years ago. However, the genetic relationship of these pioneers to the long established Papuan peoples of the New Guinea region is debated. We report the first genome-wide ancient DNA data from Asia-Pacific region, from four ~2,900 to ~2,500 year old Lapita culture individuals from Vanuatu and Tonga, and co-analyze them with new data from 356 present-day Oceanians. Today, all indigenous people of the South Pacific harbor a mixture of ancestry from Papuans and a population of East Asian origin that we find to be a statistical match to the ancient Lapita individuals. Most analyses have interpreted the ubiquitous Papuan ancestry in the region today-at least 25%-as evidence that the first humans to reach Remote Oceania and Polynesia were derived from mixtures near New Guinea prior to the Lapita expansion into Remote Oceania. Our results refute this scenario, as none of the geographically and temporally diverse Lapita individuals had detectable Papuan ancestry. These results imply later major human population movements, which spread Papuan ancestry through the South Pacific after the islands' first peopling.

5 comments:

terryt said...

Thanks for posting this. I've seen very little new stuff on SE Asia recently.

"followed early on in the Upper Paleolithic era (i.e. tens of thousands of years ago) by the Austroasiatic component".

Unlikely it was anywhere near that long ago. More like ten thousand years ago. Any longer than that and it would be impossible to discern Austroasiatic as a 'language family'. Other evidence too suggests Austroasiatic is nowhere near as old as the Upper Paleolithic.

andrew said...

The Austroasiatic language family may very well not be that old.

But, the genetic component called "Austroasiatic" is old and is carried by bearers of that in Burma speak a language in the same language family as Chinese, the bears of that in Thailand who speak the Thai-Kendai language, and the bearers of that in Island SE Asia speak Austronesian languages, in addition to Austronesian language speakers. So, it almost surely predates the LGM which created the land bridge that connects the parts of Island SE Asia to the mainland up to the Wallace Line.

I'd estimate that this genetic component is more than 20,000 years old and less than 50,000 years ago, with a best guess around 30,000 something based upon uniparental papers about the region that I read eons ago and don't have at my fingertips.

terryt said...

"But, the genetic component called 'Austroasiatic' is old"

Perhaps. But that is quite possibly because the language is much younger than the genes. We know that the Mongoloid phenotype was introduced to SE Asia no earlier than around 10,000 years ago and the Y-DNA O group are the only possible candidates for such an introduction.

"carried by bearers of that in Burma speak a language in the same language family as Chinese"

True. But interestingly Sino-Tibetan is primarily associated with what used to be called Y-DNA O3, which is absent on the Andaman Islands. And, once more, Sino-Tibetan cannot be particularly ancient or it would be unrecognisable as a family. Obviously something strange is going on, presumably because movement in SE Asia was much more complicated than is generally assumed.

"the bears of that in Thailand who speak the Thai-Kendai language"

I've not heard of a Thai-Kendai group, but Tai-Kadai is generally considered to be related to Austronesian, not Austroasiatic. Again we see complications in connecting language and genes.

"the bearers of that in Island SE Asia speak Austronesian languages"

All of which goes to show that the languages and the male haplotypes were basically imposed on a pre-existing SE Asian population with enough genetic change to induce a change from a 'Papuan' to a 'Mongoloid' phenotype.

"I'd estimate that this genetic component is more than 20,000 years old and less than 50,000 years ago"

There were obviously humans through SE Asia long before 50,000 years ago as the Australians and Papuans must have arrived via that region. I agree that the Andamans may have been settled around 30,000 years ago but that leaves the explanation for the language up in the air. But there had been a huge amount of human movement back and forth through the neighbouring mainland by the time they were settled.

"So, it almost surely predates the LGM which created the land bridge that connects the parts of Island SE Asia to the mainland up to the Wallace Line".

Austroasiatic and, certainly, Austronesian both postdate the single landmass that was Sundaland. And SE Asia probably formed a single landmass several times before the LGM. In fact I would guess the first people to reach Australia arrived at a time when Borneo was connected to mainland SE Asia, and that was at least 50,000 years ago. However the Melanesians may be a bit later and arrived once Borneo had largely been isolated to at least some extent. The uniparental papers from years ago have almost certainly been outdated as they usually made the assumption that Y-DNA O was Paleolithic in the region and all the haplotypes had arrived together. Today the evidence indicates a much more complicated history of the region.

capra internetensis said...

The Y marker most associated with Austro-Asiatic is O2a1-M95. Ethnic Burmese have about 20%, and that is the low end. The Thai sample I know of had 43%, while men from Java and Bali had 49% and 57%.

According to Y-Full O2a1 is about 10 700 years old. However, the large majority of Southeast Asian O2a1 appears to belong to two subclades, O-M88 and O-F789/M1280, which are only about 5000 years old. All of the handful of South Asian and Island Southeast Asian O2a1 full sequences I know of belong under F789, with M88 found mostly in Southern China and Vietnam.

Though there isn't enough data to say for sure at this point, it seems likely enough that the main subclades of O2a1, Austro-Asiatic languages, and a substantial characteristic autosomal component all spread together with the Neolithic of Southeast Asia about 5000 years ago, though of course we wouldn't expect them all to have gone together only and at all times.

terryt said...

"According to Y-Full O2a1 is about 10 700 years old. However, the large majority of Southeast Asian O2a1 appears to belong to two subclades, O-M88 and O-F789/M1280, which are only about 5000 years old".

But, of course, that just indicates the time of expansion, not the time when they first separated from other O2a haplotypes. I would guess that Austroasiatic in SE Asia could well be as old as 10,000 years. But I think an earlier presence is very unlikely. On the other hand the 5000 year date is more than just a little interesting as it coincides quite closely with the probable expansion of Austronesian O1a's movement from Taiwan to the Philippines. Obviously haplotypes are not indelibly associated with particular language groups, but the initial association seems very strong in the case of East/SE Asia.

"All of the handful of South Asian and Island Southeast Asian O2a1 full sequences I know of belong under F789, with M88 found mostly in Southern China and Vietnam".

Thanks for that information.

"Though there isn't enough data to say for sure at this point, it seems likely enough that the main subclades of O2a1, Austro-Asiatic languages, and a substantial characteristic autosomal component all spread together with the Neolithic of Southeast Asia about 5000 years ago"

Ahh. I very much agree. In fact that's what I have been claiming for quite some time, and even been banned from Maju's blog for insisting on it being very likely so. But further, I would expect that O3's expansion, as well as that of O1, also contributed to the southward movement of the Mongoloid phenotype.

In passing I note that ISOGG has now joined O1 and O2 together as O1a and O1b respectively, making O3 now O2. I think such a connection has long been suspected.