Thursday, October 19, 2017

Insights From Reading Selected ASHG 2017 Abstracts

The abstracts of the presentations at the 2017 American Society of Human Genetics conference in Orlando, Florida, which is currently in progress, are available here. The listing sorts by the first digit of the paper number so that, for example, 2, 20, 201 and 2001 are all adjacent to each other. Plenary and platform talks have numbers up to 372. Higher numbers are poster presentations. Some notable papers:

1 African genes governing skin color

1437 Han Chinese genetics overview

2301 Russian haplotypes by geographic area (probably not, or at least not just, uniparental haplotypes).

2304 Two regional sources for mtDNA lineages in Austronesians. There is a narrative buried in there that deserves to be teased out at some point if I get a chance. The abstract is as follows:
Accumulated archaeological, linguistic and genetic evidences suggest that modern Austronesian (AN)-speaking Melanesians are derived from the admixture of indigenous Non-Austronesian- (NAN-) speaking people in Near Oceania and Austronesian- (AN-) speaking people from Southeast Asia. In this study, we analyzed mitochondrial DNA (mtDNA) polymorphisms in D-loop region for two AN-speaking Melanesian populations (Munda and Kusaghe) and an AN-speaking Micronesian population (Rawaki) from New Georgia Island in the western Solomon Islands to trace the maternal lineage of AN-speaking Melanesians. The major mtDNA haplogroups in these three populations originated in Asia and the ‘Polynesian motif', which is well-characterized mtDNA marker for Polynesians, was frequently observed in the two AN-speaking Melanesian populations but not in the AN-speaking Micronesian population in New Georgia Island. Principle component analyses also revealed genetic proximity between AN-speaking Melanesians (Munda and Kusaghe) and AN-speaking Polynesians in Tonga. These results suggest that Polynesian ancestors have considerably contributed to maternal gene pool of AN-speaking Melanesians in the Solomon Islands before their expansion to Remote Oceania.
2305 Ancestral genetic components in Arabians, with attention to Natufians benchmarked against ancient DNA and to a Persian Gulf component that doesn't currently have an ancient DNA counterpart. It isn't clear if the Persian Gulf ghost population component is related to Caucasus hunter-gatherers and the Caucasus early Neolithic, or if it is a third thing entirely. The abstract is as follows:
The extreme weather events of the Younger Dryas that left Arabia and most of the southern Levant dry drove Natufian expansion from the southern Levant to the Negev, North East Levant, and Mesopotamia while other cultures emerged in the Zagros and in Anatolia. The first Sumerian city states were established around 6 BCE, and the Dilmun culture in Arabia emerged as a major trading power connecting Sumer to the Harrupan civilization. Besides material remains, many of these cultures, including Natufians, left human remains whose genetic imformation provides geographical and chronological records of human migration and admixture starting nearly from the time of post glacial expansions to modern times. In this study, we sought to explore the population genetics of the Arabian Peninsula, identify likely ancient source populations, and investigate evidence of ancient migrations and admixture using PCA, ADMIXTURE, f3, ALDER, and LDA. ADMIXTURE identified two major contributors to the modern Arabian peninsula populations: an early expansion marked by Natufian aDNA samples, and a component Persian Gulf associated component. PCA distinguishes between the Natufian genetics and the Arabian Pensinsula group that ADMIXTURE combines. Gumuz and Somali admixture impacts the Peninsula, Egypt, and the Levant. Admixture between the Natufian genetic component and East Africa show significant negative f3 statistics admixing in Egypt, and similarly Egyptian and the Natufian genetic component admixing in East Africa in both directions. Ancient Anatolian genetics have penetrated the study region except for Yemen according to f3. While we found little gene flow as a candidate for Indo-European admixture in any direction, we did see patterns of expansions carrying Natufian genetics that marked expansion and admixture aligns with that of the proto-Semitic language.
2308 Peruvian genetics showing more diversity of indigenous ancestry than previous studies. Previous studies have shown very little continuity between ancient DNA and modern DNA in the region suggesting near total replacement in the post-Columbian era. But, this study explains why it was able to get a much richer data set, so the lack of continuity may have been, in part, a function of sampling issues in the earlier studies. The abstract is as follows:
INTRODUCTION: There are a limited number of studies from Latin America that have included native populations mainly due to geographic limitations and ethical considerations when interacting with native communities to enroll them in research studies. This project, initiated by the Peruvian National Institute of Health in 2010, represents the first major effort at studying specific native and mestizo communities across Peru, including demographic history, and population migration patterns forming the basis of precision genomic medicine for the Peruvian people. A total of 953 participants from 30 communities (13 mestizo and 17 native) were enrolled in this study, of which we have generated genotype array data for 171 and high coverage whole genome sequence data on 150. 
RESULTS: Our cohort of individuals, including the mestizos, has a much greater contribution from Native American ancestry than previous large-scale sequencing studies. We were able to construct a migration and diversity topography of Peru, which revealed that major cities harbor a high degree of genetic diversity independent of European contributions. Further, using identify-by-descent networks, we illustrate that during pre-Inca, Inca, and Spanish administration the Andean region was central to the population structure of Peru, while post-Spanish independence, the population dynamics seem to shift towards the coast, consistent with history. This is also consistent with high altitude adaptation leading to reduced gene flow into the Andes. We also demonstrate fine-scale population structure within the mestizo communities, identifying admixture between Native American communities in addition to an increase European contribution, both of which need to be considered when performing GWAS and personalized medicine programs. 
CONCLUSION: The genetic background of Native Americans has spread worldwide including in United States. It is expected that almost one million immigrants will come from Andean countries with high levels of Native American background. This population will be classified as “Hispano/Latino” in many studies in United States, but this group is a mixture of individuals with different levels of Native American ancestry and from different Native American sources. For this, we need to better understand the genetic variation and architecture of these native populations, migration patterns, and ethno geographical studies.
2309 Admixture between Finns and Estonians. The abstract is as follows:
Ancestry information at the individual level is a resource for personalized medicine, demographical and history research, and for tracing back personal history. We report a new method for quantitatively determining personal genetic ancestry based on genome-wide data.

Numerical ancestry component scores are assigned to individuals based on comparisons with reference populations. These comparisons are conducted with an existing analytical pipeline making use of genotype phasing, similarity matrix computation and our addition - multidimensional best fitting by MixFit. The method is demonstrated by studying Estonian and Finnish populations in geographical context.

We show differences in the genetic composition of these close European populations and how they have influenced each other. We determine ancestry component distribution by geographical region for Estonia and Finland to highlight how these populations have interacted with their neighbors. Sorting the individuals by the birth date allows investigation of time-dependent trends in ancestry component distribution. We perform association analyses between ancestry components and anthropometric traits and report several associations.

Our analytical methods apply to studying specific individuals but can be extended to population studies. We map the ancestral composition of Estonia and Finland in the geographical and historical context. The analytical pipeline has been published, MixFit is available at www.geenivaramu.ee/en/tools/mixfit.
2310 Mongolian DNA and Mongolian introgression into Finns ca. 13th century CE. Eurogenes discusses the abstract here. Basically, he argues convincingly that a direct Mongolian to Finnish introgression of the magnitude suggested is impossible in historical terms and contraindicated by uniparental genetics. Instead, the Finns likely have Siberian autosomal genetic admixture that also appears in the putative source population. At fault is a lack of sufficient interdisciplinary background among the geneticists, who would otherwise have realized the folly of the hypothesis as suggested and proposed a more plausible narrative. This lack of interdisciplinary background seems to be a particular problem among Chinese geneticists. The abstract is as follows:
Here, we present the whole-genome sequencing data for 386 Mongolian individuals which mainly composed of the Buryats and the Khalkha Mongols by average sequencing depth of 17X. We discovered 3.8 million novel single nucleotide polymorphisms (SNPs) which were not previously reported. Moreover, 965,663 SNPs which were rare (minor allele frequency < 0.5%) in the 1000 genome project phase 3 were low frequency or common in our dataset. Series of analysis demonstrated distinctive population structure of the Mongolians from the East Asians. We constructed robust imputation panel for Northern Asian populations which produces great imputation accuracy for rare (mean r2 of 0.82) and low frequency (mean r2 of 0.87) SNPs. We identified significant gene flow from the Buryats to the Finnish which was predicted to be occurred in 1,228 (±87) year. Moreover, 13.38% of Buryat admixture was predicted in the Finnish genome. In summary, this study illustrated advantage of whole-genome sequencing to build reference panels and to study population history.
2311 Estonian population genetics. The main distinction within Estonia is between maritime and inland populations.

2313 Malaysian indigenous populations rapidly grew more distinct due to natural selection, despite their recent (ca. 2000-4000 years) divergence.

2314 In Kashmir, India, contrary to the male dominated migrant assumption, there is great mtDNA diversity. Kashmir's location puts it directly in the path of many historic folk migrations. This is in contrast to other areas in India where mtDNA appears far less changed, even at a quite deep time depth, than Y-DNA, which shows male dominated influence from Indo-Aryans. One could probably use distinctions between Y-DNA and mtDNA mixes on a geographic basis to show different modes of migration/conquest. Steppe Indo-Europeans were patrilocal with high levels of female mobility at marriage over considerable distances. Kashmir may have been within the territory where this pattern prevailed, while Southern India may have had more of a male dominated military conquest model.

2315 Census population v. effective population size in two Lithuanian subpopulations, which genetic methods inaccurately assume are closer to each other than they are in reality.

2316 Evidence of natural selection related to disease resistance in Chileans

2317 Fine scale population structure in Han Chinese populations focused on haplogroups related to diseases with a 20,000 person sample size.

2322 Sardinia's population genetic stability since the Neolithic is confirmed with a set of 26 ancient DNA samples spanning the intervening time periods (Neolithic, Copper Age, and Bronze Age). Sometimes pots are people and sometimes they aren't. The abstract is as follows:
Ancient DNA (aDNA) has provided a powerful tool for assessing the temporal stability of populations within a geographic locale. In much of mainland Europe, aDNA has revealed a relatively dynamic population history from the Neolithic period through the Bronze Age. Due to their relative isolation, island populations may in general not experience the same population dynamics as mainland populations. The island population of Sardinia in particular has been hypothesized to have a relatively stable and continuous population from the early Neolithic, largely on the basis of modern Sardinian DNA. Here we directly assess continuity using genome-wide capture data (~1.2 millions SNPs) of 26 ancient humans from the island of Sardinia spanning the Neolithic, Copper Age, and Bronze Age, including individuals from the Nuragic culture. Through analyzing read-level DNA damage patterns and estimating modern contamination levels, we authenticate these ancient DNA sequences and removed outlier loci and individuals for downstream analyses. Projecting these ancient individuals on to modern axes of genetic variation, as defined by principal component analysis on a large-scale reference dataset of modern human populations from Sardinia, Europe and the Middle East, reveals no obvious temporal structure within Sardinia within this long time frame. Consistent with previous hypotheses of early migrations in Europe, we observe clustering of these ancient individuals with previously published sequences of ancient humans associated with an 'early farming' Neolithic culture. Through the application of multiple population genetic methods and exploratory data analysis tools we find that relative to mainland Europe there has been population stability within the island of Sardinia. Beyond shedding light on Sardinian population history, the relative stability we infer is important for understanding the local frequencies of disease susceptibility alleles.
2323 Decline in effective population and uniparental haplotype diversity in Native Americans quantified. Genetic diversity declined by less than census population (on the order of a 50% loss of diversity v. a 90% loss of census population), as one would reasonably expect. This is a good data point for comparing predictions about losses of genetic diversity in human populations during bottlenecks against a real-world example.
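
To illustrate why diversity is expected to fall more slowly than head count, here is a minimal sketch using the textbook drift approximation for expected heterozygosity in a constant-size population, H_t = H_0 * (1 - 1/(2N))^t. The starting heterozygosity, population sizes and number of generations are hypothetical values chosen for illustration; none of them come from the abstract, which concerns uniparental haplotype diversity rather than autosomal heterozygosity.

```python
def expected_heterozygosity(h0, effective_size, generations):
    """Expected heterozygosity after genetic drift at constant effective size.

    Uses the standard approximation H_t = H_0 * (1 - 1/(2N))^t.
    """
    return h0 * (1.0 - 1.0 / (2.0 * effective_size)) ** generations


# Hypothetical inputs, for illustration only (not taken from the abstract).
h0 = 0.30                 # assumed starting heterozygosity
generations = 20          # assumed time since the collapse
ne_before = 10_000        # assumed effective size without a collapse
ne_after = ne_before // 10  # a 90% reduction in effective size

no_collapse = expected_heterozygosity(h0, ne_before, generations)
with_collapse = expected_heterozygosity(h0, ne_after, generations)

print(f"fraction retained without collapse: {no_collapse / h0:.4f}")
print(f"fraction retained with 90% collapse: {with_collapse / h0:.4f}")
```

Under these assumptions the fraction of diversity retained stays far above the 10% of the population that survives. Measures that count distinct haplotypes are more sensitive to the loss of rare lineages, so a steeper decline in haplotype diversity than in heterozygosity would not be surprising.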

2324 Origins of genetically Polynesian people in Brazil, who died before the 19th century and prior to documented migrations, redetermined using better reference sets. The original determination had been the Cook Islands. The abstract doesn't tell us what the new data concluded, just that the old determination was limited by its small data set. I do not like it when abstracts tease like that, rather than spilling the beans. The abstract is as follows:
In 2010, mitochondrial DNA extracted from two ancient skulls found in southern Brazil belonged to the haplogroup B4a1a1a, exclusively found in Polynesia. Radiocarbon analyses indicated that these individuals most probably died before the 19th century, prior to any registered transport of Polynesian people to South America by European vessels. Further genome-wide analyses showed a complete Polynesian ancestry for both samples, with Cook Islands as the closest source population. However, scarcity of genotyping data from modern Polynesian populations posed a major limitation for inferring a more specific place of origin for said skulls. Here, we re-analyze these ancient DNA samples using an extended reference panel that comprises over 475 genotyped samples from 18 different locations across the Pacific Ocean. With this data we explore the genetic affinities of the Botocudo skulls at a finer scale to potentially pinpoint their genetic origin, and we demonstrate the importance of assembling diverse genetic reference panels to shed light on the evolutionary past of human remains devoid of archeological context.
2353 New determination of overall mutation rate in humans. The 1.2 figure reached is on the low side of the range of estimates made in the past.

2354 Mutation rates vary across genome by type of gene. This poster presentation is more persuasive than 2353 in my view, in accord with what I know about past research on mutation rates.

2357 Easter Islanders have pre-Columbian Native American ancestry, but it arrived indirectly, via the other Polynesian islands from which Easter Island was settled, rather than through direct contact. This is at least the third investigation on the topic, with one prior study concluding that there was direct pre-Columbian Native American introgression into Easter Islanders and a subsequent paper that is basically a rebuttal arguing that the introgression was post-Columbian. The abstract is as follows:
Reaching its easternmost extent in Easter Island less than one thousand years ago, the settlement of Polynesia represents a final chapter in the dispersal of humans across the globe. Though it occurred relatively recently, much remains unknown about this unique oceanic process in historical population genetics, including the sequences of islands settled, the timing of settlements, and the origins of the settlers. Using dense genome-wide SNP array data of 445 modern samples from seventeen key islands (or island clusters) spanning remote Oceania, we infer settlement patterns stretching across the Pacific to Easter Island, and we address the long lingering suggestion that Native Americans could have played a role in the ancient population history of Easter Island. We confirm this theory, but we show that the ancient contact with Native Americans took place in Polynesian “up stream” of Easter Island, that is, before its settlement, and that this component was then carried to Easter Island by its admixed founders. We use three different timing analyses, as well as evidence of differential origins within the Americas of distinct Native American genetic signatures that we find present in Polynesia, to support our conclusions. The isolated Polynesian islands, separated by vast Pacific distances, have provided us with a uniquely structured canvas on which to implement novel variants of ancestry deconvolution and migration analysis techniques; we will describe these techniques, which could be useful for analyzing similarly isolated populations from other understudied regions of the world. Our results demonstrate the important role that both recent and ancient admixture events have played in creating the rich diversity patterns that define modern Polynesian populations.
2359 Lakshadweep Islands off the coast of India were settled mostly from Kerala state. The abstract is as follows:
Archipelago Lakshadweep resides in a south-west part of India in the Arabian Sea. In addition to its geographical isolation, the gene pool of these islands encompasses the signatures of ancient human dispersal across the South Asian (SA) corridor. In order to reconstruct the population history of Lakshadweep population, we have analysed uniparental (mtDNA and Y chromosome) and biparental (750K autosomal loci) markers among 1359 individuals belonging to several ethnic groups of Lakshadweep Islands. We observed the overwhelming presence of mitochondrial haplogroup R30, whereas the Y chromosome major haplogroups were P267-R2a (16%), Y495-R1a2b (12%). Both mtDNA and Y chromosome signals showed a close genetic link between Lakshadweep populations with the mainland Indian populations. The allele frequency and haplotype-based autosomal analyses suggested their closest affinity with the Southern Indian state Kerala.
2360 Japan has four genetic clusters: one associated with Japanese language speakers, one associated with Ryukyuan language speakers, one Korean, and one Amami (an island chain between the main islands of Japan and Okinawa). Surprising. I expected an Ainu cluster. I also expected a division within Japan between an Ainu-shifted Northern Japanese cluster and a Southern Japanese cluster, regions which were integrated under the central government at different times. And, I am surprised that Ryukyuan and Amami are not in the same cluster. The abstract is as follows:
Aims the Japanese population has been known to be grouped into super-clusters, which are Mainland and Ryukyu clusters, in the population genetics. Investigation of the super-clusters in the detail was very important to perform genetic disease association studies and reveal demographic history in the Japanese population. We examined the Japanese population substructure using genome wide genotyping data from the Japan Multi-Institutional Collaborative Cohort Study (J-MICC), including south west islands of Japan. 
Methods A total of 14,539 study subjects from the 12 areas of the J-MICC study including the Ishigaki and Amami Islands were genotyped at RIKEN Center for Integrative Medicine using a HumanOmniExpressExome-8 v1.2 BeadChip array. Subjects with discordant sex information and close relationship pairs were removed. First, Principal component analysis (PCA) with J-MICC subjects, 1000 genomes East Asian (EAS), and Pan-Asian SNP (PASNP) consortium genotype data were performed to examine the population substructure of east Asian including J-MICC subjects. We also performed ADMIXTURE analysis for J-MICC subjects. 
Results PCA with J-MICC subjects, 1000 genomes EAS, and Pan-Asian EAS showed three clusters (Mainland, Ryukyu, and Korean clusters) from J-MICC subjects. Of the J-MICC subjects, 0.3% were assigned to the Korean cluster. PCA with only J-MICC subjects further indicated the Amami cluster. The Amami cluster existed closest to Ryukyu cluster, but didn’t exist closest to Korean and Taiwan populations. The result of ADMIXTURE analysis revealed that the minimum value of the cross-validation (CV) error was observed when the assumed number of ancestral populations (K) was 8. 
Conclusion We examined the population substructure of J-MICC study, and identified four sub-clusters (Mainland, Ryukyu, Amami, and Korean clusters). ADMIXTURE analysis indicated that the K=8 was minimum value of the CV error. Thus, the result suggests the possibility that more genetic clusters exist in Japanese population. These results are expected to be important information to do genetic disease association studies with the use of the Japanese population data.
2362 Genetics of Central Mexico. There was less European ancestry in a rural Mexican town than in urban areas, and there was only one admixture event with Europeans there. The abstract is as follows:
The Spanish colonization of Mexico led to the creation of new communities containing individuals who had been born in three different continents (North America, Europe, and Africa) and who came from diverse cultural and linguistic backgrounds. Major population collapses also occurred in indigenous communities following colonization of this region. While previous research has explored these events in Mexico City and other urban areas, their impact on rural populations remains unclear. In Xaltocan, a small town in central Mexico, historical documents suggest that the Spanish rarely visited the town. To better understand the demographic history of this region and the genetic effects of Spanish colonial history, we collected samples from 47 present-day residents from Xaltocan. The samples were genotyped for >600,000 genome-wide single nucleotide polymorphisms (SNPs) using the Affymetrix Axiom® Human Origins Array. Individual ancestry estimates were calculated using ADMIXTURE and RFMix. Estimates of the number and timing of admixture events at Xaltocan were calculated using TreeMix and Tracts. Past changes in population size were modeled using dadi. We find a much lower average proportion of European ancestry at Xaltocan compared to previously sampled populations in the Americas. We also find evidence that a single admixture event between European and Native American source populations affected our study population at Xaltocan.
2365 Genetics of Brittany - matches historic rather than current boundaries. French DNA studies are rare for a country at its level of economic development, so this fills an important gap. Brittany is also historically important as a source of Norman invaders to England and of Norman troops who went to Italy and participated in the Crusades. It is also a good illustration of the fact that political boundaries influence mating patterns in scientifically demonstrable ways. The abstract is as follows:
Background 
The genetic structure of human populations varies throughout the world, being influenced by migration, admixture, natural selection and genetic drift. Characterising such genetic variation can provide insight into demographical history and informs research on disease association studies, especially on rare recent variants. In this study, we examine the fine-scale genetic structure of Brittany and surrounding regions of France. 
Brittany is a region in the north-west of France, historically and culturally distinctive. Genetic proximity between Bretons and Irish has been shown in [1]. Currently, administrative Brittany covers only 80% of historical Brittany. Southern limits of historical Brittany extend further than the Loire River, the biggest physical barrier in the region. Eastern limits do not coincide with any significant geographical feature, potential genetic barrier could be thus a result of cultural and historical differences. 
Methods and Results 
We genotyped 1005 individuals from North-Western France, with at least three of their grandparents born within a 15 kilometres distance using Axiom™ Precision Medicine Research Array. Principal Components analysis revealed a high correlation between geographical position and components (p-value < 2e-16). Visualisation of PC1 (0.16 % of variance) on the map points to three subpopulations: one in the south of Loire River and two in the north, one of which overlaps with historical Brittany. Partial Mantel tests confirm that genetic differentiation is not uniform. We also approximate eastern border of “genetic Brittany” based on ADMIXTURE results and test the strength of the barriers with Fst statistic. Southern border, corresponding to Loire River, is more pronounced. 
Conclusion 
We here report both evidence for isolation by distance within at a very fine level and existence of two genetic barriers, the Loire River and the historical boundary of Brittany Duchy. Subsequently, we will verify and extend our findings with fineSTRUCTURE software and with analysis based on Identity By Descent. This fine-scale population structure may have consequence in association analyses, especially for rare variants which tend to be geographically clustered. These results support the need for a genetically matched panel of controls in gene mapping analyses in French population. 
1. Karakachoff, M. et al. Fine-scale human genetic structure in Western France. Eur J Hum Genet 23, 831–836 (2015).
2366 Genetics of Ireland - about 70% of the territory is mostly Gaelic, and about 30% shows admixture, with Norman and Norwegian ancestry being most notable. Lines up with historical events.

2368 Genomic health of ancient hominins. The abstract is as follows:
The genomes of ancient humans, Neandertals, and Denisovans contain many alleles that influence disease risks. Using genotypes at 3180 disease-associated loci, we estimated the disease burden of 147 ancient genomes. After correcting for missing data, genetic risk scores were generated for nine disease categories and the set of all combined diseases. These genetic risk scores were used to examine the effects of different types of subsistence, geography, and sample age on the number of risk alleles in each ancient genome. On a broad scale, hereditary disease risks are similar for ancient hominins and modern-day humans, and the GRS percentiles of ancient individuals span the full range of what is observed in present day individuals. In addition, there is evidence that ancient pastoralists may have had healthier genomes than hunter-gatherers and agriculturalists. We also observed a temporal trend whereby genomes from the recent past are more likely to be healthier than genomes from the deep past. This calls into question the idea that modern lifestyles have caused genetic load to increase over time. Focusing on individual genomes, we find that the overall genomic health of the Altai Neandertal is worse than 97% of present day humans and that Ötzi the Tyrolean Iceman had a genetic predisposition to gastrointestinal and cardiovascular diseases. As demonstrated by this work, ancient genomes afford us new opportunities to diagnose past human health, which has previously been limited by the quality and completeness of remains.
2724 Brazilian slaves were almost all from Benin or of Bantu origin, with further detail about the sources in the West African interior of the individuals who passed through the Benin ports.

2946 Knockout disorders in Ashkenazi Jews informed by population history, including the 13th century bottleneck in that population (presumably due to pogroms related to the Crusades, although this isn't entirely clear and disease and the Little Ice Age could also have contributed either directly, or by triggering the pogroms, or both).

4 comments:

Ebizur said...

"2360 Japan has four genetic clusters, two per language, plus Korean and Amani"

Amani must be an error for Amami. That is a name for a group of islands located in the sea between Kyūshū and Okinawa and for a Japonic language (or, if one is a lumper rather than a splitter, a very isolated and divergent Japanese dialect that itself has significant internal variation) spoken by natives of those islands.

To which "two languages" have they referred with the phrase "four genetic clusters, two per language"? Since they seem to have mentioned Amami as a separate genetic cluster, I suppose two of them should be closer to mainstream Japanese than the people of Amami, so perhaps Eastern (Kantō, etc.) Japanese vs. Western (Kansai, etc.) Japanese. The other two clusters might be southern/western (and presumably more Japanese-influenced) Ainu vs. northern/eastern (and presumably more Tungus-, Nivkh-, and/or Chukotko-Kamchatkan-influenced) Ainu. Another interpretation might take the other "two [genetic clusters] per language" to be northern Ryukyuan (Okinawa proper and closely associated islands) and southern Ryukyuan (Sakishima) as two distinguishable genetic clusters comprised of people who speak dialects of the Ryukyuan language, but it would be awkward in my opinion not to associate Amami with the other two Ryukyuan subgroups. (The languages/dialects of Amami are generally considered to form a clade with Okinawan as opposed to Sakishima, and Amami-Okinawan is considered to form a clade with Sakishima as opposed to Japanese proper. Thus, it would be quite strange to group Okinawan and Sakishima together as one language and then mention the people of Amami as a separate group. Therefore, I think it is most likely that the authors have studied samples of Japanese proper, Ainu, Amami, and Korean people, and that they have distinguished two clusters each among their samples of Japanese proper and their samples of Ainu.)

andrew said...

Thanks for catching the typo. Fixed. Fat fingers, no copy editors, it happens and I try to fix them when I can.

The two languages I intended to refer to were "mainstream" Japanese and the Ryukyuan language.

The intent was to say that the four clusters consisted of one associated with Japanese, one associated with Ryukyuan, a Korean cluster (without regard to language) and an Amami cluster (without regard to language). The wording I used was apparently ambiguous. I suspect that almost all of the Korean cluster members are fluent in Japanese, and I suspect that there would be people in the Amami cluster who did not have that language as their primary one (probably generational splits). AFAIK, their study must have lacked an Ainu sample as I was expecting that to be a clear genetic cluster and it was absent from their list of clusters.

I'm also surprised that there aren't at least two genetic clusters of Japanese speakers: one in the South of the main Japanese islands that was part of Japan politically at an earlier date, and one in the North which is Ainu-shifted, as has been shown in other studies I've seen and blogged about. Further, I am surprised that the Ryukyuan and Amami clusters are distinct enough from each other to cluster separately.

If you click on the link and enter the publication number in the search area, you can read the abstract for this poster presentation yourself.

Ebizur said...

"Methods A total of 14,539 study subjects from the 12 areas of the J-MICC study including the Ishigaki and Amami Islands were genotyped at RIKEN Center for Integrative Medicine using a HumanOmniExpressExome-8 v1.2 BeadChip array."

Ishigaki is part of the Sakishima island group. The Sakishima islands are located in the sea between Okinawa proper and Taiwan, about halfway between those larger islands in the case of Miyako and somewhat closer to Taiwan in the case of Ishigaki, Iriomote, and especially Yonaguni.

Amami is both geographically and linguistically closer to Okinawa proper than either of those is to Ishigaki. Therefore, I would expect samples from Okinawa proper to cluster with the samples from Amami rather than the samples from Ishigaki.

If Ishigaki and Amami are counted as two of "the 12 areas of the J-MICC study," what are the other ten areas?

The authors have stated in the abstract that 0.3% of the J-MICC subjects have been assigned to a "Korean" cluster. People of Korean nationality currently account for about 0.38% of the population of native-born people in Japan. If one includes all individuals of primarily Korean ancestry who are in Japan for whatever reason regardless of birthplace, the figure should be closer to 0.7% of all people in Japan at present. That does not include recently admixed people of e.g. half Japanese and half Korean ancestry, of whom there are also quite a few. Anyway, the percentage of this group's samples that have been assigned to the "Korean" cluster (0.3%) is somewhat lower than I would expect. I wonder whether they have limited their sampling to people who possess Japanese nationality/citizenship, excluding individuals who have retained Korean nationality.

Anyway, I see now that you have used the expression "four genetic clusters, two per language" to mean the four genetic clusters of Japanese proper, "Ryukyuan" (just Ishigaki, or Ishigaki plus people from some other island?), Amami, and Korean and the two languages of Japanese proper (spoken by the Japanese proper and by the Koreans in Japan) and Ryukyuan (spoken by the people of Ishigaki (+ x?) and Amami). It was quite confusing when you had summarized the abstract with just one sentence because Koreans would normally not be associated with the Japanese language outside the peculiar context of this study, in which some individuals of recent Korean extraction who are living in Japan and speaking the Japanese language appear to have been sampled.

andrew said...

"Anyway, I see now that you have used the expression "four genetic clusters, two per language" to mean the four genetic clusters of Japanese proper, "Ryukyuan" (just Ishigaki, or Ishigaki plus people from some other island?), Amami, and Korean and the two languages of Japanese proper (spoken by the Japanese proper and by the Koreans in Japan) and Ryukyuan (spoken by the people of Ishigaki (+ x?) and Amami)."

I agree that it was confusing. But there were just four clusters, not six: Japanese, Ryukyuan, Korean and Amami. When I said "two per language" I meant "two associated with particular languages".