Over the past few weeks I've been looking closely at the skin color-related genes in humans that have been studied in recent years. A little over two years ago the evolutionary biologist Armand Leroi wrote:
We don't know what the differences are between white skin and black skin, European skin versus African skin. What I mean is we don't know what the genetic basis of that is. This is actually amazing. I mean, here's a trait, trivial as it may be, about which wars have been fought, which is one of the great fault lines in society, around which people construct their identities as nothing else. And yet we haven't the foggiest idea what the genetic basis of this is. It's amazing. Why is that?
Armand wrote that in the spring of 2005. In December of that year a paper was published in Science, "SLC24A5, a Putative Cation Exchanger, Affects Pigmentation in Zebrafish and Humans." The authors concluded that:
Based on the average pigmentation difference between European-Americans and African-Americans of about 30 melanin units, our results suggest that SLC24A5 explains between 25 and 38% of the European-African difference in skin melanin index.
The SLC24A5 polymorphism is nearly disjoint between Africans and Europeans. At one position in this gene where almost all Europeans have an adenine (A), almost all Africans have a guanine (G). For African Americans the A/G ratio is 0.19 to 0.81 according to one survey. For Mexican Americans it is 0.50 to 0.50. As it happens, these ratios are very close to what you would predict from the known admixture between the ancestral groups of these populations. While Europeans are fixed for the A variant, Africans and East Asians are fixed for the G variant. The ratio for Mexican Americans makes sense if one assumes that their Amerindian ancestors carried the G variant while their Iberian ancestors carried the A, just as the European minority of African American ancestry would pass on the A variant.
The A variant is derived, while the G is ancestral. That is, a mutation from G → A occurred at some point in the past, and A then increased in frequency in a subset of the world's populations. When did the increase in frequency occur? It seems to have been fairly recent. Voight et al. picked up a signature of selection around SLC24A5, even though the variant is nearly fixed in Europeans and so is not an ideal target for their methods (which tend to be best at detecting partial sweeps). This would imply a time scale of less than 10,000 years. Another researcher has reported that the SNPs around SLC24A5 imply a selective event as recent as 5,800 years before the present!
I've heard from people looking at the frequencies in populations other than Europeans that a) the derived variant is extant at high frequencies across North Africa and Western Asia and b) there is a signature of selection. There is of course one population which I've already discussed in regard to SLC24A5, and that's South Asians. It seems that about 1/3 of the variation in skin color within this group can be explained by polymorphism at this gene; that's around the same range as the between-group difference for Africans and Europeans.
That makes sense, since South Asians range from brunette white to nearly black in complexion, with a modal value of medium brown. The A/G ratio for South Asians was 0.820/0.180 in one survey, which included Tamils, Gujaratis, and Telugus. Another study gave A/G ratios of 0.50/0.50 and 0.293/0.707 for Sinhalese and Tamils from Sri Lanka, respectively. Finally, the South Asian genome association study reported A/G ratios of 0.90/0.10 for the lightest quintile and 0.51/0.49 for the darkest. That population was pooled across Punjabis, Gujaratis, Bengalis and Sri Lankans.
The figures for South Asia have me scratching my head a little. It seems pretty clear that SLC24A5 has an association with skin color; it works across Africans and Europeans and it works within South Asians. Nevertheless, why exactly does the derived variant exist at such high frequencies as far south as Sri Lanka? And why didn't it make it to East Asia? Are the South Asian frequencies the result of a selective sweep which started to the north, or is this an endogenous allele which fixed in Europe later? (One would need to look at the markers around the South Asian loss of function variant.) It does have a northwest to southeast gradient in South Asia; that seems pretty evident. Here are the numbers for the lightest and darkest quintiles by region:
Region      Darkest 20%   Lightest 20%
Northwest   75            247
East        235           100
South       75            2
This seems to comport well with what we would expect phenotypically. So why did the derived variant of SLC24A5 rise in frequency across Southern Asia over the past 10,000 years? I have been told that there are signatures of recent selection among some West Asian populations for this gene; I don't assume that selection would be more ancient to the south. Could it be changes in diet? Could it be wearing clothes? Could it be infection? I really don't know.
Note: One of the studies above implies that the high frequency of the A variant in Sri Lanka is subject to purifying selection. That is, its frequency is decreasing over time. I doubt that; the implication is that South Asians moved down from the north, but the preponderance of data implies that the populations extant within the Indian subcontinent were there 10,000 years ago. In all likelihood, that would be before the derived variant of SLC24A5 started increasing in frequency.
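As an aside, the admixture reasoning used earlier for African Americans and Mexican Americans can be made concrete with a little arithmetic. The sketch below is only an illustration, under the simplifying assumption taken from the text that the A variant is fixed (frequency 1.0) in the European source population and absent (frequency 0.0) in the African and Amerindian sources. The function names and the 20% ancestry figure in the forward example are hypothetical and chosen for illustration; they do not come from any of the surveys cited above.

```python
def expected_allele_freq(ancestry, source_freq):
    """Expected allele frequency in an admixed population.

    ancestry:    dict mapping source population -> ancestry fraction (sums to 1)
    source_freq: dict mapping source population -> allele frequency in that source
    The result is the ancestry-weighted average of the source frequencies.
    """
    total = sum(ancestry.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("ancestry fractions must sum to 1")
    return sum(frac * source_freq[pop] for pop, frac in ancestry.items())


def implied_ancestry_fraction(observed, freq_source_1, freq_source_2):
    """Invert the two-source mixture: fraction of ancestry from source 1
    needed to produce the observed allele frequency."""
    if freq_source_1 == freq_source_2:
        raise ValueError("sources must differ in allele frequency")
    return (observed - freq_source_2) / (freq_source_1 - freq_source_2)


# Simplifying assumption from the text: A is fixed in Europeans, absent elsewhere.
A_FREQ = {"european": 1.0, "african": 0.0, "amerindian": 0.0}

# Forward direction, with a hypothetical 20% European ancestry fraction.
hypothetical_mix = {"european": 0.20, "african": 0.80}
print(expected_allele_freq(hypothetical_mix, A_FREQ))

# Reverse direction, using the A frequencies quoted in the post.
print(implied_ancestry_fraction(0.19, A_FREQ["european"], A_FREQ["african"]))
print(implied_ancestry_fraction(0.50, A_FREQ["european"], A_FREQ["amerindian"]))
```

Because the two source frequencies are taken to be 1 and 0, the A frequency in the admixed group is simply the European ancestry fraction under this model, which is why the observed ratios can be read more or less directly as admixture estimates. If the source populations are not truly fixed, the estimate shifts accordingly.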