A new paper in PLoS Genetics, "Ancestral Components of Admixed Genomes in a Mexican Cohort," sheds some light on issues which we were already familiar with through conventional history. What we already know: Mexicans and people of Mexican descent predominantly derive from one or more admixture events between Europeans and Amerindians, with a minor African component. The last is often a surprise to Mexicans themselves, but it is no surprise to those who are aware of the nature of Spanish colonialism in the New World. In some cases, such as in Cuba, the African slave economy which we're familiar with from the United States was the norm, but in many instances African slaves accompanied Spaniards as secondaries in their conquest of the indigenous populations. New Spain was a caste society with a Spaniard and Creole elite, and a productive base of indios from whom they extracted rents. But Africans served as junior partners to the European elites, and were a substantial demographic presence down to the 19th century. Their near total genetic absorption, though, seems to have resulted in their near elimination from the cultural folk memory of Mexico.

Most of the techniques in the paper should be somewhat familiar to you. In particular, there's a lot of PCA, as well as some model-based clustering methods. The PCA takes all the genetic variation in the data set and reduces it down to a few large independent dimensions which you can visualize on a two-dimensional plot (e.g., PC 1 vs. PC 2 represents the largest explanatory dimension vs. the second largest). It turns out that most of the largest dimensions of variation are pretty well explained by our intuitions of genetic distance.

The model-based approaches are different. Instead of letting the algorithm generate the clusters hypothesis free (i.e., you put labels on the clusters after the fact), you specify a number of populations, K, and the method forces the data you input to fit that parameter. In other words, it's kind of like a sausage.
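To make the PCA step described above a bit more concrete, here is a minimal sketch in Python. It is not the paper's pipeline: the genotypes are simulated, the two source populations "A" and "B" and the 50/50 "mix" group are hypothetical, and the admixed individuals are drawn from averaged allele frequencies, which ignores the segment structure that real admixed genomes have. The function names (`simulate_genotypes`, `pca`) are made up for illustration.

```python
import numpy as np

def simulate_genotypes(rng, freqs, n_individuals):
    """Draw 0/1/2 allele counts per individual per SNP (toy model, no linkage)."""
    return rng.binomial(2, freqs, size=(n_individuals, freqs.size))

def pca(genotypes, n_components=2):
    """Project individuals onto the top principal components via SVD."""
    # Center each SNP column so the components describe variation around the mean.
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    # Share of total variance carried by each component.
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

rng = np.random.default_rng(0)
n_snps = 500

# Assumption: two hypothetical source populations with unrelated allele frequencies.
freq_a = rng.uniform(0.05, 0.95, n_snps)
freq_b = rng.uniform(0.05, 0.95, n_snps)
# Assumption: a crude stand-in for a 50/50 admixed group.
freq_mix = 0.5 * freq_a + 0.5 * freq_b

geno = np.vstack([
    simulate_genotypes(rng, freq_a, 30),
    simulate_genotypes(rng, freq_b, 30),
    simulate_genotypes(rng, freq_mix, 30),
])
labels = np.array(["A"] * 30 + ["B"] * 30 + ["mix"] * 30)

scores, explained = pca(geno)
pc1_means = {lab: scores[labels == lab, 0].mean() for lab in ("A", "B", "mix")}

for lab, mean in pc1_means.items():
    print(f"{lab}: mean PC 1 = {mean:.2f}")
print("variance explained by PC 1, PC 2:", np.round(explained, 3))
```

In a toy like this, PC 1 should pull the two source groups apart, with the mixed group landing somewhere in between, which is the same qualitative picture you get when admixed individuals are plotted against their parental populations.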
Sometimes the fit is good, and sometimes not so good (if you try to divide Swedes into 20 distinct populations, the algorithm will try to comply, but it should really tell you that you're being crazy). But another way to go is to look at the structure of the genome itself, with methods which focus on correlations across the chromosomes. While PCA and model-based methods can give you an intuition as to the average admixture of an individual, more fine-grained genomic methods which assign ancestry to segments across an individual's genotype yield more information. To get a better sense, here are two graphics generated from 23andMe's Ancestry Painting.