I’ve got a new post up over at The Molecular Ecologist, discussing a new paper that tries to take a quantitative approach to a pattern that keeps turning up in human population genomic datasets: genetic data that mirrors the geography of the places where it was collected.
It’s something of a classic result in human population genomics: Go out and genotype thousands of people at thousands of genetic markers. (This is getting easier to do every day.) Then summarize the genetic variation at your thousands of markers using Principal Components Analysis, which is a method for transforming that genetic data set into values on several statistically independent “PC axes.” Plot the transformed summary values for each of your thousands of samples on the first two such PC axes, and you’ll probably see that the scatterplot looks strikingly like the map of the places where you collected the samples.
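For the curious, here is a minimal sketch of that recipe in Python with NumPy. Everything in it is a toy assumption of mine, not the paper's method or data: the sampling locations are random points, the allele frequencies follow made-up linear gradients across the map, and `genotype_pca` is a hypothetical helper name.

```python
import numpy as np

def genotype_pca(genotypes, n_axes=2):
    """Project samples (rows) onto the first n_axes principal components."""
    # Center each marker (column) on its mean before decomposing
    centered = genotypes - genotypes.mean(axis=0)
    # SVD of the centered matrix; the columns of u * s are the PC scores
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_axes] * s[:n_axes]

rng = np.random.default_rng(42)
n_samples, n_markers = 200, 1000

# Hypothetical sampling locations on a unit-square "map"
coords = rng.uniform(0, 1, size=(n_samples, 2))

# Toy assumption: each marker's allele frequency is a linear gradient over the map
slopes = rng.normal(0, 0.3, size=(2, n_markers))
freqs = np.clip(0.5 + (coords - 0.5) @ slopes, 0.05, 0.95)

# Diploid genotypes: 0, 1, or 2 copies of an allele per sample per marker
genotypes = rng.binomial(2, freqs).astype(float)

# One row per sample, one column per PC axis
pcs = genotype_pca(genotypes)
```

Plotting `pcs[:, 0]` against `pcs[:, 1]` next to a plot of `coords` is the comparison described above. Keep in mind that PC axes come out with arbitrary sign and rotation, so the scatterplot may need flipping or turning before any resemblance to the map shows up.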
Of course “looks strikingly like” is not a very quantitative statement. To see how the new study deals with that problem, go read the whole thing. And yes, I manage to shoehorn in a reference to the Muppets.◼