The Molecular Ecologist: Genes … in … space!

(A) Geography, and (B) genetics. Figure 2 from Wang et al. (2012).

I’ve got a new post up over at The Molecular Ecologist, discussing a new paper that tries to take a quantitative approach to a phenomenon that keeps turning up in human population genomic datasets, in which genetic data mirrors the geography of the places where it was collected.

It’s something of a classic result in human population genomics: Go out and genotype thousands of people at thousands of genetic markers. (This is getting easier to do every day.) Then summarize the genetic variation at your thousands of markers using Principal Components Analysis, which is a method for transforming that genetic data set into values on several statistically independent “PC axes.” Plot the transformed summary values for each of your thousands of samples on the first two such PC axes, and you’ll probably see that the scatterplot looks strikingly like the map of the places where you collected the samples.

Of course “looks strikingly like” is not a very quantitative statement. To see how the new study deals with that problem, go read the whole thing. And yes, I manage to shoehorn in a reference to the Muppets.◼

The Molecular Ecologist: Isolating isolation by distance

Linanthus parryae. Photo by naomi_bot.

And now I present my first “real” post as a contributor at the Molecular Ecologist, a discussion of a new review article pointing out that population geneticists aren’t doing a great job dealing with one of the best-known patterns in population genetics, isolation by distance, or IBD. You may recall that I discussed IBD in a more historical context way back in the day on this very website. It’s simply a pattern in which populations located close to each other are more genetically similar than populations farther away from each other, which arises because most critters (or their seeds, or larvae, or pollen) are less likely to move longer distances. But IBD can be conflated with a number of other patterns population geneticists often try to detect:

So let’s say you’ve collected genetic data from sites on either side of a line you think might be biologically significant—a pretty standard-issue population genetics study. You run your data through Structure, and find two clusters of collection sites that line up pretty well with that Line of Hypothesized Biological Significance. As a followup, you conduct an AMOVA with the collection sites grouped according to their placement by Structure, and you find that the clusters explain a significant fraction of the total genetic variation in your data set. Therefore, you conclude that the LHBS is, in fact, a significant barrier to dispersal.

Except that as we’ve just discussed, everything you’ve just found could be a consequence of simple IBD plus the fact that you’ve structured your sampling so that your LHBS happens to bisect the landscape you’re studying. And just to add to the frustration, even if you’d started out by testing for IBD before you started with all of the tests for population structure, a significant result in a Mantel test for IBD wouldn’t necessarily mean that population structure wasn’t there.
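For readers who haven’t run one, here’s a minimal sketch of the Mantel test mentioned above, using only the Python standard library. It isn’t the approach recommended in the review article, and the function names and the toy distance matrices are hypothetical placeholders of my own. The idea is to correlate pairwise geographic distances with pairwise genetic distances, then ask how often shuffling the site labels produces a correlation at least as strong.

```python
import random
from math import sqrt


def upper_triangle(matrix, order):
    """Pairwise values above the diagonal, with sites taken in the given order."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_test(geo, gen, permutations=999, seed=1):
    """Return (r, p): the matrix correlation and a one-sided permutation p-value."""
    sites = list(range(len(geo)))
    geo_vals = upper_triangle(geo, sites)
    observed = pearson(geo_vals, upper_triangle(gen, sites))
    rng = random.Random(seed)
    as_extreme = 1  # count the observed arrangement itself
    for _ in range(permutations):
        shuffled = sites[:]
        rng.shuffle(shuffled)  # relabel sites, permuting rows and columns together
        if pearson(geo_vals, upper_triangle(gen, shuffled)) >= observed:
            as_extreme += 1
    return observed, as_extreme / (permutations + 1)


# Hypothetical example: five sites along a transect, with genetic distance
# increasing in step with geographic distance (a pure IBD pattern).
positions = [0.0, 1.0, 2.5, 4.0, 7.0]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.02 * abs(a - b) for b in positions] for a in positions]

r, p = mantel_test(geo, gen)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Note that a strong, significant correlation here says only that the data are consistent with IBD; as the excerpt points out, it can’t rule out population structure on top of it.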

To find out how the author of the new review article suggests we deal with the complications outlined above, go read the whole thing.◼

Is corn the new milk? Evolutionarily speaking, that is.

Corn. (Flickr: srqpix)

It is a widespread misconception that, as we developed the technology to reshape our environment to our preferences, human beings neutralized the power of natural selection. Quite the opposite is true: some of the best-known examples of recent evolutionary change in humans are attributable to technology. People who colonized high-altitude environments were selected for tolerance of low-oxygen conditions in the high Himalayas and Andes; populations that have historically raised cattle for milk evolved the ability to digest milk sugars as adults.

A recent study of population genetics in Native American groups suggests that another example is ripening in the experimental fields just a few blocks away from my office at the University of Minnesota: Corn, or maize, may have exerted natural selection on the human populations that first cultivated it.

The target of this new study is an allele called 230Cys, a variant of a gene involved in transporting cholesterol. 230Cys is known only in Native American populations, and it’s associated with abnormally low production of HDL cholesterol (that’s the “good” kind of cholesterol) and thereby increased risk for obesity, diabetes, and heart disease. In Native American populations, the genetic code near 230Cys shows the reduced diversity associated with a selective sweep, which suggests that, although it’s not particularly helpful now, this variant may have been favored by selection in the past.

One of the biggest dietary changes in the history of Native American humans was the domestication of corn, which provided a staple crop to support settlements across North and South America long before Europeans arrived. However, a staple crop is something of a double-edged sword: it can provide a more predictable food source than hunting and gathering—but if the crop fails, it means famine. It’s been proposed that the 230Cys variant makes people who carry it better at storing food as fat, which might come in handy for ancient farmers who had to weather bad harvests every few years.

Corn on display at the 2011 Minnesota State Fair. (Flickr: jby)

So the new study looks for an association between frequencies of 230Cys and corn-based agriculture in Native populations from Central and South America. The study’s authors—a big international team from universities in Brazil, Argentina, Mexico, Chile, Costa Rica, France, and Great Britain—first show that there’s a strong correlation between the frequency of 230Cys in Native populations and the length of time that domestic corn has been grown by those populations, as determined by the radiocarbon date of maize pollen found in archeological sites. That is, 230Cys is more common in Native populations that have a longer history of growing corn.

The team also used genetic data from the vicinity of 230Cys to estimate the age of the allele, and found that it probably originated between 19,000 and 7,000 years ago—which is to say, all the copies of 230Cys in the population genetic sample are descended from a single mutation that occurred after humans colonized the Americas. The lower age estimate is also pretty close to how long ago native populations are thought to have first begun farming maize.

That data makes a pretty good case for 230Cys having arisen as an adaptation to the diet created by Native American corn-based agriculture. But it’s not the whole story, by a long shot. Although 230Cys is strongly associated with metabolic disease in today’s modern, mostly famine-free, lifestyle, it only explains about four percent of variation in blood cholesterol levels. Moreover, it’s not clear to me that agriculture based on maize should be more prone to famine than agriculture based on wheat or rice—so why didn’t European and Asian populations evolve their own versions of 230Cys? It seems much more probable that there are a lot of other genes involved in determining how human bodies respond to modern-day feasting or prehistoric famine.

And, in fact, a 2010 study of world-wide human population genetics found evidence of selection associated with both climate and diet type across the genome. That study found genetic markers with strong associations to climate and diet in close proximity to genes connected to blood glucose levels, diabetes risk, cancer risk, and, yes, blood cholesterol levels. The climate and dietary categories examined in that study are very broad, however, so it’s hard to know what, specifically, helped create the natural selection suggested by the observed associations between gene variants and environments.

Corn and 230Cys may be the most recently described specific case of recent human evolution in response to agricultural technology—but we can expect to find a lot more stories like this one as we dig deeper into human population genetics.◼

References

Acuña-Alonzo, V., T. Flores-Dorantes, J. K. Kruit, T. Villarreal-Molina, O. Arellano-Campos, T. Hünemeier, A. Moreno-Estrada, M. G. Ortiz-López, H. Villamil-Ramírez, P. León-Mimila, et al. (2010). A functional ABCA1 gene variant is associated with low HDL-cholesterol levels and shows evidence of positive selection in Native Americans. Human Molecular Genetics, 19, 2877-85 DOI: 10.1093/hmg/ddq173

Hancock, A. M., D. B. Witonsky, E. Ehler, G. Alkorta-Aranburu, C. Beall, A. Gebremedhin, R. Sukernik, G. Utermann, J. Pritchard, & G. Coop (2010). Human adaptations to diet, subsistence, and ecoregion are due to subtle shifts in allele frequency. Proc. Nat. Acad. Sci. USA, 107, 8924-8930 DOI: 10.1073/pnas.0914625107

Hünemeier, T., C. E. G. Amorim, S. Azevedo, V. Contini, V. Acuña-Alonzo, F. Rothhammer, J.-M. Dugoujon, S. Mazières, R. Barrantes, M. T. Villarreal-Molina, et al. (2012). Evolutionary responses to a constructed niche: Ancient Mesoamericans as a model of gene-culture coevolution. PLoS ONE, 7 DOI: 10.1371/journal.pone.0038862

They also serve: Adaptation from standing variation

Standing and waiting. Photo by Image Zen.

Ever since Charles Darwin and Alfred Russel Wallace first described the workings of natural selection, one popular way to think about selective change has gone something like this: A population of critters is well-adapted to its environment until that environment changes—maybe the critters move to a new climate, maybe the climate changes on them, maybe some new competitors or predators move in. Life gets harder for our critters, until one of them is born … different. That lucky mutant has a never-before-seen trait that lets it cope in the new conditions, and in a few generations, every critter in the population is a descendant of that original mutant.

That narrative isn’t wrong. But it does miss one of the key insights that led to the discovery of natural selection—natural populations are variable.

That population of critters encountering new conditions of life may very well not need to wait around for the lucky mutant before it can begin adapting to new conditions. Mutations happen at random, and continuously—and, particularly if they don’t leave the mutant much less fit, can hang around in a population for generations. And this “standing” variation is raw material waiting for natural selection to act.

High-octane fuel for adaptation

There’s good reason to think that natural selection is more efficient when it has standing variation to work with. Joachim Hermisson and Pleuni Pennings demonstrated this principle rather neatly in a 2005 theory paper, in which they modeled the fate of new genetic mutations that had a weak negative effect when they first appeared in a population, but then became beneficial after the population’s environment changed.

Normally, when a new mutation appears in a population, it’s likely to be lost almost immediately to the random effects of genetic drift, even if it confers a benefit. This means that a new mutation needs to be quite strongly favored by selection to have a high probability of “fixing,” or spreading through an entire population.

However, under Hermisson and Pennings’s model, the mutations considered are only those that survive the initial effects of drift. The flip-side of the randomness that can make a weakly beneficial mutation disappear can also help a weakly deleterious mutation spread, achieving an equilibrium between drift, selection, and new mutation events that create new copies of the same variant to replace the ones lost to selection or drift. So, when conditions changed, and the mutation became even weakly beneficial, it was ready to start spreading.

Natural selection is more effective when it works with standing variants. Figure 1 from Hermisson and Pennings (2005).

This graph, the key figure from Hermisson and Pennings’s paper, shows the probability that a mutation will “fix,” or spread to dominate the population over the course of several generations, given the power of natural selection (alpha, the term on the horizontal axis). The dotted line tracks the probability of fixation for a brand-new mutation; the solid line tracks probability of fixation for a mutation that existed before selection began to act, and had achieved mutation-selection-drift equilibrium. No matter how strong selection is, the pre-existing mutation is more likely to “fix” than the new mutation—and that difference is most pronounced when selection favoring the mutation is weakest.

In other words, if mutations provide the variation that fuels evolution by natural selection, standing variation is fuel with a substantially higher octane rating.
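To get a rough feel for why starting frequency matters so much, here’s a small sketch. It does not reproduce Hermisson and Pennings’s model. Instead it swaps in Kimura’s classic diffusion approximation for the fixation probability of a variant under additive selection, and simply assumes a starting frequency for the standing variant. The population size, frequencies, and selection coefficients are all made-up values for illustration.

```python
from math import exp


def fixation_probability(p, n, s):
    """Kimura's diffusion approximation for the chance that a variant at
    frequency p fixes in a diploid population of size n, with additive
    selection coefficient s (hypothetical parameter names)."""
    if s == 0:
        return p  # neutral case: fixation probability equals frequency
    return (1 - exp(-4 * n * s * p)) / (1 - exp(-4 * n * s))


n = 1000                      # assumed population size
new_mutation = 1 / (2 * n)    # a single new copy among 2N gene copies
standing = 0.05               # assumed frequency of a pre-existing variant

for s in (0.001, 0.01, 0.1):
    p_new = fixation_probability(new_mutation, n, s)
    p_standing = fixation_probability(standing, n, s)
    print(f"s = {s}: new {p_new:.4f}, standing {p_standing:.4f}")
```

Under these assumptions the standing variant is always more likely to fix than the single new copy, and its relative advantage is largest when selection is weak, which is at least in the spirit of the figure above.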

Harder to spot

But the same features that make adaptation from standing variation so much more efficient also act as a sort of population genetic stealthing. This is because adaptation from standing variation has very different effects on the genetics of an adapting population than the spread of a single new mutation.

The key to this difference is that gene variants, or alleles, aren’t transmitted from one generation to another one at a time. Instead, they come as part of chromosome regions, physically linked to genetic code that may have nothing to do with the function of the focal gene. And population geneticists use that fact to zero in on genetic regions that might have been recently affected by selection.

It’s a little bit like buying LEGO bricks—or, at least, how it used to be when I was still buying a lot of LEGOs, back before you could custom-build your own sets online. Say you want a hundred copies of a particularly special type of LEGO brick, one that’s only available in a single kit. To get those hundred bricks, you need to buy a hundred copies of that one kit. So you end up with a selection of bricks—the ones you wanted, and the ones that came with the ones you wanted—that probably doesn’t have a very wide diversity of brick types.

But suppose you want a hundred copies of a more common LEGO brick, one that’s included in dozens of different kits—kits for pirate ships and castles, race cars and railroads. You might still need to buy a hundred kits, but you can buy many different kinds of kits, and so in addition to the hundred copies of the brick you want, you also have bricks to build anything from a starship to a dragon.

LEGOs, evolving. Photo by Kaptain Kobold.

Selection on a single beneficial mutation is like that first LEGO shopping case, where there’s only one kit containing the brick you want. The one lucky mutation exists with only one “genetic background” of other, associated genetic code, and so when the mutation spreads through the population, a chunk of that background code spreads with it. (At least, until recombination can separate the favored mutation from its background; that takes time, sometimes a lot of time.)

Just as purchasing a hundred copies of the same LEGO kit would leave an obvious mark on the makeup of your brick collection, a selective sweep that starts with a single mutation—what’s called a “hard sweep”—results in a region of genetic code with noticeably lower variation across the population, because everyone is carrying the original lucky mutation plus its associated background.

Figure 4 from Linnen et al. (2009) demonstrates the reduced diversity in a gene region associated with fur color in deer mice.

In practice, biologists use this principle in two major ways. First, if a biologist has a particular gene in mind that might have recently experienced selection, she can collect DNA sequence data in the vicinity of that gene for many individuals in a population, and see whether it’s less diverse than it ought to be. This is how Catherine Linnen and her collaborators demonstrated that a population of deer mice living on light-colored soils in the Sand Hills of Nebraska had experienced natural selection for lighter color. In a study [PDF] I’ve discussed previously, the team identified a genetic region that was associated with coat color in the mice, then collected sequence data from that region in mice collected from the light-soil population. Compared to the same genetic region in mice from nearby sites with dark soil, the light-soil mice had markedly less variation in the coat-color region.

Alternatively, biologists who don’t know which genes might have been targeted by natural selection can collect sequence data from a whole lot of gene regions—or even “scan” the whole genome—and compare the diversity at each region. Any region that has lower diversity than most of the other sampled regions may have experienced selection recently, and is probably a good candidate for follow-up study.
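As a toy illustration of that kind of scan, here’s a minimal sketch that flags windows whose diversity is unusually low compared to the rest. The per-window diversity values are invented, and the z-score cutoff is just one simple rule I’ve assumed for the example; real genome scans use more careful statistics.

```python
from statistics import mean, stdev


def low_diversity_windows(diversity, z_cutoff=-2.0):
    """Return indices of windows whose diversity is unusually low relative
    to the other sampled windows (a simple z-score rule, assumed here)."""
    mu, sd = mean(diversity), stdev(diversity)
    return [i for i, d in enumerate(diversity) if (d - mu) / sd <= z_cutoff]


# Hypothetical per-window nucleotide diversity values along a chromosome;
# window 6 is made up to look like the trough left by a hard sweep.
window_diversity = [0.0101, 0.0096, 0.0110, 0.0099, 0.0105, 0.0093,
                    0.0012, 0.0098, 0.0107, 0.0102, 0.0095, 0.0104]

candidates = low_diversity_windows(window_diversity)
print("candidate windows:", candidates)
```

A flagged window is only a candidate for follow-up; low diversity can have causes other than selection.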

But selection from standing variation doesn’t leave such a clear mark on the genome. It’s more like that second LEGO shopping spree, for a brick found in many different kits. If a useful variant is located on many different genetic backgrounds, then selection can make the variant more common in the population without necessarily reducing the diversity of gene regions near the focal variant. This is called a “soft sweep.” Soft sweeps present a problem for those of us who want to find genes that have recently been affected by natural selection—without the loss of diversity, genetic regions that have undergone soft sweeps may not stand out in the genome as a whole.
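Here’s a small sketch of that contrast, using made-up haplotypes rather than real data. It computes the average number of differences between pairs of sampled haplotypes, a simple measure of diversity, for a sample in which the favored variant sits on one genetic background and a sample in which it sits on several. The haplotype strings and the function name are hypothetical.

```python
from itertools import combinations


def mean_pairwise_differences(haplotypes):
    """Average number of sites at which two sampled haplotypes differ."""
    pairs = list(combinations(haplotypes, 2))
    diffs = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return diffs / len(pairs)


# Hypothetical samples of six haplotypes each; the favored variant is the
# "1" in the middle column, and the flanking columns are linked sites.
hard_sweep = ["00100"] * 6                       # one background everywhere
soft_sweep = ["00100", "10100", "01101",
              "11110", "00111", "10101"]         # same variant, many backgrounds

print("hard sweep:", mean_pairwise_differences(hard_sweep))
print("soft sweep:", mean_pairwise_differences(soft_sweep))
```

Every haplotype in both samples carries the favored variant, but only the hard-sweep sample has lost its diversity at the linked sites.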

Searching for soft sweeps

As we collect and analyze more genome-scale population genetic datasets, biologists are coming around to the idea that easy-to-detect hard sweeps may be the exception [$a], rather than the rule, for evolution in natural populations—in no small part because the evidence of hard sweeps just isn’t there [PDF].

But the absence of hard sweeps doesn’t mean that soft sweeps are going on all over the place instead. For instance, in an (ongoing) analysis I presented [PDF] at the recent Evolution meetings in Ottawa, I examined patterns of diversity in genetic regions close to genetic markers that are very strongly associated with differing climate conditions in the small but awesome wildflower Medicago truncatula—and I found little evidence of recent hard sweeps. Does that mean all those strongly associated gene variants are strongly associated as a result of adaptation from standing variation? Maybe; but some portion of the associations could also be due to population genetic processes like drift and isolation-by-distance—I’m still thinking about ways to kill the soft sweep hypothesis.

Pennings and Hermisson followed up their original theory paper with a study comparing the power of several different statistical tests to detect soft sweeps, and they found some promising results with an approach based on linkage between genetic variants in the vicinity of a favored variant. More recently, Pennings has approached the question of adaptation from standing variation from a somewhat different angle, by studying selective sweeps in human immunodeficiency virus, HIV. The evolution of HIV after it infects a patient, and as it adapts to antiviral drugs, is quite well understood—to the point that virologists know to expect particular mutations to sweep the viral population within a patient who starts taking a particular drug.

In an analysis recently published in PLoS Computational Biology, Pennings found that the virus’s evolution of drug resistance could be based on standing variation in about 6% of patients on a standard anti-viral drug cocktail—which is to say, about 6% of all patients carry viral populations that are primed to evolve drug resistance the moment therapy begins. (Pennings’s lab website has a good explanation of the clinical implications of this result, with video, even.)

Then, at the Ottawa Evolution meetings, Pennings presented [PDF] an examination of HIV genetic samples taken from multiple patients undergoing antiviral treatment. She identified cases when the virus’s adaptation to the drugs was fueled by standing variation or based on a mutation that occurred after the drug treatment started; one resistance mutation evolved to fixation via a soft sweep in eight out of 23 patients. [Correction, 6 Aug 2012: See Pennings’s comment below for a correction on this point; it’s not known whether this particular soft sweep started from standing variation, or whether it’s simply the case that two different mutations with the same effect managed to sweep the population together.]

If evolutionary biologists want to understand how natural selection helped make the living world we see around us today, it looks like we’re going to have to learn to love soft sweeps. We’re still learning how to differentiate the aftermath of soft sweeps from the results of other, non-selective processes. But fortunately, we live in an era when the genome-scale data that may let us untangle this question are increasingly easy to collect.◼

I started working on this post quite a while before the Ottawa Evolution meetings, when I was pleased to meet Pleuni Pennings for the first time. If there are mistakes in what I’ve written above, they’re my own; but I hope she’ll let me know if I’ve made any!

References

Flintoft, L. (2011). Human evolution: Sweep model is swept away. Nature Reviews Genetics, 12, 228-9 DOI: 10.1038/nrg2978

Hermisson, J., & Pennings, P.S. (2005). Soft sweeps: Molecular population genetics of adaptation from standing genetic variation. Genetics, 169 (4), 2335-52 DOI: 10.1534/genetics.104.036947

Hernandez, R. D., J. L. Kelley, E. Elyashiv, S. Melton, A. Auton, G. McVean, G. Sella, & M. Przeworski (2011). Classic selective sweeps were rare in recent human evolution. Science, 331, 920-4 DOI: 10.1126/science.1198878

Linnen, C. R., E. P. Kingsley, J. D. Jensen, & H. E. Hoekstra (2009). On the origin and spread of an adaptive allele in deer mice. Science, 325, 1095-8 DOI: 10.1126/science.1175826

Oleksyk, T. K., M. W. Smith, & S. J. O’Brien (2010). Genome-wide scans for footprints of natural selection. Phil. Trans. Royal Soc. B, 365, 185-205 DOI: 10.1098/rstb.2009.0219

Pennings, P.S. (2012). Standing genetic variation and the evolution of drug resistance in HIV. PLoS Computational Biology, 8 DOI: 10.1371/journal.pcbi.1002527

Pennings, P.S., & J. Hermisson (2006). Soft sweeps III: The signature of positive selection from recurrent mutation. PLoS Genetics, 2 DOI: 10.1371/journal.pgen.0020186.eor

Pritchard, J. K., & A. Di Rienzo (2010). Adaptation—not by sweeps alone. Nature Reviews Genetics, 11, 665-7 DOI: 10.1038/nrg2880

Nothing in Biology Makes Sense: Making sense of inbreeding depression

Bighorn sheep. Photo by Noah Reid, via Nothing in Biology Makes Sense.

This week at the collaborative blog Nothing in Biology Makes Sense!, Noah Reid returns to discuss the bane of small, isolated populations, inbreeding depression:

Iconic North American species such as grizzly bears, red-cockaded woodpeckers, and the American burying beetle today inhabit only small fractions of the ranges they occupied only 100 years ago. A result of this fragmentation is that many individuals exist in small, isolated populations. In these populations, a curious phenomenon often emerges, one that can only be understood in light of some basic evolutionary theory.

To find out more about that phenomenon, and when it can become hazardous to a population’s health, go read the whole thing. ◼

Science online, “Look out! Here comes the spider worm,” edition

Good news, everyone! We might finally know what’s killing honeybees. Photo by Max xx.
  • I’ll show you my effective population size if you show me yours. Have humans historically been polygamous? Population genetics tells all. (The Primate Diaries in Exile)
  • Spider worm, spider worm/Does whatever a spider worm does. Biologists have engineered spider genes involved in silk production into silkworms, which will spin much more silk than spiders do. (Wired Science)
  • Unintended consequences, anyone? Eradication of dingoes from parts of southern Australia turns out to have been bad for endangered prey species. (Laelaps; see also my discussion of dingoes and prey diversity)
  • It was a fungus. With a virus. In the, um, conservatory. New analysis of proteins collected from bees in dying colonies points to the cause of recent honeybee declines. (NY Times; original article on PLoS ONE)
  • There’s a horror movie here somewhere. Mosquitoes living in the London Underground may have evolved into a new species. (Thoughtomics)
  • Another one for the list. Evolution Since Darwin, a history of 150 years of biology, looks like a good read. (Dechronization)

And this week, from BBC Earth, prairie dog communication. (Which has nothing whatsoever to do with the fact that this week’s mammalogy lab covered rodents.)

Back to basics: The “Big Four”

The nice thing about a field season away from all regular internet access is that it gives you a real sabbatical of a sort—a chance to reassess plans and set new goals. One of the new goals I set myself this last field season was to introduce a new kind of topic here at Denim and Tweed.

Most of my writing about science at D&T focuses on recently published discoveries in evolution and ecology. It’s fun writing, and it coincides neatly with my regular journal reading, and I intend to keep doing it. But I’ve discovered that when I want to put new work in context, I often need to discuss fundamental concepts of evolutionary biology that aren’t necessarily common knowledge, such as genetic drift or sexual selection. However, I rarely have room to explain these concepts in depth within a blog post devoted to something else.

So maybe the solution is to devote some posts to explaining these “basics.” I’m going to start with a series of posts on the “Big Four” processes of population genetics. These are the four processes that account, in one way or another, for every change in the frequency of genes within natural populations. In other words, the Big Four account for much of evolution itself. They are:

  • Natural selection, or changes in gene frequencies due to fitness advantages, or disadvantages, associated with different genes;
  • Mutation, the source of new forms of genes;
  • Genetic drift, or changes in gene frequencies that arise from the way probability works in finite populations; and
  • Migration, or changes in gene frequencies due to the movement of organisms from site to site.
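To make the list above a bit more concrete, here’s a minimal sketch of how the four processes might each nudge the frequency of a single gene variant over one generation. It’s a deliberately simplified toy, not a model from any of the papers cited below: selection is treated as acting on gene copies directly, and every parameter name and value is a hypothetical choice of mine.

```python
import random


def next_generation(p, n, s=0.0, mu=0.0, m=0.0, p_migrants=0.0, rng=None):
    """One generation of change in the frequency p of a gene variant.

    All parameter names are hypothetical: n diploid individuals, selection
    coefficient s favoring the variant, mutation rate mu toward the
    variant, and a fraction m of the population replaced by migrants
    carrying the variant at frequency p_migrants.
    """
    # Natural selection: copies of the variant leave (1 + s) times as many copies.
    p = p * (1 + s) / (1 + s * p)
    # Mutation: other variants convert to the focal one at rate mu.
    p = p + mu * (1 - p)
    # Migration: mix in gene copies from another site.
    p = (1 - m) * p + m * p_migrants
    # Genetic drift: sample 2N gene copies at random for the next generation.
    if rng is not None:
        copies = sum(rng.random() < p for _ in range(2 * n))
        p = copies / (2 * n)
    return p


rng = random.Random(42)
p = 0.1
for generation in range(50):
    p = next_generation(p, n=100, s=0.05, mu=1e-5, m=0.01, p_migrants=0.0, rng=rng)
print(f"frequency after 50 generations: {p:.3f}")
```

Leaving out the random number generator turns drift off, which makes it easy to see what selection, mutation, and migration each do on their own.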

Lay readers may be surprised both by what we know, and what we don’t, about how these four processes operate in nature. Natural selection is relatively easy to measure, and apparently ubiquitous [PDF] in natural populations—but we don’t know how often the resulting short-term changes impact evolution over millions of years. Mutation, the source of variation on which natural selection acts, occurs at rates that vary widely among living things. Genetic drift means that a trait can come to dominate a population even if it has no fitness effect—or sometimes a deleterious one. Finally, migration across variable landscapes can interact with selection, drift, and mutation [$a] to completely alter their effects.

I’ll devote one post each to selection, mutation, drift, and migration, discussing classic findings as well as more recent scientific discoveries about each. They’ll arrive as my usual mid-week science posts for the next four weeks, and I’ll update this post with links to the others as they go online—so if this looks worth following, you can either bookmark this post, or subscribe to D&T’s RSS Feed.

Natural selection, mutation, genetic drift, and migration act together to shape the evolution of natural populations. Photo by jby.

References

Drake JW, Charlesworth B, Charlesworth D, & Crow JF (1998). Rates of spontaneous mutation. Genetics, 148 (4), 1667-86 PMID: 9560386

Kingsolver, J., Hoekstra, H., Hoekstra, J., Berrigan, D., Vignieri, S., Hill, C., Hoang, A., Gibert, P., & Beerli, P. (2001). The strength of phenotypic selection in natural populations. The American Naturalist, 157 (3), 245-61 DOI: 10.1086/319193

Slatkin, M. (1987). Gene flow and the geographic structure of natural populations. Science, 236 (4803), 787-92 DOI: 10.1126/science.3576198

Wright S (1931). Evolution in Mendelian populations. Genetics, 16 (2), 97-159 PMID: 17246615