Postdoc in genetics of complex traits

2012.10.22 - Medicago truncatula Your new favorite plant? Photo by jby.

Do you like evolution, genetics, and evolutionary genetics? Would you like to think of things to do with a whole lot of genetic data and a flagship model legume? Well, my boss, Peter Tiffin, is looking for another postdoc. Here’s the post description from EvolDir:

I have available a post-doctoral position to work on association and evolutionary genomics of the model legume Medicago truncatula. Collaborators and I have recently collected genome sequence for > 200 accessions and have used these data for GWAS and population genomic analyses. We are currently working to refine our understanding of genomic variation segregating within this species and are particularly interested in the evolutionary genetics of the symbiosis between Medicago and Sinorhizobia. The successful applicant will have considerable freedom to develop research in their area of interest.

The deadline for submissions is 15 September 2013, so get in touch with Peter pronto if you’re interested. (See the full ad for contact information and the application package requirements—it’s standard stuff.) Benefits of the position include working with population genomic data from the cutting edge of current technology in a collegial lab with some very smart people (and me) in the midst of a fantastic community of biologists at the University of Minnesota—as well as living in the Twin Cities, which are empirically awesome. Yes, even in winter.◼

The Molecular Ecologist: Domesticated genes answer the call of the wild

Soay sheep on Hirta, St Kilda, with Cleits Wild Soay sheep, in an assortment of colors. Photo by Commonorgarden.

This week at the Molecular Ecologist, I’m discussing a new study from the blog’s parent publication, Molecular Ecology, which traces the origins of gene variants in a wild population of Soay sheep … back to domestic sheep.

The Soay sheep haven’t been completely isolated from other breeds. In recent centuries, they shared the Saint Kilda islands with humans, who kept domesticated sheep—providing several hundred years of opportunity for what geneticists call “an admixture event,” and everyone else calls “sex,” between the Soay breed and those domesticated sheep.

To learn how the study’s authors pinpointed the origin of the domestic gene variants, and how those variants have fared in the wild sheep, go read the whole thing.◼

The Molecular Ecologist: Storing and protecting your NGS data

Data Barnacles. Photo by UWW ResNet.

This week at The Molecular Ecologist, Mark Christie shares some tips for how to take care of that massive genetic dataset that’s just come off the high-throughput sequencer:

Congratulations! You have recently received a file path to retrieve your hard-earned next-generation sequencing data. You quickly transfer the files to the computing cluster you work on or perhaps, if you only have a few lanes of data, to your own computer. But before you begin messing around with your data, you quickly realize that you should come up with a plan to back up and store unadulterated versions of your files.

For a nice set of recommendations with some step-by-step instructions, go read the whole thing.◼

New, rigorous study looks for genes associated with education—but doesn’t find much

Genetics may impact how long you stay in school—by a factor of a month or so. Photo by velkr0.

Late update: Michelle Meyer, who sits on the advisory board of the consortium responsible for the study discussed below, briefly discusses the results on her blog, and links to a Frequently Asked Questions document [PDF] meant to accompany the study, which makes some reasonable and sensible points about how best to understand the findings. A point I didn’t emphasize originally is that the small effect size of the sites identified suggests that a lot of previous “sociological genetics” studies are now called into question—because their sample sizes were far too small to detect such subtle effects.

A few months ago, I roundly thrashed a study that attempted to identify genes associated with educational achievement. It was, to put it mildly, shooting fish in a barrel: that paper was published in a journal that doesn’t handle much (if any) genetics research, the sample size was small, the genetic data was sparse, the analysis applied to the genetic data didn’t test for what the authors wanted to test for, and the authors ignored basic statistical practice when they interpreted the results.

This week, though, there’s a new study of the genetic basis for educational achievement that is the mirror-image opposite of the one I beat up: it’s online ahead of print in Science, it has a great big sample size of 101,069 participants and a built-in “replication” sample of 25,490 more, it works with good genome-wide genetic data, and it looks to be both admirably careful in its statistical work and cautious in its conclusions—which is consistent with the inclusion, in the paper’s lengthy author list, of some folks who know what they’re talking about when it comes to association genetics.

So, naturally, I wanted to write something about this study as a nice example of what’s possible when genetic analysis is done right. Unfortunately, the actual results of the study don’t give me much to discuss—because, for all its rigor and caution, it doesn’t find much in the way of genetic explanation for educational achievement.

First, a little more explanation of the work itself. The authors clearly note that they’re not looking for gene variants that cause people to go to college—they’re looking for gene variants associated with increased educational achievement, which might actually be related to some sort of underlying cognitive ability. Educational achievement is simply a convenient proxy for that unknown capacity, because it’s relatively standardized across modern nations. So the authors rounded up data from almost 130,000 people who have volunteered to be genotyped at millions of loci, and who had indicated (1) how many years of education they’d completed and (2) whether or not they completed a college degree.

For each of those education-related measures, the authors conducted a fairly standard genome-wide association (GWA) analysis—asking, for every genetic marker in the dataset, whether people with one version of the marker went to school for longer, or were more likely to complete college, than people with the other version of the marker. The idea is that when people with different versions of a genetic marker differ especially strongly in a particular measurement, that marker probably lies in a region of genetic code that contributes to the value of that measurement. Good statistical practice—which the authors followed—requires that you set the threshold of “especially strongly” higher as you test more markers, and that you validate the markers you find in a first association analysis by conducting a second, independent analysis with a different sample of test subjects to see if the same markers turn up again.

But this big, careful study didn’t find all that much. A handful of markers passed the GWA search criteria—three with “genome-wide significant” effects and another seven with “suggestive” effects. None of these markers were associated with large differences in educational attainment—a couple months more time in school or a slightly different chance of completing college. And when the authors looked at the collective effects of all the markers that were associated even weakly with differences in education, they found they only explained about 2% of the variation in the number of years of education attained; or 3% of variation in college completion.

Magnified (8/365) Statistically significant effects—but vanishingly small ones. Photo by jakebouma.

For comparison, the authors note that estimates based on studies of twins or other close relatives have found that genetic relatedness accounts for up to 40% of variation in educational achievement. That’s either a lot of missing heritability, or an indication that the relatedness-based studies are grossly overestimating genetic effects.

The authors conclude that “For complex social-science phenotypes that are likely to have a genetic architecture similar to educational attainment, our estimate of [an effect size of] 0.02% [per candidate marker] can serve as a benchmark for conducting power analyses and evaluating the plausibility of existing findings in the literature.” That’s a slightly roundabout way of saying that future attempts to identify gene regions contributing to educational achievement or other intelligence-related traits will need to have sample sizes big enough to deal with teeny tiny effects.

What I take away from this work is that, in the end, non-genetic effects—parents’ income, local school quality, nutrition, cultural expectations, you name it—are much more important than genetics. I have to say, I don’t think that’s especially surprising, but it’s always nice to see data that backs up one’s own expectations.

And that leads into my final thought about this paper: for all the caution and rigor that went into the analysis, what do the authors expect folks to do with the results? Say that they had, indeed, found some gene regions that explain a substantial fraction of variation in educational achievement. What, exactly, is the application for such knowledge? Genetic testing of college applicants? Screening embryos for favorable gene variants? Drugs targeted to the proteins produced by the candidate genes? (But then, we already have drugs that enhance cognitive performance, like Ritalin or my personal favorite, orally-administered infusions of caffeine.)

I don’t raise these questions because I wish that this study hadn’t been conducted—I believe knowledge is important for its own sake. But it’s impossible to contemplate this kind of research without thinking of its Gattaca-like implications. And in that sense, the weak results of the study are something of a relief. I’d personally much rather live in a world where we spend education budgets on actually educating students, instead of testing them for gene variants that might predict how well they’ll do in school.◼


Rietveld C.A., Medland S.E., Derringer J., Yang J., Esko T., Martin N.W., Westra H.J., Shakhbazov K., Abdellaoui A. & Agrawal A. et al. GWAS of 126,559 individuals identifies genetic variants associated with educational attainment, Science, DOI:

False discovery: How not to find the genetic basis of human intelligence

Does a new study really identify genes that determine whether you’ll go to college? Um, no. Photo by velkr0.

Identifying a genetic basis for human intelligence is fraught with huge ethical, social, and political implications. If we knew of gene variants that increased intelligence, would we try to engineer them into our children? Or use them to determine who gets college loans? Or maybe just discourage people carrying the wrong variant from having children? So you’d think that researchers working on that topic would proceed with extra caution, and make sure their conclusions were absolutely iron-clad before submitting results for publication in a scientific journal—and that peer reviewers working for journals in that field would examine the work that much more closely before agreeing to publication.

Yeah, well, if you thought that, you would be wrong.

A paper just published online ahead of print at the journal Culture and Brain claims to have identified genetic markers that (1) differentiate college students from the general population and (2) are significantly associated with cognitive and behavioral traits. Cool, right? That would mean that these markers identify genes that determine whether you make it to college, and how well you do in educational settings generally—they’re genes that contribute to intelligence.

Again, if you thought that, you’d be wrong. But in that wrongness, you’re in good company, alongside the authors of this paper and, apparently, everyone involved in its peer review and publication.

Out of equilibrium

Here’s what the paper’s authors did to identify these “intelligence” genes. They recruited almost 500 students at Beijing Normal University, took blood samples from them, and gave them all a series of 49 different cognitive and behavioral tests, covering problem solving, memory, language and mathematical ability, and a bunch of other things we generally think of as having to do with intelligence. Using the blood samples, the authors genotyped all of the students at 284 single-nucleotide polymorphism (SNP) markers located in genes with expected connections to brain function—either because they’re involved in producing neurotransmitters, or they’re strongly expressed in the brain.

Next, the authors tested each of the 284 SNPs for deviation from Hardy-Weinberg Equilibrium, or HWE. If you’re not familiar with the concept, here’s my attempt at a brief explanation: HWE boils down to probability.

We all carry two complete sets of genes—one from Dad, one from Mom. So, suppose there’s a spot in the genome where two possible variants—let’s call them A and T—can occur. This is exactly what a SNP is, a single letter of DNA code that differs from person to person. Taking into account the two copies of each gene we carry, every person can have one of three possible diploid genotypes at that single-letter spot: AA, AT, or TT.

If we know how common As and Ts are in the population as a whole, we can estimate how common those three diploid genotypes should be: the frequency of the first allele times the frequency of the second allele. Say you’ve genotyped a sample of people, and you find that 40% of the markers are As (a frequency of 0.4), and 60% are Ts (frequency of 0.6). Then, if the two variants are distributed randomly among all the people you’ve sampled, you’d expect to find 16% (0.4 × 0.4 = 0.16) AA genotypes, 36% (0.6 × 0.6 = 0.36) TT genotypes, and 48% either AT or TA genotypes (0.4 × 0.6 + 0.6 × 0.4 = 0.48).

If the actual frequencies of the three genotypes are close to that expectation, we say the SNP is in Hardy-Weinberg equilibrium, a state named for the two guys who originally deduced all this. Deviations from HWE may occur if, for some reason, people are more likely to mate with people who carry the same genotype, or if the three possible genotypes are associated with having different numbers of children—different fitness, in the evolutionary sense. So a deviation from HWE may mean something is going on at the deviating spot in the genome.
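To make the arithmetic above concrete, here is a minimal Python sketch. It computes the expected genotype frequencies from an allele frequency, then scores how far a set of observed genotype counts sits from that expectation with a chi-square statistic. The function names and the example counts are hypothetical, and the chi-square comparison is just one common way to check for deviation from HWE; the post does not say which test the paper's authors used.

```python
def hwe_expected(p_a):
    """Expected genotype frequencies at a two-allele SNP under HWE.

    p_a is the frequency of the A allele; the T allele gets 1 - p_a.
    """
    q_t = 1.0 - p_a
    return {"AA": p_a * p_a, "AT": 2 * p_a * q_t, "TT": q_t * q_t}


def hwe_chi_square(n_aa, n_at, n_tt):
    """Chi-square statistic comparing observed genotype counts to HWE.

    Assumes both alleles are present in the sample, so that no
    expected count is zero.
    """
    n = n_aa + n_at + n_tt
    # Each AA person carries two A alleles, each AT person carries one.
    p_a = (2 * n_aa + n_at) / (2 * n)
    expected = hwe_expected(p_a)
    observed = {"AA": n_aa, "AT": n_at, "TT": n_tt}
    return sum(
        (observed[g] - expected[g] * n) ** 2 / (expected[g] * n)
        for g in observed
    )


# The worked example from the text: 40% A alleles, 60% T alleles.
for genotype, freq in hwe_expected(0.4).items():
    print(genotype, round(freq, 2))

# Hypothetical counts: the first sample matches HWE exactly, the
# second has too many homozygotes for the same allele frequency.
print(round(hwe_chi_square(16, 48, 36), 3))
print(round(hwe_chi_square(30, 20, 50), 3))
```

With one degree of freedom, a chi-square value above about 3.84 corresponds to p < 0.05, so the second hypothetical sample would be flagged as out of equilibrium and the first would not.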

Of the 284 SNPs, the authors identified 24 with genotype frequencies that show a statistically significant deviation from HWE—in their sample of college students, that is. They also examined HWE for the same SNPs in a sample taken from the general population of Beijing, as part of the 1000 Genomes database of human genetic diversity, and found that all but 2 of the 24 SNPs that violated HWE in the students were within HWE expectations in the comparison sample. They conclude that this means that something about these 24 SNPs sets the college students apart from the broader population of Beijing.

Except this is not how population geneticists calculate genetic differentiation between two groups of people. For that, we usually use a statistic called FST, which essentially calculates the degree to which allele frequencies differ between two groups. That is, if the students are really differentiated from the rest of Beijing at a particular SNP, then we’d expect the frequency of the A allele among the students to be really different from the frequency of A in the other sample. FST is related to deviation from HWE; but it’s not at all the same thing. Fortunately for us all, the authors published all their genotype frequency data as Tables 1 and 2 of the paper. I can check directly to see whether the FST at each locus suggests meaningful genetic differentiation between the students and the comparison sample.

The distribution of FST values calculated from the 24 SNPs. Image by jby.

Possible values for FST range from 0, when there is no difference between the two groups being compared, to 1, when the two groups are completely differentiated. The FST values I calculated from the data tables range from 0.00003 to 0.05432, and half of them are less than 0.002—that’s within the range seen for any random sample of genetic markers in other human populations [PDF]. Which is to say, the 24 SNPs identified in this paper are not really that differentiated at all.
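For readers who want to see what an FST calculation looks like, here is a rough Python sketch for one SNP and two groups. It uses the textbook heterozygosity-based definition, FST = (HT - HS) / HT, with the two groups weighted equally. That choice of estimator, the function name, and the example allele frequencies are all assumptions for illustration; the post does not specify which estimator produced the values above.

```python
def fst_two_groups(p1, p2):
    """FST at one biallelic SNP from the A-allele frequency in two groups.

    h_t is the heterozygosity expected if the groups were one
    randomly mating population; h_s is the average heterozygosity
    expected within each group. Groups are weighted equally.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_t == 0:
        # The SNP is not variable in either group, so nothing differs.
        return 0.0
    return (h_t - h_s) / h_t


print(fst_two_groups(0.40, 0.40))            # identical frequencies
print(fst_two_groups(1.00, 0.00))            # fixed for different alleles
print(round(fst_two_groups(0.40, 0.45), 5))  # a modest difference
```

The third call illustrates the point of the paragraph above: even a five-point difference in allele frequency between two groups yields an FST well under 0.01.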

Uncorrected testing is un-correct

But these markers identified in the study are still associated with cognitive ability, right? Well, brace yourself: there are serious problems with that claim, too. To test for association with cognition, the authors conducted a statistical test asking whether students with each of the three possible genotypes at a given SNP differed in the scores they got on the different cognitive tests. If the difference among genotypes was greater than expected by chance, they concluded that the SNP was associated with the element of intelligence approximated by that particular cognitive test. They identified these “significant” associations using a p-value cutoff of 0.01, which is a technical way of saying that the probability of observing the difference among genotypes simply by chance is less than 1 in 100.

The authors tested for associations of the genotypes at 19 SNPs (excluding 5 that would’ve had too few people with one or more of the three genotypes) with all 49 cognitive tests. They conducted each test using the complete sample of students, and then also the males and females separately, in case there were gender differences in the effects of each SNP. Across all three data sets (total, male, and female), they found 17 significant associations.

Statisticians and regular readers of xkcd will probably already know where this is going.

If you conduct one statistical test using a particular dataset, and see that there’s a 1 in 100 chance of observing the result purely by chance, you can be reasonably sure (99% sure!) that your result isn’t due to chance. However, if you conduct 100 such tests, and only one of them has a p-value of 0.01, then that is quite possibly the one time in 100 the result is pure coincidence. Think of it this way: it’s a safe bet that one roll of a die won’t be a six; but it’s not such a safe bet that if you roll a die six times, you won’t roll a six at least once. In statistics, this is called a multiple testing (or multiple comparisons) problem.
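The die-rolling intuition can be checked with a couple of lines of Python. This sketch assumes every roll, or every statistical test, is independent of the others, which is a simplification for real data; the function name is made up for the example.

```python
def prob_at_least_one(p_single, n_trials):
    """Chance that an event with probability p_single happens at least
    once in n_trials independent tries."""
    return 1 - (1 - p_single) ** n_trials


# One roll of a die versus six rolls: how likely is at least one six?
print(round(prob_at_least_one(1 / 6, 1), 3))
print(round(prob_at_least_one(1 / 6, 6), 3))

# One test at p = 0.01 versus 100 such tests with no real effects:
# how likely is at least one "significant" result?
print(round(prob_at_least_one(0.01, 1), 3))
print(round(prob_at_least_one(0.01, 100), 3))
```

Under that independence assumption, six rolls give roughly a two-in-three chance of at least one six, and 100 tests at p = 0.01 give a bit better than even odds of at least one spurious hit.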

How many tests did the authors conduct? That would be 49 cognitive measurements × 19 SNPs, or 931 tests on each of the three separate datasets. At p = 0.01, you’d expect them to get somewhat more than 9 “significant” results that aren’t actually significant. And, indeed, for the total dataset, they found 7 significant results; for the male students alone, they found 3; and for the females, 7. That’s exactly what would happen if there were no true associations between the SNP genotypes and the cognitive test results at all.

And, to go all the way back to the beginning, what was the p-value cutoff for the authors’ test of HWE? They considered deviations from HWE significant if the probability of observing the deviation by chance was less than 5%, or p ≤ 0.05. And 5% of 284 SNPs is a bit more than 14. That’s a pretty big chunk of their 24-SNP list.
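The two back-of-the-envelope numbers above are easy to reproduce. The sketch below also prints the per-test cutoff that a Bonferroni correction would demand for the association tests. Bonferroni is a standard, conservative correction that I am using purely as an illustration; it is not something the paper applied, and the variable and function names are hypothetical.

```python
def expected_false_positives(n_tests, alpha):
    """Number of tests expected to pass the cutoff alpha by chance
    alone when none of the tested effects is real."""
    return n_tests * alpha


# Association screen: 49 cognitive measures x 19 SNPs per dataset.
n_assoc_tests = 49 * 19
assoc_fp = expected_false_positives(n_assoc_tests, 0.01)
print(n_assoc_tests, round(assoc_fp, 2))

# HWE screen: 284 SNPs tested at p <= 0.05.
hwe_fp = expected_false_positives(284, 0.05)
print(round(hwe_fp, 1))

# Bonferroni (an illustrative choice, not the paper's method): divide
# the cutoff by the number of tests to keep the chance of even one
# false positive across the whole screen near the original cutoff.
bonferroni_cutoff = 0.01 / n_assoc_tests
print(f"{bonferroni_cutoff:.2e}")
```

The expected counts come out at about 9.3 and 14.2, matching the figures in the text, while the Bonferroni cutoff lands near 0.00001, roughly a thousand times stricter than the p = 0.01 the authors used.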

In short, the authors of this paper identified a list of SNPs that supposedly differentiate college students from the general population, using a method that doesn’t actually identify differentiated SNPs. They then conducted a series of tests for association between those SNPs and intelligence-related traits, and didn’t find any more association than expected purely by chance. The list of genes identified this way is literally no better than what you’d get using two spins of a random number generator.

Who cares about methodological correctness, anyway?

What really makes me angry about this paper, though, is this: there are ways to do it right. The authors could have talked to a population geneticist, who would have told them to use FST or a similar measure of genetic differentiation. They could have used any number of methods to correct for the multiple testing problem in their final test for associations. And, in fact, someone must have pointed that second one out to them, because here’s what they write in the final paragraph of the paper:

… we analyzed all significant main effects at the P ≤ 0.01 level, without using more stringent corrections for multiple comparisons. We deemed this as an exploratory study to see if there were any behavioral or cognitive correlates of the SNPs in HWD. These results should provide bases for future confirmatory hypothesis-testing research.

In other words, they’re just fishing around for genes, here, so why should they actually perform a statistically rigorous test? But precisely because they don’t correct for multiple testing, any money spent on “future confirmatory hypothesis-testing research” would be wasted—it might as well start with a random selection of SNPs from the original list the authors chose to examine.

Given the nature of its subject matter, it’s appalling to me that this paper made it through peer review and into a scientific journal. It certainly wouldn’t have made it into a journal whose editors and reviewers understood basic population genetics. If I had to guess, I’d speculate that Culture and Brain doesn’t have any geneticists in its reviewer rolls—the fact that the authors spend a large chunk of their Introduction simply explaining Hardy-Weinberg Equilibrium suggests that their audience is people who don’t know much about the kind of data being presented.

And that’s where we come to the real lesson of this study. It’s getting cheaper and easier to collect genetic data with every passing day—to the point that researchers with no prior expertise or experience with genetic data can now do it. I’m afraid we’re going to see a lot more papers like this one, in the years to come.◼


Chen C., Chen C., Moyzis R.K., He Q., Lei X., Li J., Zhu B., Xue G. & Dong Q. Genotypes over-represented among college students are linked to better cognitive abilities and socioemotional adjustment, Culture and Brain, DOI:

Clark A.G., Nielsen R., Signorovitch J., Matise T.C., Glanowski S., Heil J., Winn-Deen E.S., Holden A.L. & Lai E. (2003). Linkage disequilibrium and inference of ancestral recombination in 538 single-nucleotide polymorphism clusters across the human genome, The American Journal of Human Genetics, 73 (2) 285-300. DOI:

The Molecular Ecology Online Forum

Remember the Molecular Ecologist symposium I attended as part of the 2012 Evolution meetings in Ottawa? Well, there’s going to be a sequel, launching Wednesday in convenient online format.

The Molecular Ecologist will be hosting speakers from the Ottawa symposium in a live-chat on the blog, starting at 9 a.m. US Central Time and running until noon (that’s 3-6 p.m. GMT, for those of us located outside North America). We’re trying out a live-chat service called CoverItLive, which will let readers follow the conversation and submit questions and/or comments directly from the blog — test runs have gone pretty smoothly, and I’m excited to see how this works as a medium for scientific discussion.

If you want to review the Ottawa symposium beforehand, check out the archived material at the Molecular Ecology website. To indicate your interest and submit questions in advance, e-mail Molecular Ecology Managing Editor Tim Vines; otherwise, just join us Wednesday morning at The Molecular Ecologist.◼

Many genes, but two major roads to adaptation

Fruit fly (Drosophila melanogaster, male) Drosophila melanogaster. Photo by Max xx.

Cross-posted at Nothing in Biology Makes Sense!

In the course of adaptive evolution — evolutionary change via natural selection — gene variants that increase the odds of survival and reproduction become more common in a population as a whole. When we’re only talking about a single gene variant with a strong beneficial effect, that makes for a pretty simple picture: the beneficial variant becomes more and more common with each generation, until everyone in the population carries it, and it’s “fixed.” But when many genes are involved in adaptation, the picture isn’t so simple.

This is because the more genes there are contributing to a trait, the more the trait behaves like a quantitative, not a Mendelian, feature. That is, instead of being a simple question of whether or not an individual has the more useful variant, or allele, at a single gene — like a light switch turned on or off — it becomes possible to add up to the same trait value with different combinations of variants at completely different genes. As a result, advantageous alleles may never become completely fixed in the course of an adaptive evolutionary response to, say, changing environmental conditions.

That principle is uniquely well illustrated by a paper published in the most recent issue of Molecular Ecology, which pairs classic experimental evolution of the fruitfly Drosophila melanogaster with modern high-throughput sequencing to directly observe changes in gene variant frequencies during the course of adaptive evolution. It clearly demonstrates that when many genes contribute to adaptation, fixation is no longer inevitable, or even necessary.

Turning up the heat, homogenizing flies

The authors of the new study, a team from the Institut für Populationsgenetik led by Pablo Orozco-terWengel, conducted what would otherwise be a rather simple experiment in evolutionary change in the laboratory. Starting with fruitflies collected from a wild population in Portugal (yes, Virginia, Drosophila melanogaster has wild populations!) they established three replicate populations of about 1,000 flies, which they put in temperature-controlled conditions somewhat warmer than the original collection location, and allowed them to propagate for 37 generations. Extensive previous work with Drosophila has established that simply moving the flies into a laboratory setting — where they live in bottles, and eat prepared food — exerts natural selection on them, and the increased temperature added a little bit more novelty to the lab environment to make it more likely adaptation would occur.

What makes this experiment different from all that previous experimental evolution of Drosophila, though, is that the coauthors tracked allele frequencies at thousands of markers during the course of those 37 generations of adaptation to the lab. To do this efficiently, they used an approach called “pooled sequencing.”

The principle behind pooled sequencing is that, if all you care about is the relative frequency of a gene variant in a whole population, you don’t need to know the genotype of any specific individual in that population. So to track changes in allele frequency, the team sampled hundreds of flies from the experimental population, and ground them all up together. (The polite, technical term used here is “homogenized.”) They then extracted DNA from this “pooled” sample, and used a high-throughput sequencer to collect millions of reads — short snippets of DNA sequence — out of the pool as a whole.

To extract allele frequencies from all of those sequence reads, the team identified where each read matched the Drosophila melanogaster reference genome. When multiple reads matched to the same location, but differed in one or more DNA nucleotide bases, they identified those bases as variable markers — single-nucleotide polymorphisms, or SNPs. Because the original DNA sample was pooled from many mashed-together flies, the relative frequency of each different variant of a SNP in the Illumina output should reflect the relative frequency of that SNP variant in the population as a whole.
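As a toy illustration of that last step, here is a Python sketch that estimates allele frequencies at one SNP from the bases seen in the reads covering it. The function name and the read counts are invented for the example, and a real pipeline would also have to handle sequencing errors, read quality, and uneven coverage, none of which is modeled here.

```python
from collections import Counter


def pooled_allele_frequencies(bases_at_site):
    """Estimate allele frequencies at one site from pooled reads.

    bases_at_site lists the base each aligned read shows at the site.
    Each read is treated as one random draw from the pool of
    chromosomes, so the share of reads showing a base is the estimate
    of that allele's frequency in the population.
    """
    counts = Counter(bases_at_site)
    depth = sum(counts.values())
    if depth == 0:
        return {}  # no reads cover this site
    return {base: n / depth for base, n in counts.items()}


# Hypothetical site covered by 100 reads: 30 show A, 70 show T.
reads = ["A"] * 30 + ["T"] * 70
freqs = pooled_allele_frequencies(reads)
print(freqs)
```

Because every read is just one draw from the pool, the precision of the estimate depends on how many reads cover the site, which is one reason pooled approaches lean on deep sequencing.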

Using this approach, Orozco-terWengel et al. could track allele frequency changes across more than a million SNP markers by taking these pooled samples from the initial population of flies, then at multiple points during the 37-generation evolutionary experiment. By comparing the allele frequencies in samples taken during the course of adaptation to the allele frequencies in the sample from the starting population, they could identify SNPs that became more common as the population adapted — and, because they had a big sample from across the genome, they could identify those SNPs whose allele frequencies had changed more than would be expected due to genetic drift. They examined samples taken after 15 and 27 generations of evolution, and at the end of the 37-generation experiment.

Two paths to adaptation

Allele frequency changes (AFC) in SNPs showing significant change by generation 15 (a) and by generation 37 (b). Image from Orozco-terWengel et al. (2012), figure 3.

What they found was largely in line with the verbal model I outlined at the beginning of this post. Over the course of experimental evolution, significant increases in allele frequency occurred at thousands of SNPs — suggesting that a great many genes are involved in the process of adaptation to life in the lab. Accordingly, very few of those allele frequency changes (in about 0.5% of the 2,000 SNPs that showed the greatest change from start to finish) represented complete or near-complete fixation.

More interestingly, comparison of allele frequency changes at the 15th generation and at the end of the experiment revealed two major “paths” taken by alleles. In the first case, the SNPs with strongest allele frequency changes by generation 15 all hit a “plateau” in subsequent generations — they didn’t see any significant increase in frequency between generations 15 and 37. In the second case, SNPs with the strongest allele frequency changes by generation 37, the end of the experiment, had increased steadily from the beginning population through the samples taken at the 15th and 27th generation. The SNPs in this second set had not shown significant allele frequency increases by generation 15 — which means the SNPs underlying most of the adaptive change in the first half of the experiment were a completely different set than the SNPs underlying adaptive change in subsequent generations.

If it’s already adapted, don’t fix it.

On the one hand, that suggests that Orozco-terWengel et al. managed to capture SNPs with a range of different contributions to the adaptation observed by the end of the experiment. The SNPs with the biggest contribution showed rapid initial increases in allele frequency, then leveled off; SNPs with weaker effects showed slower, steady increases that continued for the entire experiment. But if it’s that simple, why didn’t the large-effect SNPs show continuing allele frequency change after the midpoint of the experiment?

It may be, as the coauthors speculate, that the two classes of SNPs identified in their experiment are separated by more than just the size of their respective contributions to adaptive change. There could be interactions among the alleles at these SNPs, such as overdominance, in which an individual is most fit when he or she carries two different alleles at a locus, rather than two copies of either allele. Overdominance would explain why most of the SNPs showing rapid initial increases in allele frequency then leveled out at intermediate frequencies.
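To see why overdominance would produce a plateau rather than fixation, here is a small deterministic sketch of the standard one-locus selection model in Python. It assumes an infinitely large, randomly mating population, and the selection coefficients s and t are arbitrary illustrative values, not estimates from the paper.

```python
def next_allele_freq(p, s, t):
    """One generation of selection at a locus with heterozygote advantage.

    Relative fitnesses: 1 - s for one homozygote, 1 for the
    heterozygote, 1 - t for the other homozygote. p is the frequency
    of the allele carried by the first homozygote.
    """
    q = 1 - p
    w_bar = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    return p * (p * (1 - s) + q) / w_bar


def simulate(p0, s, t, generations):
    """Allele frequency in every generation, starting from p0."""
    trajectory = [p0]
    for _ in range(generations):
        trajectory.append(next_allele_freq(trajectory[-1], s, t))
    return trajectory


# Hypothetical coefficients; theory puts the equilibrium at t / (s + t).
s, t = 0.1, 0.3
trajectory = simulate(p0=0.05, s=s, t=t, generations=500)
for generation in (0, 15, 37, 500):
    print(generation, round(trajectory[generation], 3))
print("predicted equilibrium:", round(t / (s + t), 3))
```

In this model the allele climbs quickly while it is rare, then slows and settles at an intermediate frequency set by the two selection coefficients, which is qualitatively the plateau pattern described above.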

So this combination of experimental evolution and modern sequencing technology raises some interesting questions even as it supports a lot of previous thinking about how natural selection acts on traits that are created by the collective action of many genes. It’s an exciting result, and, I hope, inspiration for much more work digging into the details of such “polygenic” adaptation.◼


Burke, M. and A. Long. 2012. What paths do advantageous alleles take during short-term evolutionary change? Molecular Ecology 21:4913–4916. DOI: 10.1111/j.1365-294X.2012.05745.x.

Orozco-terWengel, P., M. Kapun, V. Nolte, R. Kofler, T. Flatt and C. Schlötterer. 2012. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Molecular Ecology 21:4931–4941. DOI: 10.1111/j.1365-294X.2012.05673.x.

Pavlidis, P., D. Metzler and W. Stephan. 2012. Selective sweeps in multi-locus models of quantitative traits. Genetics 192:225–239. DOI: 10.1534/genetics.112.142547.

Notes on a model species

2012.09.20 - Seeds

Seeds. (Flickr: jby)

There are a couple of neat racks on my desk containing rows of plastic tubes, each tube with a drift of tiny, kidney-bean-shaped seeds at the bottom. These are seeds of Medicago truncatula, barrel medick. When I tell people about this plant I’m currently studying, I usually describe it as an unremarkable wildflower native to the Mediterranean. Or I note that it’s a close-ish relative of alfalfa (Medicago sativa).

Medicago truncatula does not have an especially grand heritage. It grows in dry, sunny places throughout the dry, sunny Mediterranean region, forming low tangles of trifoliate leaves and small yellow flowers that eventually ripen into tough, spiky, vaguely barrel-shaped fruits full of those tiny seeds. Some of the seeds on my desk are descended from plants that grew in places like the Temple of Apollo at Curium, Cyprus; but most are from less distinguished locales. In his 2011 monograph on the genus Medicago, Ernest Small quotes a description of M. truncatula’s habitat as “sandy fields, wet grasslands, wet meadows, strongly overgrazed and degraded garrique, coniferous forests, grasslands, fallow fields, olive groves, and as a weed in cereal and crops and waste places.”


Nothing in Biology Makes Sense: Searching for Ronald Fisher

Geneticist Ronald A. Fisher. Photo via WikiMedia Commons.

This week at the collaborative science blog Nothing in Biology Makes Sense!, my lab-mate John Stanton-Geddes writes about the current state of evolutionary genetics, as presented at the recent Evolution meetings in Ottawa:

One theme that emerged through the meeting was “The genetic basis for [insert trait here].” While this goal of mapping phenotype to genotype has been a primary goal of many evolutionary ecologists since the first QTL mapping studies, it has recently come under strong criticism, notably in a fantastic paper by Matthew Rockman in the journal Evolution last year, but also by Pritchard and Di Rienzo 2010 and in a forthcoming article by Ruth Shaw (full disclosure: Ruth was my PhD advisor) and Mike Travisano.

Readers of Denim and Tweed will recognize that John’s complaint about our ongoing fixation (ha!) on individual genes of large effect mirrors some of my own recent thinking. So naturally, I think you should go read the whole thing.◼

The living rainbow: For the selective benefits of being gay, count your cousins

Photo source unknown, presumed public domain.

There’s some more new evidence for one of the theories as to how gene variants that make men more likely to be gay could persist in human populations in the face of their obvious selective disadvantages: the same genes could, when carried by women, lead to greater fertility.

I recently posted about a study of Samoan fa’afafine that documented this effect; now an Italian team is reporting, in a forthcoming article in The Journal of Sexual Medicine, that they’ve found the same thing in a sample of 200-some French and Italian women [$a].

The authors interviewed women who were the biological mothers or aunts of gay men, and compared them to women who were mothers or aunts of straight men. They gave each participant a questionnaire covering the key question—how many children they’d had. It also included a focused medical history, covering a slew of conditions that might have affected their fertility—anything from chlamydia infections to ovarian cysts to complicated pregnancies—and asked about their sexual behavior and history. Finally, the team gave the women in their sample a standardized personality test.

Even this relatively small sample showed the previously documented effect of shared genetics with gay men—women who had gay sons or nephews had more children than those who didn’t. Mothers and aunts of gay men also reported lower rates of medical conditions that could reduce their ability to have children. They said they’d had more partners than mothers and aunts of straight men (but this difference wasn’t statistically significant) and were also less concerned about family issues, and more likely to have been divorced. Finally, the personality test revealed that mothers and aunts of gay men were more extraverted.

That’s a big pile of factors tested, which makes me wonder about multiple testing issues with a small sample size. The study’s authors build a somewhat complicated narrative out of it all: They speculate that the same genes that make men gay make women less likely to have fertility-reducing conditions, but also more extraverted and more “relaxed” about building a family—which apparently also helps them have more children. So, okay, I guess that’s plausible given the results.
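To put a rough number on that multiple-testing worry, here is a small sketch of how quickly the chance of at least one false positive grows as more factors are tested. The test counts below are hypothetical (I have not tallied how many comparisons the paper actually makes), the calculation assumes the tests are independent, and the function names are my own. The Bonferroni threshold is one common, conservative correction, not something the study’s authors report using.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests,
    each run at significance level alpha, when no real effect exists."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test significance level that keeps the family-wise
    error rate at or below alpha (Bonferroni correction)."""
    return alpha / num_tests


if __name__ == "__main__":
    # Hypothetical numbers of factors tested, for illustration only.
    for m in (1, 5, 10, 20):
        fwer = familywise_error_rate(m)
        cutoff = bonferroni_threshold(m)
        print(f"{m:2d} tests: P(any false positive) = {fwer:.2f}, "
              f"Bonferroni cutoff = {cutoff:.4f}")
```

Under these assumptions, ten independent tests at the usual 0.05 level carry roughly a 40 percent chance of producing at least one spurious “significant” result.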

Here’s what the study doesn’t do, however: it doesn’t identify any specific genes involved in making gay men gay. It can’t actually test the hypothesis that there’s a genetic basis to same-sex attraction at all, much less the hypothesis that genes promoting same-sex attraction in men are located on the maternally-inherited X-chromosome. For those questions, you really need full pedigree data—or, better yet, lots and lots of genetic data; interviewing only female relatives isn’t remotely enough.

The text of the article doesn’t necessarily make that point as clearly as it could. The authors spend a great deal of time talking about the X-chromosome hypothesis, and though they make the requisite disclaimer in the Conclusions section—

With this type of limited data, we cannot directly derive a causal connection between the hypothetical sexually antagonistic autosomal or X-chromosome-linked genetic factors and health, behavior, and personality.

—that disclaimer elides the point that their data set can’t really test anything to do with genetics indirectly either.

The authors repeatedly describe their sample as a “pilot study,” however, so maybe something bigger, and more rigorous, is in the works.◼


Camperio Ciani, A., Fontanesi, L., Iemmola, F., Giannella, E., Ferron, C., & Lombardi, L. (2012). Factors associated with higher fecundity in female maternal relatives of homosexual men. The Journal of Sexual Medicine DOI: 10.1111/j.1743-6109.2012.02785.x