Late update: Michelle Meyer, who sits on the advisory board of the consortium responsible for the study discussed below, briefly discusses the results on her blog, and links to a Frequently Asked Questions document [PDF] meant to accompany the study, which makes some reasonable and sensible points about how best to understand the findings. A point I didn’t emphasize originally is that the small effect size of the sites identified suggests that a lot of previous “sociological genetics” studies are now called into question—because their sample sizes were far too small to detect such subtle effects.
A few months ago, I roundly thrashed a study that attempted to identify genes associated with educational achievement. It was, to put it mildly, shooting fish in a barrel: that paper was published in a journal that doesn’t handle much (if any) genetics research, the sample size was small, the genetic data was sparse, the analysis applied to the genetic data didn’t test for what the authors wanted to test for, and the authors ignored basic statistical practice when they interpreted the results.
This week, though, there’s a new study of the genetic basis for educational achievement that is the mirror-image opposite of the one I beat up: it’s online ahead of print in Science, it has a great big sample size of 101,069 participants and a built-in “replication” sample of 25,490 more, it works with good genome-wide genetic data, and it looks to be both admirably careful in its statistical work and cautious in its conclusions—which is consistent with the inclusion, in the paper’s lengthy author list, of some folks who know what they’re talking about when it comes to association genetics.
So, naturally, I wanted to write something about this study as a nice example of what’s possible when genetic analysis is done right. Unfortunately, the actual results of the study don’t give me much to discuss—because, for all its rigor and caution, it doesn’t find much in the way of genetic explanation for educational achievement.
First, a little more explanation of the work itself. The authors clearly note that they’re not looking for gene variants that cause people to go to college—they’re looking for gene variants associated with increased educational achievement, which might actually be related to some sort of underlying cognitive ability. Educational achievement is simply a convenient proxy for that unknown capacity, because it’s relatively standardized across modern nations. So the authors rounded up data from almost 130,000 people who have volunteered to be genotyped at millions of loci, and who had indicated (1) how many years of education they’d completed and (2) whether or not they completed a college degree.
For each of those education-related measures, the authors conducted a fairly standard genome-wide association (GWA) analysis: asking, for every genetic marker in the dataset, whether people with one version of the marker went to school for longer, or were more likely to complete college, than people with the other version of the marker. The idea is that when people with different versions of a genetic marker differ especially strongly in a particular measurement, that marker probably lies in a region of genetic code that contributes to the value of that measurement. Good statistical practice, which the authors followed, requires that you set the threshold of “especially strongly” higher as you test more markers, and that you validate the markers you find in a first association analysis by conducting a second, independent analysis with a different sample of test subjects to see if the same markers turn up again.
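To make that logic concrete, here is a toy sketch of the discovery-then-replication idea, not the authors' actual pipeline. Everything in it is an assumption for illustration: the data are simulated, the helper names (`association_p_value`, `simulate_cohort`, `scan`) are made up, the per-marker test is a simple trend test with a normal approximation, the multiple-testing correction is a plain Bonferroni adjustment, and the planted effect (half a standard deviation per allele) is absurdly large compared with anything in the real study.

```python
import math
import random
from statistics import NormalDist


def association_p_value(allele_counts, phenotype):
    """Two-sided p-value for a linear trend of phenotype on allele count.

    Uses the correlation-based test statistic with a normal approximation,
    which is reasonable for samples of hundreds of people or more.
    """
    n = len(phenotype)
    mean_g = sum(allele_counts) / n
    mean_y = sum(phenotype) / n
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(allele_counts, phenotype))
    sxx = sum((g - mean_g) ** 2 for g in allele_counts)
    syy = sum((y - mean_y) ** 2 for y in phenotype)
    if sxx == 0 or syy == 0:
        return 1.0  # no variation, so nothing to test
    r = sxy / math.sqrt(sxx * syy)
    r2 = min(r * r, 1 - 1e-12)  # guard against division by zero when |r| is 1
    z = r * math.sqrt((n - 2) / (1 - r2))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_cohort(rng, n_people, n_markers, causal_marker, effect):
    """Simulated cohort: allele counts of 0/1/2 per marker, one causal marker."""
    genotypes = [
        [rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_people)]
        for _ in range(n_markers)
    ]
    phenotype = [
        effect * genotypes[causal_marker][i] + rng.gauss(0.0, 1.0)
        for i in range(n_people)
    ]
    return genotypes, phenotype


def scan(genotypes, phenotype, markers, alpha=0.05):
    """Return the markers that pass a Bonferroni-corrected threshold."""
    if not markers:
        return []
    threshold = alpha / len(markers)  # stricter as more markers are tested
    return [m for m in markers
            if association_p_value(genotypes[m], phenotype) < threshold]


rng = random.Random(42)
N_MARKERS, CAUSAL, EFFECT = 100, 17, 0.5  # illustrative values only

# Discovery: test every marker in the first sample.
disc_geno, disc_pheno = simulate_cohort(rng, 1000, N_MARKERS, CAUSAL, EFFECT)
discovery_hits = scan(disc_geno, disc_pheno, list(range(N_MARKERS)))

# Replication: re-test only the discovery hits in an independent sample.
rep_geno, rep_pheno = simulate_cohort(rng, 500, N_MARKERS, CAUSAL, EFFECT)
replicated = scan(rep_geno, rep_pheno, discovery_hits)

print("discovery hits:", discovery_hits)
print("replicated:", replicated)
```

The point of the second pass is that a marker which slipped through the first scan by chance has to get lucky twice, in two unrelated groups of people, to survive.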
But this big, careful study didn’t find all that much. A handful of markers passed the GWA search criteria: three with “genome-wide significant” effects and another seven with “suggestive” effects. None of these markers was associated with a large difference in educational attainment; each corresponded to just a couple of months more time in school, or a slightly different chance of completing college. And when the authors looked at the collective effects of all the markers that were associated even weakly with differences in education, they found that together these explained only about 2% of the variation in the number of years of education attained, or 3% of the variation in college completion.
For comparison, the authors note that estimates based on studies of twins or other close relatives have found that genetic relatedness accounts for up to 40% of variation in educational achievement. That’s either a lot of missing heritability, or an indication that the relatedness-based studies are grossly overestimating genetic effects.
The authors conclude that “For complex social-science phenotypes that are likely to have a genetic architecture similar to educational attainment, our estimate of [an effect size of] 0.02% [per candidate marker] can serve as a benchmark for conducting power analyses and evaluating the plausibility of existing findings in the literature.” That’s a slightly roundabout way of saying that future attempts to identify gene regions contributing to educational achievement or other intelligence-related traits will need to have sample sizes big enough to deal with teeny tiny effects.
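For a rough sense of what “big enough” means, here is a back-of-the-envelope power calculation. It is a sketch under my own assumptions, not the authors' calculation: I treat the 0.02% as the fraction of variance explained by a single marker (R² = 0.0002), use the conventional genome-wide significance threshold of p < 5 × 10⁻⁸, aim for 80% power, and apply the standard normal-approximation formula N ≈ (z_α/2 + z_power)² (1 − R²) / R². The function name `required_sample_size` is made up for this example.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(r2, alpha=5e-8, power=0.80):
    """Approximate sample size needed to detect one marker explaining a
    fraction r2 of trait variance, using a two-sided test at level alpha.

    Normal approximation: N = (z_{alpha/2} + z_{power})^2 * (1 - r2) / r2.
    The defaults for alpha and power are assumed conventions, not values
    taken from the paper.
    """
    if not 0.0 < r2 < 1.0:
        raise ValueError("r2 must be strictly between 0 and 1")
    std_normal = NormalDist()
    z_alpha = -std_normal.inv_cdf(alpha / 2)  # critical value for the test
    z_power = std_normal.inv_cdf(power)       # quantile for the desired power
    return ceil((z_alpha + z_power) ** 2 * (1 - r2) / r2)


# Effect size quoted in the paper: 0.02% of variance per candidate marker.
n_needed = required_sample_size(0.0002)
print(f"people needed for 80% power at R^2 = 0.02%: {n_needed:,}")

# For contrast, a marker with ten times the effect needs far fewer people.
print(f"people needed at R^2 = 0.2%: {required_sample_size(0.002):,}")
```

Under those assumptions the answer comes out in the neighborhood of 200,000 people, which is why studies with a few hundred or a few thousand participants had essentially no chance of reliably detecting effects this small.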
What I take away from this work is that, in the end, non-genetic effects (parents’ income, local school quality, nutrition, cultural expectations, you name it) are much more important than genetics. I have to say, I don’t think that’s especially surprising, but it’s always nice to see data that backs up one’s own expectations.
And that leads into my final thought about this paper: for all the caution and rigor that went into the analysis, what do the authors expect folks to do with the results? Say that they had, indeed, found some gene regions that explain a substantial fraction of variation in educational achievement. What, exactly, is the application for such knowledge? Genetic testing of college applicants? Screening embryos for favorable gene variants? Drugs targeted to the proteins produced by the candidate genes? (But then, we already have drugs that enhance cognitive performance, like Ritalin or my personal favorite, orally-administered infusions of caffeine.)
I don’t raise these questions because I wish that this study hadn’t been conducted—I believe knowledge is important for its own sake. But it’s impossible to contemplate this kind of research without thinking of its Gattaca-like implications. And in that sense, the weak results of the study are something of a relief. I’d personally much rather live in a world where we spend education budgets on actually educating students, instead of testing them for gene variants that might predict how well they’ll do in school.◼
Rietveld C.A., Medland S.E., Derringer J., Yang J., Esko T., Martin N.W., Westra H.J., Shakhbazov K., Abdellaoui A. & Agrawal A. et al. GWAS of 126,559 individuals identifies genetic variants associated with educational attainment, Science, DOI: 10.1126/science.1235488