Utopian lepidoptery

This month’s issue of Wired reports on DNA barcoding, with extensive interviews of barcoding masterminds Dan Janzen and Paul Hebert. In spite of myself, I’m charmed by the article’s description of Janzen as a “utopian lepidopterist.”

Photo by fabbio.

I’ve posted about one recent Janzen-Hebert barcoding paper, and about a subsequently released study that suggests a major problem for the usefulness of the preferred “barcode” gene, COI. I’d say the Wired coverage is actually pretty OK for a popular treatment – it acknowledges criticism of barcoding, even though, in typical Wired fashion, the piece is obviously most interested in the whiz-bang ideas like a “species tricorder” handheld device for field I.D. of organisms. Good geek that I am, I would have liked to see more discussion of the actual technical issues – what about the difficulties of using mitochondrial DNA for plant I.D.?

Update: No one is proposing using mitochondrial DNA for barcoding plants. Which is good, because it would be silly – DNA in the plant mitochondrion mutates extremely slowly, so it doesn’t build up much difference between closely related species. Instead, Kress and coauthors proposed using both a nuclear gene and a segment of chloroplast DNA in a 2005 paper.

DNA barcoding: A glitch in the system?

Following up on last week’s post about uncovering hidden species using DNA diversity (or “DNA barcoding”), an open-access paper in this week’s issue of PNAS demonstrates a potentially significant glitch in the system: mitochondrial pseudogenes.

The original DNA barcoding concept is straightforward, if not uncontroversial – use a standard DNA sequence marker to identify (“barcode”) species that might be challenging to ID otherwise, or previously not known as separate species. The proposed standard marker is a mitochondrial gene that codes for the protein cytochrome oxidase I (COI), which varies quite a bit between animal species (though it wouldn’t work for plants, whose mitochondrial DNA mutates very rarely). The lab where I work has used COI for a lot of studies in yucca moths, though not barcoding per se.

Photo by fabbio.

One potential problem with barcoding is that sequencing any gene in one species using procedures derived from another species is always a bit risky. DNA sequencing relies on primers, short snippets of DNA that bind to a region near the target gene as part of the reaction that makes lots of copies of that gene for analysis (this is called PCR, for polymerase chain reaction). The easiest way to get sequence data for a new species is to try and use primers from a close relative – if there aren’t any mutations at the primer site, they should carry over. But mutation happens, and it can definitely happen at primer sites.
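To make the primer problem concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the sequences are toy strings, the function names are hypothetical, and real primers bind the complementary strand under conditions no string comparison captures. It only shows how a single mutation at the primer site can make a borrowed primer miss.

```python
def count_mismatches(primer: str, site: str) -> int:
    """Number of positions where the primer and a candidate site differ."""
    if len(primer) != len(site):
        raise ValueError("primer and site must be the same length")
    return sum(1 for a, b in zip(primer, site) if a != b)


def find_binding_sites(primer: str, template: str, max_mismatches: int = 0) -> list[int]:
    """Start positions in the template where the primer matches
    with at most max_mismatches differing bases."""
    k = len(primer)
    return [
        i
        for i in range(len(template) - k + 1)
        if count_mismatches(primer, template[i:i + k]) <= max_mismatches
    ]


# Toy sequences, invented for this example (not real COI data).
PRIMER = "GGTCAACA"
relative = "TTAC" + "GGTCAACA" + "ATGCATGC"     # species the primer was designed for
new_species = "TTAC" + "GGTCTACA" + "ATGCATGC"  # one mutation inside the primer site

exact_in_relative = find_binding_sites(PRIMER, relative)
exact_in_new = find_binding_sites(PRIMER, new_species)
tolerant_in_new = find_binding_sites(PRIMER, new_species, max_mismatches=1)
```

With an exact match required, the primer finds its site in the relative but nothing in the new species; allowing one mismatch recovers the site. That tolerance is a crude stand-in for the fact that real primers can sometimes still bind imperfect sites.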

Primer site mutations are a minor problem compared to pseudogenes, the focus of the new paper by Song et al. Pseudogenes are a result of gene duplication, a mutation in which an extra copy of a gene is accidentally created during DNA replication. Because it’s redundant, the extra copy can absorb mutations that destroy its function without harming individuals who carry it. The duplicate is then “junk DNA,” free to accumulate mutations – a pseudogene. (Gene duplication is also one way that new proteins and gene functions can evolve – but that’s beyond the scope of the present post.) A primer site mutation just means that primers from one species won’t work on another, but a pseudogene might still bind to primers. And then you can get sequence data from the pseudogene instead of the target gene.
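One kind of function-destroying mutation is easy to show in code: a stop codon that lands in the middle of the reading frame. The sketch below is my own illustration, not a method taken from the post or from Song et al. It assumes the invertebrate mitochondrial genetic code, in which TAA and TAG are the stop codons, and it uses invented toy sequences and a hypothetical function name. A clean reading frame does not prove a sequence is the real gene; a premature stop is just one red flag.

```python
# Stop codons in the invertebrate mitochondrial genetic code
# (an assumption of this sketch; other genomes use other codes).
STOP_CODONS = {"TAA", "TAG"}


def premature_stops(seq: str, frame: int = 0) -> list[int]:
    """Return 0-based positions of in-frame stop codons,
    ignoring the final codon (where a stop would be legitimate)."""
    codons = [(i, seq[i:i + 3]) for i in range(frame, len(seq) - 2, 3)]
    return [i for i, codon in codons[:-1] if codon in STOP_CODONS]


# Toy sequences, invented for this example.
functional_copy = "ATGGCATTAGGTACA"   # codons: ATG GCA TTA GGT ACA
pseudogene_copy = "ATGGCATAAGGTACA"   # codons: ATG GCA TAA GGT ACA

stops_in_functional = premature_stops(functional_copy)
stops_in_pseudogene = premature_stops(pseudogene_copy)
```

Note that the functional toy sequence contains the letters TAG, but out of frame, so it is not flagged. The pseudogene toy sequence has TAA as its third codon, which would cut the protein short.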

DNA barcoding identifies species based on how many mutations have accumulated since they split from a common ancestor; a pseudogene, which mutates faster, can make two samples look further apart than they are. So barcoding studies that accidentally use pseudogenes may identify two species where only one exists. Song et al. use data on mitochondrial pseudogenes in insects and crustaceans to argue that pseudogenes are both common and unpredictable. They also perform barcoding on grasshoppers and crustaceans using data “contaminated” with pseudogenes and data without – unsurprisingly, pseudogenes inflated the number of species detected by barcoding. Although Song et al. suggest a few ways to reduce the odds of interference from pseudogenes, they conclude that there is no way to completely eliminate this problem.
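The inflation effect can be sketched with a toy calculation. This is not the analysis Song et al. ran: the sequences are invented, the 10% distance threshold is an arbitrary assumption, and lumping samples by simple single-linkage clustering on the raw proportion of differing sites is a stand-in for the more careful methods real barcoding studies use.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def count_putative_species(seqs: dict[str, str], threshold: float) -> int:
    """Lump samples whose distance is at or below the threshold
    (single linkage) and count the resulting groups."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)
    return len({find(name) for name in seqs})


# Toy data: three samples from what is really one species.
clean = {
    "hopper_1": "ACGTACGTACGTACGTACGT",
    "hopper_2": "ACGTACGTACGTACGTACGA",
    "hopper_3": "ACGTACGAACGTACGTACGT",
}
# Same samples, but the sequence for hopper_3 came from a diverged pseudogene.
contaminated = dict(clean, hopper_3="ACTTAAGTCCGGACATATGT")

THRESHOLD = 0.10  # hypothetical cutoff, chosen only for this example
species_clean = count_putative_species(clean, THRESHOLD)
species_contaminated = count_putative_species(contaminated, THRESHOLD)
```

In the clean data all three samples fall within the cutoff and count as one species. Swap in the pseudogene sequence for one sample and it sits 30% or more away from the others, so the same three individuals are counted as two species.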

Last week’s paper by Smith and colleagues showed the importance of species identification for conservationists, ecologists, and evolutionary biologists. This new result suggests that DNA barcoding may not be the best way to identify species.


P.D.N. Hebert, A. Cywinska, S.L. Ball, J.R. deWaard (2003). Biological identifications through DNA barcodes Proc. Royal Society B, 270 (1512), 313-21 DOI: 10.1098/rspb.2002.2218

H. Song, J.E. Buhay, M.F. Whiting, K.A. Crandall (2008). Many species in one: DNA barcoding overestimates the number of species when nuclear mitochondrial pseudogenes are coamplified PNAS, 105 (36), 13486-91 DOI: 10.1073/pnas.0803076105

Species hiding in plain sight

It’s a truism that biologists have cataloged only a fraction of the living things on Earth. This is a major problem for conservationists, ecologists, and evolutionary biologists, because many of the questions we want to answer (“Which parcel of rain forest should we preserve?” or “How do species interactions play out over millions of years?”) hinge on how we count species.

One solution is DNA barcoding, which uses the evolutionary divergence encoded in DNA sequences to tell species apart. Barcoding is supposed to help researchers identify species without being experts in the fiddly business of taxonomy based on physical traits. It can also differentiate species that might never be recognized as separate without a DNA analysis.

Caterpillar with braconid pupae
Photo by Anita Gould.

An open-access article in last week’s PNAS does exactly that for a group of wasps in the family Braconidae. Braconid wasps are parasitoids, laying their eggs in live hosts. Eventually the eggs hatch and the larvae eat their host alive, then emerge to form pupae like those on the Hog Sphinx Moth caterpillar in the photo. (Insert obligatory reference to Alien here.)

Parasitoid wasps are thought to be hugely diverse, in part because coevolutionary interactions between larvae and their hosts’ immune systems might force each wasp species to specialize on one or a few hosts. Smith and coauthors use barcoding based on nuclear and mitochondrial DNA to determine the diversity of braconid wasps within a Costa Rican conservation area, comparing the results to those produced from a traditional taxonomic survey. Traditional methods found 171 potential species – and barcoding turned up another 142! These additional species are basically identical to the eye to ones already counted; in many cases, what looked like a single species is actually a collection of similar species using different hosts.

So not only are there more wasp species than traditional methods would detect – they’re more specialized than we’d know without barcoding. DNA barcoding can make some biologists (including me) a little squeamish; it’s worrying to picture a world where no one really knows the organisms they study except through DNA sequence data. But Smith et al. are applying the method to find diversity that would probably not be detected in any other way, with results that bear directly on how we think about the interactions between parasitoids and their hosts. That’s unquestionably a good thing.


P.D.N. Hebert, A. Cywinska, S.L. Ball, J.R. deWaard (2003). Biological identifications through DNA barcodes Proc. Royal Society B, 270 (1512), 313-21 DOI: 10.1098/rspb.2002.2218

M.A. Smith, J.J. Rodriguez, J.B. Whitfield, A.R. Deans, D.H. Janzen, W. Hallwachs, P.D.N. Hebert (2008). Extreme diversity of tropical parasitoid wasps exposed by iterative integration of natural history, DNA barcoding, morphology, and collections PNAS, 105 (34), 12359-64 DOI: 10.1073/pnas.0805319105