The Molecular Ecologist: Using R to model the spatial distributions of species

Environmental variation across the range of Joshua tree. Image via The Molecular Ecologist.

This week at The Molecular Ecologist, I’m showing how to use the popular open-source statistical programming language R to estimate species distribution models.

Species distribution models (SDMs) are handy any time you want to extrapolate where a species might be based on where you know it actually is. Maybe you’re trying to figure out where would be fruitful to do more sampling; maybe you want to know where your favorite critters probably lived back during the last ice age; maybe you want to know what regions will be suitable for your favorite critters after another century of global climate change.

Given how widely useful SDMs are, it’s very nice to be able to estimate them using multiple methods implemented within a single open-source framework. To get a taste of the capabilities provided by R and a select set of add-on packages, go read the whole thing.◼


First teaching experience: Final examination

The Reem-Kayden Center for Science, Bard College. Photo by jby.

Twenty-one days, 12 schooldays, 24 class periods, 54 hours of class time … and now the 2013 Citizen Science course at Bard College, my first attempt at teaching all on my own, is over. Actually, it’s been over for a couple days. I’ve flown back to Minneapolis, unpacked two suitcases full of laundry and books, spent a day at the office picking up the threads of work I left behind for a month, cleaned and restocked the kitchen, and posted photos from my weekend off in New York City.

Oh, and submitted the final grades.

But so now that it’s all over, how’d it go? Pretty well, on the overall. As much as Citizen Science is meant to be a crash course in scientific reasoning for Bard’s first-year students, it’s also a crash course in teaching for folks like me, who come to the job with experience as teaching assistants, but not in planning or executing a whole course. And judged solely on that level, Citizen Science is amazing.

Let me run through the numbers again: 12 four-and-a-half-hour days with the same 20 first-year students. I spent a fair bit of my Christmas holiday preparing lesson plans, and ended up reworking almost all of that planning in the last three days before class started. From there on, the average workday was something like:

  • 0700-0800h: Wake, shower, breakfast at cafeteria.
  • 0800-0900h: Last-minute lesson prep; classroom set-up, maybe some frantic final copy-making.
  • 0900-1130h: Morning class period. Ideally, no more than one hour of this is PowerPoint presentations and/or videos of TED talks.
  • 1130-1200h: Clean up, collect oneself, wait for the crush of students to move through the cafeteria.
  • 1200-1300h: Lunch at the cafeteria.
  • 1300-1500h: Afternoon class period. Only start this with a video if you want everyone to immediately fall asleep. Class debates are good in this time slot. Assign homework for the next day.
  • 1500-1600h: Clean up, collect oneself, adjust tomorrow’s plans based on what you covered today.
  • 1600-1730h: Exercise. (There’s a respectable campus gym, or nice trails if the weather’s not terrible.)
  • 1730-1900h: Dinner at the cafeteria.
  • 1900-whenever it’s done: Lesson planning and prep; printing and copying of handouts.
  • 2300h: Bedtime, one hopes.

With variations for a four-day rotation in the wet lab and another in the computer lab, plus a “civic engagement” day in which the first-year students go to a local public school to guest-teach science classes for half a day, that’s pretty much the shape of the course. It was exhausting. Boot camp for college teaching. Learning to swim by jumping into the middle of the Hudson River in January.

But that schedule leaves out a multitude of support. First and foremost, Citizen Science faculty have no other personal responsibility than the teaching. Meals are in the campus cafeteria, which provides just fine. Housing is on campus—yes, my dorm room was tiny and ill-equipped, but it was also right around the corner from my classrooms, the communal faculty workspace, the cafeteria, and the gym. So: no cooking, no commute.

Also, it must be said, the Bard student body is pretty great. There were the inevitable exceptions, but most of my class section were smart, friendly, and willing to at least try to tackle any topic I threw at them. Sometimes they were alarmingly informal, and I had to bend a little to accommodate the local concept of punctuality, but if a classroom full of unknown students is a cliff from which a rookie prof dives, these students were also the trampoline at the bottom.

But most importantly, Citizen Science teaching is collaborative. Intensely collaborative. From the moment I arrived on campus, most of my conversations with other faculty members were about lesson plans: what had worked last year, what spurred an amazing class discussion earlier today, what part of the lab procedure left every student confused and irritated. We all started with a six-inch-thick binder of readings, case studies, and worksheets, and then added our own ideas—and swapped, reworked, cut, and rejiggered each other’s ideas.

Running the campus trails. Photo by jby.

For me, the flagship example of this was the computer lab. The resource binder had some material on SIR models of disease spread in a population; I wanted to try and teach my students some of the programming language R. So why not build SIR simulations in R?

One faculty member had already developed a nifty interactive model of disease spread in a simulated social network, which included many of the basic concepts necessary to understand more general models, so I started the computer section with that. Next up was an intro-to-R worksheet I’d banged out over the holidays, which covered exactly the programming concepts necessary to code the model, and nothing more. A couple of other faculty members test-drove that worksheet in their own class sections, which had the computer lab earlier in the schedule than mine.

One night’s reading assignment was Anderson and May (1979) [PDF], a classic SIR paper; the next day we walked through the math in class. Then I gave my students a worksheet covering some of the graphing capabilities of R, which another of the R-using faculty had developed as a followup to my introduction worksheet. And finally, I walked them through the coding necessary to create a simple SIR recursion simulation, complete with a plot of population dynamics over time.
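For readers who want to see the shape of that final step, here is a minimal sketch of a discrete-time SIR recursion. It is written in Python rather than the R we used in class, and it is not the worksheet code: the function name `simulate_sir`, the parameter values, and the frequency-dependent form of the infection term are all assumptions chosen for illustration.

```python
def simulate_sir(beta, gamma, s0, i0, r0=0.0, steps=100):
    """Iterate a simple discrete-time SIR recursion.

    beta  -- transmission rate per time step (hypothetical value)
    gamma -- recovery rate per time step (hypothetical value)
    Returns three lists (S, I, R), each with steps + 1 entries.
    """
    n = s0 + i0 + r0  # total population size, constant in this model
    s, i, r = [float(s0)], [float(i0)], [float(r0)]
    for _ in range(steps):
        # Both flows are computed from the current state before any update.
        new_infections = beta * s[-1] * i[-1] / n
        new_recoveries = gamma * i[-1]
        s.append(s[-1] - new_infections)
        i.append(i[-1] + new_infections - new_recoveries)
        r.append(r[-1] + new_recoveries)
    return s, i, r


# Example run: 990 susceptible and 10 infected individuals.
s, i, r = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, steps=160)
peak_step = max(range(len(i)), key=i.__getitem__)
print(f"Infections peak at step {peak_step} with about {i[peak_step]:.0f} infected")
```

Changing `beta` or `gamma` and rerunning is the kind of parameter experiment the class tried, and handing the three lists to any plotting library gives the curves over time.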

The result wasn’t an unqualified success, by a long shot. Some students bogged down in the programming; many glazed over when I started writing equations on the whiteboard. Almost everyone seemed to like drawing graphs in R, though a lot of folks got frustrated by the technicalities of programming syntax even in that context. In the end, most students were able to at least follow me through coding the SIR model, but that was all we had time to do. Given another go-around, I’d provide more structure in the final stretch, with a worksheet that walks through the model coding and how to use the finished model to test specific hypotheses about epidemic dynamics. Also, I’d probably lead with the graph-making, which was more engaging than just pushing variables around on the command line.

But on the whole, I think it worked. My students coded SIR simulations in R, which actually responded to parameter changes the way they were supposed to, and generated pretty graphs in the process. Several students even told me, afterward, that they’ll use R for graphing in the future.

That outcome was really only possible because there were other faculty working on similar ideas, testing things out for me, sharing their own experience and materials. From what I hear, that’s a resource I can’t expect to have when I start teaching my own “real” courses as a full-fledged faculty member. And yet it’s the biggest reason why Citizen Science left me feeling like, actually, I might be able to pull off this whole professor-ing thing after all.◼


How I spent (the first weekend of) my summer not-quite-vacation

Phylogeny of the Prodoxidae, the family of the yucca moths, with a (very basic) reconstruction of pollinator life habits. Image by jby.

Late last night I made it back to Moscow from (mostly) sunny Santa Barbara, California, where I was lucky enough to attend a summer short course in phylogenetic comparative methods using R, sponsored by NESCent, hosted by NCEAS, and helmed by Luke Harmon and Mike Alfaro. I came into the seminar as a big fan of the programming language R already, and it was great to learn about a whole new range of tools available for the platform. It was even better to learn about those tools in a group of really smart colleagues, all of whom were thinking about how best to use R in their own projects. It was like a warm-up for the Evolution meetings, which start this Friday.

As at the meetings, one of the principal pleasures was learning about everyone else’s study organisms, the best example being the wrinkle-faced bat, which has the strangest trait I think I’ve ever seen in a mammal: a bald face, and a “mask” of furry skin it can pull over said face. Flickr has photos! I’ll put one below the fold in deference to the squeamish.

Eat your heart out, George Lucas. Photo by Evets Lembek.

The math-challenged scientist’s best friend

The NY Times has a neat piece about R, an open-source statistical programming language used by scientists worldwide. I’ve used it quite a bit myself, though I’ve hardly scratched the surface of its capabilities. The graphics package alone kicks Microsoft’s arse. Thanks to its price (free), its ease of use (spectacular), and a thriving developer community, R is apparently gaining ground on the commercial competition, the clunky, overpriced SAS.
