Scientific methods in the genomic age

Nature Methods has a good editorial considering the issues around defining what science is in the age of exploratory genomics [$-a].

As schoolchildren we are taught that the scientific method involves a question and suggested explanation (hypothesis) based on observation, followed by the careful design and execution of controlled experiments, and finally validation, refinement or rejection of this hypothesis. … Scientists’ defense of this methodology has often been vigorous, likely owing to the historic success of predictive hypothesis-driven mechanistic theories in physics, the dangers inherent in ‘fishing expeditions’ and the likelihood of false correlations based on data from improperly designed experiments.

Their conclusion is that hypothesis-driven science will absorb the current flood of genomic data as the basis for new hypotheses to direct future large-scale data collection:

But ‘omics’ data can provide information on the size and composition of biological entities and thus determine the boundaries of the problem at hand. Biologists can then proceed to investigate function using classical hypothesis-driven experiments. It is still unclear whether even this marriage of the two methods will deliver a complete understanding of biology, but it arguably has a better chance than either method on its own.

As I’ve said before, massive genomic datasets change science mainly through their quantity, not their quality. On the one hand, science has always involved undirected observation – Darwin didn’t have any strong hypotheses in mind when he hopped aboard the Beagle. Classical natural history is a discipline devoted to almost nothing but undirected data collection, and it’s been the grist for evolution and ecology research for as long as those fields have existed. On the other hand, it seems to me that genomic “fishing expeditions” are more hypothesis-driven than we realize, even if the only hypothesis is “Neanderthal genomes will be different from those of modern humans.”