Summer is sometimes a contemplative time for me. It used to be that long hours in the field would give me time to think, but now it is just as often while I’m weeding my garden or doing some other summer activity. Lately I’ve been thinking a lot about negative results.
I have three projects that I am struggling with in various ways because of their ‘negative result’ nature.
The one that has lingered the longest was once a part of my PhD dissertation. I was looking at phenotypic selection across space and time to assess whether selection was more variable in either. It matters when thinking about evolution in general because selection that varies across space is more likely to lead to local adaptation, while variation across time is more likely to maintain trait variation in a population. It also matters on the small scale of what it means when individual researchers measure selection. How representative is that measurement? Since few studies have done these measurements across both space and time, I thought my efforts would be a great contribution. And with one of my committee members we also summarized the literature that had measured selection either in multiple populations or years. Long story short—there was lots of variation, but neither in my dataset nor in the literature was it more variable across space or time. An essentially negative result, although a real one. I’ve since added more data and resolved that this is the last time I tackle writing it up. It is going to get published this time I swear! But it is surprisingly difficult to write about something with no difference, even if that lack of difference shows something important.
Next we have a paper with my former PhD student who was working further on understanding floral traits in our system of Penstemon digitalis. We were asking whether a floral scent that is emitted from nectar was an honest signal of nectar to pollinators. The long story short there is that it’s complicated. If we had done fewer experiments there probably would have been a beautiful high-ranking journal article to come out of this work because some of the pieces fit together so nicely. Then we had to go and do more and that beautiful story becomes “essentially a negative result”*. We’re at the second journal with the study and we’ll see what comes back from these next reviews but it is tough. I believe that the story we’re telling is honest (haha) and looks at honesty more deeply than it is often done but the answer we get back is complicated. Those complications make it tough to write about and show that we need a lot more work to figure out everything that is going on in that system.
Finally, we’re working on plant-pollinator interactions in Swedish towns and some surprising results have come out there as well. My first master’s student basically found no real differences in pollen limitation or fitness across an urbanization gradient. This summer there is another student taking up the question and expanding the studies to a few more towns and repeating the survey done before. I’m not as convinced of this negative result as I am of the others. It might have to do more with the sampling effort but if we find the same thing this year I’ll have to concede that it is real. And then comes the difficulty of writing up another set of essentially negative results.
Nature is complicated. Rarely do things fit exactly into our nicely thought out theories or hypotheses. We should expect that our systems don’t always line up with the theory. When the trend is clearly in the opposite direction, eventually that makes its way into refining the theoretical framework of the field (we hope). But what about when we expect something to have an effect and it doesn’t? The first thing to do of course is to make sure that our design is robust and the lack of effect isn’t due to some failing of our studies. But what if that checks out?
I suspect there is a lot of important information out there buried in files of experiments/studies where no difference was detected. I am hardly the first to make this observation. Long ago when I began my career, I had lofty ambitions to never let data languish. But now I can see why these kinds of datasets do. My own have. It takes a lot more to write a convincing paper about a lack of difference than a difference. Storytelling is easier and neater when the facts fit the theory. It isn’t just reviewers and readers that have a more difficult time with negative results but I’m finding as a writer it is also much more difficult to process and, well, just get on with it and write these stories.
I don’t have any answers here but it seems to me that we should make an effort to tell the less clear tales of our data adventures. My big push to finally get the variation in selection paper out will be my attempt at this. I’m finding more blocks to my writing than usual but I think (hope) I will eventually manage. Hopefully the editors and reviewers will appreciate the difficult tales I am working on. Because it is really tempting to set them aside for the simpler stories I have waiting to get out as well.
As for those little wee datasets and observations that aren’t enough for a paper, I’m going to try to follow in some other bloggers’ footsteps and highlight them in future blog posts so they see some light of day.
*It bothered me when a coauthor referred to our results as such, but then I realised they were right. It started my internal struggle with thinking about the real issue I have with these papers. Negative sounds, well, negative as in bad. I think it is hard not to have that connotation in your mind and nobody wants to think about their data as bad. While I do have data that is bad, as in not reliable for various reasons and therefore data I’m comfortable throwing out (although that is frustrating and never easy), when the data are ‘good’ but negative it is different. These are the data that I’m finding difficult to write about and that I’m guessing sit around in many a scientist’s ‘file cabinet’.
4 thoughts on “The writing curse of negative data”
Great article! This really seems like one of the most entrenched issues in science (at least among the more-or-less wholly “academic” issues). I am particularly interested in one aspect represented by two excerpts in the piece:
“It might have to do more with the sampling effort but if we find the same thing this year I’ll have to concede that it is real.”
“The first thing to do of course is to make sure that our design is robust and the lack of effect isn’t due to some failing of our studies. But what if that checks out?”
What I think is interesting is that negative results – even among what I’ll call “negative results advocates” such as yourself (and myself) – are consistently viewed with greater suspicion of robustness than “positive” results, and this makes no sense to me. Do negative results warrant a second look, greater sampling effort, or more robust design any more or less than positive results do? It is not just that robust negative results have trouble finding a home in good journals, it’s that any negative result typically has to pass a higher bar to be accepted by the community than positive results do.
But, perhaps there is something statistical I don’t understand yet about why negative results must meet a higher standard than positive results in order to be “believed”!
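There actually is a statistical asymmetry worth naming here, and a minimal sketch (my own illustration, not from the post or the comments) makes it concrete: a significant result directly controls its false-positive rate via alpha, but a null result is only informative when the study had high power to detect the effect in the first place. The normal-approximation power calculation below (effect size d is Cohen's d, n is the per-group sample size) shows why a "no difference" finding from a small study carries little evidential weight:

```python
import math

def normal_cdf(x):
    # Standard normal CDF, computed from the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power_two_sample(d, n, alpha_z=1.96):
    """Approximate power of a two-sided, two-sample z-test
    for standardized effect size d with n samples per group.
    (Normal approximation; a t-test gives slightly lower power.)"""
    noncentrality = d * math.sqrt(n / 2.0)
    return 1.0 - normal_cdf(alpha_z - noncentrality)

# A 'medium' effect (d = 0.5) with 20 plants per group:
print(round(power_two_sample(0.5, 20), 2))   # roughly one chance in three of detecting it

# The same effect with 100 per group:
print(round(power_two_sample(0.5, 100), 2))  # now a null result is genuinely informative
```

So the "higher bar" is not pure bias: a low-powered study usually fails to reject the null even when the effect is real, which is exactly why sampling effort and design checks come up around negative results — and why the pre-specified power requirement in Registered Reports (discussed below) addresses the problem at the design stage.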
So, maybe this internal (and external) bias is one reason we have trouble writing papers about “negative data” – an unreasonable, greater suspicion towards negative versus positive results.
This is a vexing issue! At the start of those projects, there was real promise that “something real” (i.e. significant) would be there, and when there isn’t (which happens ~most of the time) there is incredible pressure to find something, anything, that is significant, and to treat those findings as a-priori hypothesis tests. That process is time consuming, wasteful, and ultimately detrimental. However, there IS a solution! Registered Reports assess the theoretical importance of the issue prior to seeing the results. If the effect “should” be there, but isn’t, why is that? Answering that in a way that is faithful to the tools of statistical inference can only be done before realizing the results.
By submitting research plans and detailed analysis proposals to a journal before results are known, important improvements can be made to the design when there is a chance that those improvements can actually improve the science. Studies that are of theoretical interest AND that present plans that are able to interpret results regardless of outcome (such as being sufficiently powered and that include some sort of positive control or other quality control measures that ensure that the methodology was competently conducted) are given an “in principle acceptance” which is a promise to publish regardless of outcome. That IPA is contingent on completing the study as promised and on passing those quality checks.
Read more about RRs here: https://cos.io/rr/ to see the current list of 62 journals accepting them, or consider asking any other journal to accept a Registered Report https://osf.io/3wct2/wiki/