Unordered thoughts on the Pruitt situation

If you’re not one of those folks who pays close attention to social media and the lil’ blogosphere of ecology and evolution, it’s possible you haven’t heard about this, yet. But I imagine you will, soon enough. Before this ends up in the pages of Nature and Science and the New York Times, I have some thoughts I’d like to share (though not in any particular order), but first, I’ll give you the lowdown.

If you don’t know of Jonathan Pruitt: He’s one of the bigwig Canada 150 Research Chairs, with his lab at McMaster University, at least for the time being. In the course of his short career, he’s been highly productive in the area of the evolution of social behavior. I don’t know him personally (actually, I don’t think we’ve even met), though I know, quite well, several folks who have worked with him. I’ve had conversations with friends who have observed his meteoric rise*, fueled by a string of high profile papers with evidence for cool theories, and have felt some combination of admiration, imposter syndrome, and wonder at how he manages to do it all. Recent revelations speak to that wonder.

To make the long story short, a few of his papers have just been retracted for irregularities in the data that aren’t credibly explained by biological phenomena, and it looks like plenty more will be retracted in the coming weeks and months. For a more detailed recap, Dan Bolnick is on the case, which makes sense as he’s the EIC of American Naturalist, which is where the inquiry about data irregularities began, and where retractions occurred. This has been covered by Retraction Watch, and I’m sure we’ll see a lot more from them on this. Bolnick has operated his own blog for a good long while, and he’s got posts there about what some folks have taken to calling “PruittGate,” with Part 1 and Part 2, and I imagine more parts to follow as well. From where I’m watching it, it looks like Bolnick is performing due diligence as required, following appropriate procedures, and being as transparent as possible. Based on the data and publication forensics by Bolnick and others including Jeremy Fox, it appears that the datasets with unexplained anomalies are those which Pruitt collected, and so far, no anomalies have been detected in data which he did not collect. The Google spreadsheet that Bolnick set up to track the status of papers which Pruitt has authored is here (and it’s currently getting far more traffic than the eco/evo jobs board, which is quite something). More than a dozen papers have had anomalous patterns in the data identified, and a bunch of these have been retracted or are in the process of being retracted. Oh, and there are discussions on PubPeer too.

Several scientists who have worked and published with Pruitt have issued a short statement, published on Dr. Ambika Kamath’s blog, in which they explain how they’re being proactive and are working to set the scientific record straight. I can only imagine how difficult it must be to be in this position as a junior scientist, and it is heartening to see that our community is supporting the efforts to treat all parties fairly and make sure that those who are not responsible for any potential misconduct are not suffering adverse consequences.

While the retractions at American Naturalist have been in the works for a while, news spread yesterday, as a result of a highly detailed blog post by Dr. Kate Laskowski. She is a faculty member who worked with Pruitt when she was working on her PhD, and she walks us through, step by step, how she was notified about and explored the anomalies in the dataset that she used for her paper, using data provided to her by Pruitt. It’s not a short read, but if you’re inclined to form an opinion about this entire affair, then I think that post really needs to be part of how you form that opinion. Just read it for yourself. I really don’t want to be in the position of telling you what to think when you have what you need to make your own mind up, or perhaps to decide that you don’t know enough to make up your mind.

So, that’s my attempt at summarizing the current state of knowledge. Here are those thoughts I’d like to share:

We should take our time and be deliberative, because there are a lot of people wrapped up in this who don’t deserve adverse consequences. I think this is a good time to put faith in the editors to get this right. There’s going to be a lot of outrage, this will be covered in major media outlets, and people will be talking about this for a while. I’m writing this post not to amplify the hubbub, but rather, because it’s already making its own weather, to help make sure that our peers who haven’t done anything wrong don’t experience adverse consequences from all of the sound and fury.

Our concern for fair treatment and safety of people involved should extend to Dr. Pruitt. Journals and institutions have procedures for these circumstances, and we should let this play out. By all means, inform yourself, and listen to what involved parties have to say, and we don’t have to sugar-coat the affair. I think it’s okay to talk about it, because it really is a big deal that affects our entire community. And also let’s remember that we all live complex lives, we all have people who love us and we all have others who we care for. It harms the process of science if we make this unnecessarily personal or a matter of perceived retribution, and as we discuss the matter, the constructive approach for action is to modify our own practices for greater data transparency. Keep in mind that the process we are going through now is a part of science and peer review, and perhaps it’s even a bit reassuring that this is coming to light, as a form of validation that our system actually works once in a while. I imagine that Dr. Pruitt has secured legal counsel or is about to do so, and that seems to be the most reasonable course of action on his behalf, so I’m not going to fault him if we don’t hear anything more directly from him for a while. This will take a while to play out, and there is no real urgency in setting the record straight. This appears to go back years, and we’re not going to settle it in a mere matter of weeks.

For a while, I thought that this affair was only going to directly impact others, but after a time lag of about 24 hours, I realized that it’s affected even me, a little bit. It turns out that Pruitt and I were both middle authors on a paper that came out a few years ago, that was initially off my radar. I contacted the senior and corresponding author of that paper, and he let me know that Pruitt didn’t have a role in collecting, managing, or directly analyzing the data, and I updated the spreadsheet with this info. Then, I got an email from the EIC of Insectes Sociaux, for which I serve as AE. She reminded me that I handled a manuscript for which Pruitt was the senior author, and that we should look into this. Oh, my! This literally had not occurred to me. Now, we are on this matter, and I believe the journal is handling this well. (Since these publications are an overt matter of public record, I don’t think mentioning this is inappropriate, though I don’t think I should be saying anything more about it at this point, without consulting with the others involved.)

I think these revelations will have an impact on our scientific community in one particular way: increased data transparency and public data archival. All of this was precipitated by a third party who detected anomalies in data that were deposited in Dryad, and then resulted in contact with the editor and the first author. If Am Nat didn’t require data deposition in Dryad, would we know about any of this? I think there are good reasons, in some circumstances, for authors to keep their data to themselves for some embargo period, but when a paper gets published, I think it’s only reasonable given the way the internet works that the associated raw data should be publicly available. (I think when it comes to the protection of threatened organisms, IRB, community participants, and such, there are extenuating circumstances, but that generally doesn’t apply when you’re watching spiders behave in the lab.)

The article that I handled with Pruitt as a last author was for a journal that has not (yet) adopted a policy for public data archival. Which means that when this news came out, I didn’t have the associated data file at hand, and this complicates things a little bit. I imagine that this will shine a light on the need for all journals to expedite policies for public data archival.

Folks are keen on sharing code and various tools and approaches to detect nonbiological abnormalities in biological datasets. Which is fine. But let’s keep in mind that if we start distributing tools to detect abnormalities in data that suggest fabrication or manipulation, these tools presumably will be favored by those who are fabricating or manipulating data to ensure that they’re not detected. (This is just like how contract plagiarists use Turnitin to make sure that their own work doesn’t get dinged by Turnitin when their clients submit the work.) This isn’t an argument against such tools, I’m just pointing out that this isn’t going to fix the problem. This dynamic is an evolutionary arms race and there isn’t a technological fix to the situation; it’s addressed by a healthy professional culture.

I think this series of events is, to a great extent, the product of our messed up value system and our emphasis on publication metrics in the measure of a scientist. For example, people discussing this affair are quick to point out that the potential damage to junior scientists is that they’re losing papers from their CVs, which harms their job prospects. Of course, this doesn’t change their merit as a scientist by one iota – and if we are assessing one another fairly, then this shouldn’t really matter so much. One well-regarded observer noted that one of the impacted scientists has a newish position at a “top-notch institution.” This tacitly implies that the prestigiousness of a scientist’s institution has bearing in this matter. This is the kind of elitism that feeds into the value system that incentivizes misconduct. As long as we use the prestige of the institution where a person works, the number of papers they produce, and where they are published, as surrogate measures for the importance or value of a scientist as a peer in the research community, we’re doing it wrong. (Which is, after all, a central reason why I created this blog, to get this particular message out there.)

And I think that’s it. This was a relative stream-of-consciousness account of thoughts I thought might be of interest or use. I hope you have a pleasant weekend.

*What the heck is a meteoric rise? I mean, don’t we really only see meteors when they’re falling? How did this term come about? It makes as much sense as “head over heels.”

8 thoughts on “Unordered thoughts on the Pruitt situation”

Manu Saunders 6 years ago

Good thoughts. I’ve been watching this story with interest. I have no connection with any of the people involved, haven’t heard of most of them, haven’t read any of the papers, it’s not my field of research. But the story does affect all of us, from a science ethics perspective.

One thing I’ve been thinking while watching this story unfolding, is the important role of blogs and social media in this (eg Bolnick & Laskowski posts). They’ve provided a platform for the issue to be transparent, empowered the authors & editors to take control of their situation & reduce the risk of rumour spreading, & enabled them to find new massive support networks they might not have had access to if dealing with this in private. I don’t think we thought of this added benefit to ecology community blogs when writing our paper!
Robert Arlinghaus 6 years ago

I like this list a lot. I would also like to ask something obvious. If the data fabrication indeed was so dumb and almost obvious, and so widespread, why has this gone unnoticed by the co-authors, of which there were many? Sure, none of us expects that co-authors sending us data would fabricate anything. Yet, duplicating values, decimals in seconds, formulas in Excel sheets – data tables that were not that voluminous – all that could be detectable, even by chance, by the many co-authors involved. So, why did this not happen? I think this question needs to be asked as well. And I am not seeing it raised at all.
Pingback: Who can we trust? | Small Pond Science
Nick Brown 6 years ago

I’m not as concerned as you with the idea of a forensic tools arms race. I’m guessing that most of the people who base their career on faked data don’t spend a lot of time reading up on that (meta-)subject. If you want to fake results convincingly and undetectably, there’s a fairly simple way: run the experiment, get real data, and just switch the conditions for a few subjects that have the “wrong” results, until you get a “Goldilocks” effect size. I suspect that the competent fraudsters are already doing that, and chuckling to themselves at the amateurs who make up fake descriptives with mathematically impossible means or copy/paste rows of data within Excel sheets.
Stuart 6 years ago

I am now retired from science, and was involved in a totally different field… biomedical sci and cell signalling. I have, however, seen two instances of data fabrication close up at my last place of work. I’ve followed this one (and others) quite closely. What struck me is that for a person in this type of job to generate so much raw data from directly observing spiders must have taken hours. If this guy HAD generated so much data then people working with him must have been able to see that he spent hours a day, every day directly observing spiders. If he did not spend his time doing this….then people directly working with him MUST have known that this data was appearing from thin air.

This has parallels with my own experience. We had an incredibly productive PhD student who was producing huge amounts of data. Most of his experiments involved using a particular piece of equipment to generate the primary data. At that time I was a reader in a UK university with an outstanding reputation for biomedical sci (a reader in the UK is a kind of senior associate prof). My students and post docs came to me in a delegation….and explained that the incredibly “productive” student had never been seen near the plate reader in months. They were very angry and told me point blank that they thought he was simply fabricating data.

I checked with our chief technician….using this machine extensively would involve the purchase of substantial amounts of consumables. Nothing had gone through the books.

A clearer cut case I couldn’t imagine. The bottom line is the lab head and university senior management were totally uninterested. When challenged the student said that those experiments had been done by a collaborator in Poland. This explanation was immediately accepted and NOBODY did anything to check…. for example, had data files been emailed to him from Poland?? All of our students were required to use university emails for all project related matters. A simple call to IT would have answered that question within minutes. People simply did not want to ask the question.

There were remarkable parallels with a retraction from a research centre in the Baltic. People will find it via Google…. as I recall sticklebacks would devour plastic nano particles. Once this had happened, they then became more vulnerable to predation by perch. The only problem was that this was a huge study that would have involved setting up dozens of experimental aquariums, multiple trips to catch fish etc etc. Nobody working in the institute had any memory of any of this happening and, by and large, the institution did not want to see a high profile paper retracted.

I’m afraid I came to the conclusion that a significant fraction of the data behind many papers in many fields is simply made up. Multiple examples on Retraction Watch.

I take my hat off to the young scientists who have actively publicised this fiasco!!

I took early retirement from science, despite becoming a full professor very late in my career. My decision to do this was motivated, in part, by growing disillusionment caused by events such as this. Nevertheless….I did enjoy working as a scientist and totally believe in the scientific world view. For what it’s worth…..I am now a music student!!!
Stuart Wilson 6 years ago

https://www.sciencealert.com/a-widely-reported-study-on-the-effects-of-microplastics-in-fish-is-about-to-be-retracted

Perch fry not sticklebacks….
Pingback: Recommended reads #168 | Small Pond Science
Pingback: Reproducibility of high resolution reconstruction – one year on | Musings on Quantitative Palaeoecology

Comments are closed.
