Biologists start sharing unpublished work—oh, the horror!

Most of you have probably heard of the arXiv by now. Basically, every paper I report on is probably on the arXiv in some form. The site hosts draft manuscripts in physics and astronomy. Many of them eventually appear in academic journals, although others will spend their entire existence on the arXiv.

Five years ago, I assumed that every field had its own equivalent of the arXiv. So imagine my surprise when I tried to upload my first chemistry paper to the chemistry equivalent. Apparently, there was one, but Elsevier eventually gained control of it. The publisher restricted access and required registration. The site is probably a wasteland by now.

Until recently, biology hadn't even managed to create anything that Elsevier would want to acquire. But now, finally, biologists are getting on the preprint train with bioRxiv.

This has been met with... mixed feelings. For reasons that are not clear to me, I follow a lot of people in the biomed and bio fields on Twitter, and I have been rather bemused by the reactions to having bioRxiv thrust upon them. From an outsider's perspective, it seems like an obvious good thing, but not everyone is responding as such.

I think the process by which the bioRxiv was introduced is causing the problems. In physics, there wasn’t any real pressure to use the arXiv. Initially it was a simple e-mail list. As it expanded, it became a repository. It was originally set up to serve the small group in the field of theoretical physics (mostly astronomy as far as I can tell). But paper sharing simply made sense—journals took a long time to publish, so researchers ended up e-mailing preprints to all their friends.

From this small beginning, the arXiv has grown to encompass all of physics. Now, just in the optics section, some 20-30 papers are added every day. Some papers don’t even make it any further. They accumulate reputation and importance while sitting on the arXiv “unpublished.” Frankly, I don’t have a problem with that anymore (I used to).

At this point, putting your draft on the arXiv first is supported by many journals, to the extent that submitting to any of the American Physical Society journals is a lot easier if you have already put your paper on the arXiv. Even better for some communities, arXiv is used as a pre-submission peer review. You put your paper on the arXiv and let your colleagues know. Anyone interested in the paper will read it and, if they feel the urge, comment on it.

This effectively short-cuts peer review. After a week or so, you submit the paper to a journal, and the editor sends it out for a more organized peer review process. For fields like cosmology or particle physics, your peers have probably already read it and have already let you know what they think; the journal just formalizes this. But the informal peer review may have an advantage, as it is more likely to be done by people who care about the paper's contents.

There's another way to look at the effect of this informal peer review. ArXiv papers start with no reputation at all. This is different from an article that gets published in Physical Review Letters. By virtue of the name of the journal, the work starts off with a better reputation than the same work published in, say, Optics Letters. This is true even if the content is exactly the same—the publisher makes the difference.

But if the manuscript is never published, then any prominence it attains is purely due to its content. Any paper that acquires a significant reputation on the arXiv has done it by virtue of being truly important (or, in some cases, truly infamous).

This threatens some scientists. Their reputation and identity are firmly stapled to the mast of certain journals. They have, over the years, acquired the skills and techniques required to ensure that a large fraction of their work appears in these journals. Unfortunately, these skills and techniques overlap with, but do not encompass, the skills required to do great science. Such a change in the playing field would remove one of their defining advantages as scientists.

For physics, this process has already started. Even though publication in journals is still dominant and necessary, the arXiv is changing that landscape. Biologists, however, do not have this tradition yet.

This doesn't mean biologists are especially touchy about reputation. Frankly, if the arXiv were introduced in its present form to the field of physics back when it was started, similar worries would've surfaced. Remember, arXiv started as a way to share papers before publication, and it was never intended to replace (or even streamline) publication. Many would argue that it still isn't intended to replace publication, but I think that ship is getting ready to raise the gangplank.

I suspect theft and priority are some additional fears for those worrying about these sites. I put my paper on the arXiv, then submit it to a journal. A rival submits the same work directly to a journal. Who has priority?

Given some of the behavior we've seen among scientists already, you know it will happen: someone with the ethical sense of a dead fish will start trawling the arXiv for yet-to-be-published work. They will change the names and affiliations of the authors and submit directly to a journal.

In physics, because the arXiv grew out of a small community, this was a minor fear. From there it grew to encompass new fields of physics, which required some significant changes. Now, the submitting author must at least be registered. And the arXiv can flag suspicious activity and require that those authors be endorsed by someone trusted by the arXiv. It also has an army of volunteers who do a cursory check of all submitted papers and can reject or reclassify papers. This combination reduces the chance of theft quite significantly.

Unfortunately, that culture and infrastructure don't exist at bioRxiv yet. This is also not a case of something growing out of a small, single field; it is being dropped on the world's largest research community. BioRxiv's culture will be determined over the next few years by its users and its abusers. The cynical amongst us believe the abusers will win out.

But I mostly think that the cynical and skeptical are wrong. Biology is not so different that bioRxiv is destined to fail. I know it is a scary new world for biologists and that they may get a few scratches. However, it is also ripe with opportunity. Stop worrying and start sharing.

