Friday, 19 April 2024

The state and prospects of academic peer review

(A PDF version of this post is available on HAL.)

Peer review plays a crucial role in the communication and management of research. How well is it working? How could it be improved? The answers depend on whom you ask. Here they are debated by seven characters with different points of view.

1. Osman: Now that everyone is here, let us start this debate on academic peer review: should it be strengthened, overhauled, or abandoned? When I arrived, Clementine was already in the room, reviewing a paper and complaining about it. This leads to my first question: is peer review taking too much time and effort? 

2. Clementine: I was indeed complaining, but like most colleagues I realize that peer review is an essential part of our work: this is how we validate each other’s work, this is how we assess its interest and likely impact, and this is how we collectively improve how papers are written. Now, the problem is that so many papers get submitted nowadays that it becomes hard to do a proper job on all of them. Even after declining many invitations to review, I have to be pragmatic and focus on the main question: is the submitted paper worth publishing in that journal? If the paper is obviously good work by well-known people, it can be waved through with minimal scrutiny. If it is mediocre work, the rejection should be short and final – I do not want to get dragged into a technical debate with the authors. If, however, I am unsure whether the paper should be accepted after a quick first reading, then I need to scrutinize it, and to find a good reason for either rejecting it, or accepting it after some improvements.

3. Deirdre: So you spend more time reviewing second-rate works than really interesting papers? I am glad I stopped reviewing years ago. 

4. Clementine: That is your choice, but we would be better off if everyone were doing their fair share. Surely you expect that the papers you write get reviewed and published? 

5. Deirdre: I don’t. The arXiv is enough for me. I do not publish in journals. 

6. Ingrid: How I envy your freedom! As a tenured professor and well-known scholar, you can afford to do things that a PhD candidate like myself can only dream of. To get my degree and apply for employment in academia, I need publications in highly-ranked journals. Otherwise, how would I show off my work? How would I get a position? For that matter, how would you assess my application, if you were on a recruiting committee?

7. Deirdre: Recommendation letters. Studying your work. Talking with you, if possible. I do not need journals to know who the good young researchers in my field are.

8. Severin: That is indeed how things were done many decades ago. However, the number of researchers and articles has increased a lot. Researchers and administrators are asking for ever more sophisticated tools, which can generate and integrate lots of data. This includes bibliometric data on publications and citations, which ultimately come from journals. Accordingly, in the academic publishing industry, our core editorial activities are increasingly complemented by information analytics. Your radical stance raises another worry: you may be saving time, but aren’t you missing out on the valuable scientific discussions that are part of the publishing process? 

9. Deirdre: Such discussions have been occurring for as long as research has existed. When I debate with colleagues, privately or publicly, I do not need a journal to serve as a go-between.

10. Esteban: Apparently you are thinking of legacy journals, with their slow editorial process, and their lack of openness. However, online journals could do much better, and some of them do. We know how to build platforms for quick, open and high-quality debate — think of StackExchange — and this could be leveraged by journals. 

11. Deirdre: Nice idea, but why hasn’t it already been done? And why would journals need to be involved?

12. Esteban: The academic system is built on journals, so it should be reformed through journals. But instead of journals run by commercial publishers for profit, we need journals run by scientists for scientists. I have been busy creating and running such journals. 

13. Clementine: Congratulations on all the effort, but why would I publish in your journals? Is your peer review system better or more efficient?

14. Esteban: Yes it is, because we publish reviewers’ reports on submitted articles. In journals that don’t do that, the reviewers’ work is mostly wasted! Moreover, reports that are meant to be published tend to be more civil and more thorough. 

15. Clementine: More thorough reports? This does not look like a time-saving feature! 

16. Esteban: What saves time is to publish the reports even for articles that are rejected. This way, if the paper is submitted to another journal, peer review does not have to start again from scratch. Reports should be journal-independent, and transferable between journals: this is sometimes called peer review as a service.

17. Clementine: Journal-independent reports? How could this make sense, when my main job as a reviewer is to assess the suitability of the submitted paper for a given journal? I do not want to work on each paper as if it were one of those borderline cases where suitability is hard to assess!

18. Deirdre: Suitability for a journal is not a scientific question. It is of no interest to me. If you want to save time, just ignore all the bad or mediocre papers – that is, most of them.

19. Manuel: But how do we know whether a paper is good? This may be easy for an experienced researcher like you, but what about non-specialists? As a science journalist, I find academic journals useful when they draw my attention to new results that may be worth a story. Researchers write so many papers that I have to rely on journals to filter and sort them.

20. Deirdre: Do you claim that you ignore new results when they first appear as preprints, and patiently wait for their publication in journals months later? 

21. Manuel: Of course not, I have to report on preprints too, but always with a disclaimer that the results have not been peer-reviewed. 

22. Osman: Which means what exactly? That peer-reviewed results are presumed to be true? Is peer review meant to detect errors or even fraud? 

23. Manuel: This is not for me to say. Actually, errors, fraud and disputes make for good stories. Reporting on an implausible claim published in Nature, reporting on the subsequent debate, and reporting on the eventual retraction: that’s three stories about a single study.

24. Ingrid: From the reviewer reports I have received on the few papers I have submitted so far, I feel that these reviewers would have been unlikely to detect serious mistakes. They just did not spend enough time on them. Some of them only did superficial nitpicking, while others focussed on the list of citations.

25. Clementine: This does not sound unreasonable, given the good reputation of your senior coauthors. A knowledgeable reviewer would take that into account. Otherwise, there would be too much work! In fact, before peer review became systematic, editors would simply publish papers from well-known researchers without sending them to reviewers. In 1936, Albert Einstein was outraged when an editor sent his submitted paper to a reviewer.

26. Ingrid: So, in some cases, the publishing system works like a gentlemen’s club. This is hardly compatible with claims that peer review is fair and thorough. Moreover, I am concerned about bias against outsiders and minorities. Isn’t this why people invented double-blind peer review, which masks the authors’ identities from the reviewers?

27. Esteban: It is hard to hide authors’ identities from reviewers. To lessen bias and improve quality, the answer should rather be more openness. Reviews tend to be more constructive when they are made public, even more so when the reviewer’s name is made public as well. The more transparent the process, the more you can demonstrate that it is fair and thorough, instead of just claiming it.

28. Deirdre: Which is probably why most journals do not want this openness, as it would show how arbitrary and unfair the process generally is. 

29. Severin: You are simplifying a bit here. The situation varies a lot across disciplines. Researchers have various needs and preferences. Nowadays, publishers offer all kinds of options to the scientists who manage the journals, and experiment with various types of peer review. 

30. Deirdre: Great, but this is also what the free and open source platform OpenReview has been doing since 2013. Publishers are not needed. 

31. Osman: And what is the publishers’ role when it comes to errors or fraud? Do they have any responsibility in that matter? 

32. Severin: Publishers take research integrity very seriously, and pool their efforts via STM — their global trade association — and the Committee on Publication Ethics. Papers undergo thorough peer review pre-publication, and they are corrected or retracted whenever errors are demonstrated post-publication. This is how we guarantee the integrity of the scientific record. 

33. Esteban: That is the theory. In practice however, every time someone seriously tries to reproduce or otherwise check the results of a sample of published papers, a large fraction are found to be erroneous. Retractions are very rare: of the order of one paper out of a thousand. And journals that retract papers generally avoid accusing authors of any wrongdoing, so there is no deterrent effect. Neither publishers nor universities seem to take publication integrity seriously.

34. Ingrid: Surely you are thinking of predatory journals, of the type that used to appear in Beall’s list. Reputable, high-impact journals do better, right? 

35. Esteban: Not really. In fact, in a prestigious journal like Nature, editorial decisions can favour spectacular and newsworthy claims, from Benveniste’s memory of water to Dias’ high-temperature superconductivity. Such journals may be less reliable than others, not more. 

36. Osman: So erroneous papers get published far too often. But how could peer review be expected to prevent that? Based on the text of a submitted paper, how can a reviewer check experiments, observations, data analysis, or numerical simulations? 

37. Deirdre: They cannot. In fact, many mistakes are not found by peer reviewers, but by other researchers who try to use the faulty results in further work. Or by readers of the published paper, if they are more attentive or more competent than the reviewers. The biologist Elisabeth Bik has detected hundreds of cases of fraud in published papers, in particular image manipulations. 

38. Osman: And what happens when a mistake is found? 

39. Manuel: Sometimes the mistake is reported to the journal, but corrections and retractions are rare, and take a long time. For post-publication peer review, it is easier and quicker to use the platform PubPeer, which allows anonymous comments. Cases that generate debate and publicity are discussed in blogs such as Retraction Watch or For Better Science. From these stories, it looks like many cases of fraud or mistakes are basic enough to be easily detectable, without replicating experiments or even scrutinizing raw data.

40. Osman: In brief, peer review is limited in principle, and flawed in practice. Its flaws are not a big problem for research itself, which does not rely much on organized peer review. However, could this be a problem for the management of research, or any other aspects of the system that rely on peer review, if only indirectly? 

41. Clementine: When academics apply for jobs, promotions or funding, their past publications play a crucial role in the decisions, as they should. And the people who have to assess the applications often do not study the publications themselves, but rely on the journals they appeared in, and on their numbers of citations. These indicators can be misused: for example, the journal impact factor is often treated as a measure of the quality of individual papers, a practice criticized by the widely signed but less widely applied San Francisco Declaration on Research Assessment. This does not make the indicators less useful, especially for a first vetting of a large number of applicants. Then, when it comes to the shortlist, we can afford to study the details.

42. Severin: And publications are being used to rate not only individual researchers, but also collaborations, universities, journals, agencies, or even nations. The research system nowadays is vast and multi-layered: to manage it, we need quantitative metrics. These metrics cannot be based on money, since most research has no immediate monetary value. What peer review does is to create tokens of value, and the quantitative evaluation of research is based on these tokens. This can only be imperfect on a case-by-case basis, but it is nevertheless very useful in aggregate.

43. Esteban: Maybe, but look at all the side effects and perverse incentives: some academics buy authorships to get tenure, while others receive money for publishing papers in distinguished journals. Publications in specific journals are presented as deliverables in grant applications, and appear in Gantt charts. Some countries award money to researchers for publishing papers whose titles contain certain keywords. In Serbia, a large pay rise was awarded to the top 10% of researchers, as measured by their numbers of publications and citations. A significant fraction of the published literature consists of fake papers, produced by paper mills and not based on any research. These side effects show that publication has become a goal in itself, rather than just a channel of communication. The value of a publication in Severin’s sense is too weakly related to the scientific value of the underlying research. By Goodhart’s law, it is therefore doubtful that metrics based on publications are any good for managing research.

44. Osman: To put it less technically, since scientists have to publish or perish, scientific publishing is a high-stakes activity that involves power and money, and this leads to corruption. But let us not forget that ultimately, science is supposed to be valuable not only to scientists or administrators, but also to the society that supports it. According to Manuel, peer review influences which subjects are reported to the public. Is this influence beneficial? Can peer review tell us which studies are the most valuable to society? 

45. Clementine: Not really, as this is not its aim. Peer review is about scientific value, and is only one step in the long road to useful applications. It is important for the public to understand what peer review is and is not. A commonly used metaphor is to present peer review as a gold standard of scientific rigor, which validates research as reliable and scientifically relevant — no more, no less. 

46. Ingrid: I’m a bit disoriented here. When speaking to fellow scientists, I discuss fine-grained hierarchies of journals, from predatory journals that perform cursory or even fake peer review, to top-tier publications where peer review is much more rigorous and selective, with all possible nuances in between. When dealing with the public, isn’t the gold standard metaphor too much of a simplification?

47. Deirdre: Not just a simplification: it amounts to scientists telling the public to trust them blindly, without looking into the matter. And unfortunately, this is sometimes what the public does. The consequences can be serious, as in the case of the Lancet MMR autism fraud: a prestigious journal published a paper claiming that a vaccine causes autism, which contributed to a decline in vaccination rates.

48. Severin: Yes, but at least the paper was retracted, and the lead author was punished. The system worked as it should and corrected the error. 

49. Manuel: Only after twelve years and an investigation by a newspaper, by which time the harm was done. 

50. Osman: Is this case exceptional, or does it tell us something about the role of peer review in making science valuable to the public? 

51. Manuel: As this case illustrates, the prestige and authority of peer-reviewed papers can be used by journalists, politicians or propagandists for their own ends. The peer-reviewed literature is so vast that it is not hard to find papers that support, or seem to support, pretty much any claim. This makes it easy to say that science is on your side, whatever that side is. For example, I remember a debate on the radio where a climate skeptic insisted that his paper had been published, as if this were supposed to end the discussion of its veracity. In practice, quoting a peer-reviewed paper may not be enough to win an argument, but it can shift the burden of proof to the opposite side, even if the quoted paper makes implausible claims, or contradicts other peer-reviewed works.

52. Esteban: Another problem is that the criteria used in peer review have little to do with a scientific work’s value to the public. The criterion of novelty has been blamed for excluding replications of earlier studies, although such replications can strengthen or debunk previous claims. Criteria can also exclude negative or null results, i.e. findings that a hypothesis does not in fact hold. However, in public health for example, research that is useful may not be very new or exciting scientifically.

53. Severin: This is in fact widely recognized to be a problem for science itself, and not just for its value to the public. But it is being addressed: for example, there are now journals of negative results. 

54. Osman: Let us move on to the last question: what should we do with peer review? Strengthen it, overhaul it or ditch it? 

55. Severin: Journals and publishers have shown that they can adapt to a changing, complex world. They are responsive to researchers’ needs, and also offer new tools to make their work more efficient. Peer review plays a fundamental role in the system, as it is the basis for bibliometric data that are precious to scientists and administrators. We can also analyze data about peer review itself, and reviewers can get credit via mechanisms such as Publons. The system is evolving, and does not need radical or ideological reforms.

56. Esteban: But the costs are too high and rising, bibliometric data are easy to manipulate and ultimately do not mean much, and a few for-profit companies increasingly control how research is done, if not which research is done. 

57. Clementine: At my level at least, the main problem is the heavy workload of peer reviewing. It seems only fair that this work be remunerated, instead of being voluntary, as is mostly the case today. Depending on who ultimately pays, this could encourage journals to be more sparing with reviewers’ work, and authors to write fewer but better papers.

58. Esteban: As if there was not already more than enough money in the system! 

59. Ingrid: And what if paying reviewers eventually leads to a corresponding reduction in researchers’ salaries? Reviewing articles regularly might then become practically mandatory.

60. Esteban: The changes I would favour go in a rather different direction. Peer review should be open in all senses of the term: reports should be publicly accessible, even for rejected papers, anonymity of reviewers should be the exception rather than the rule, and participation in peer review should be open to anyone, not just to invited reviewers. Peer review should focus on scientific questions, rather than suitability to a journal, and should continue after the paper is published, if anything worth discussing comes up. Eventually, the notion of accepting or rejecting papers should be ditched, as was done by eLife. 

61. Ingrid: Isn’t this tantamount to ditching journals themselves? 

62. Esteban: Maybe in the long run. But we start with a system that is controlled by publishers. Their stranglehold on careers can only be loosened progressively. The main difficulty is not to imagine a better system, but to get there from the existing situation, step by step.

63. Deirdre: In the case of open access, gradual progress has meant dragging along all the costs and corruption of the old system. If you do likewise with peer review, progress will be slow, and will not lead to systemic changes: existing players will find other ways of controlling careers through bibliometric data.

64. Osman: Could you then describe the revolution that you are hoping for? 

65. Deirdre: Let us stop pretending that we are seriously reviewing all papers. Peer review should certainly be open, but it should also be completely optional: we should only work on the papers that are worth the effort. We should not try to overhaul journals by subtracting their harmful features, such as editorial decisions, confidentiality, or paywalls. Rather, we should start from preprint archives and add the needed features, starting with readers’ comments. 

66. Clementine: Good luck with convincing scientists to switch from their usual journals to what will surely be a cesspool of bias if not insults! 

67. Ingrid: The cesspool scenario can be avoided. In open platforms like Wikipedia or StackExchange, moderation by the community works rather well. 

68. Manuel: I would like to broaden the discussion beyond papers as we know them — written texts with a fixed list of authors and a version of record. Science also produces data, code, images, videos. There are ephemeral posts on social networks, and databases that are built and curated by many people over the long term. The notion of authorship does not account for the various types of contributions that people make to science: technical, intellectual, organizational, etc. 

69. Severin: Platforms and publishers are indeed busy accommodating all these media, and papers may include statements of authors’ contributions. But what does this have to do with the future of peer review? 

70. Manuel: As long as peer-reviewed papers remain the main drivers of careers, other media will be undervalued. For example, scientists who generate data are incentivized to analyze those data themselves and write papers, instead of just publishing the data and leaving the analysis to others. The system of research validation and evaluation would probably be quite different if it were not so focussed on papers and peer review.

71. Ingrid: But can we realistically hope for significant progress, as long as researchers need to apply for jobs and funding so frequently? This insecurity drives the demand for fine-grained, quantitative research assessments that can be automated. And bibliometrics answer this demand. It has become a leitmotif for Nobel laureates that their prize-winning research would not have been possible in such an environment. We should listen to them, and make funding and positions more stable. 

Acknowledgements

I am grateful to Anton Akhmerov, Matt von Hippel, Louise Ribault and Pierfrancesco Urbani for helpful suggestions.

Further reading

[1] J. Tennant, et al. (2017 review)
A multi-disciplinary perspective on emergent and future innovations in peer review


[2] A. Mastroianni (2022 blog)
The rise and fall of peer review


[3] T. Roulet, A. Spicer (2014)
Hate the peer-review process? Einstein did too


[4] D. Soergel, A. Saunders, A. McCallum (2013)
Open Scholarship and Peer Review: a Time for Experimentation


[5] J. Brainard, J. You (2018)
What a massive database of retracted papers reveals about science publishing’s “death penalty”


[6] R. Van Noorden (2023)
More than 10,000 research papers were retracted in 2023 – a new record


[7] A. Grey, A. Avenell, M. Bolland (2022 blog)
Who Cares About Publication Integrity?


[8] A. Grudniewicz, et al. (2019)
Predatory journals: no definition, no defence


[9] B. Brembs (2018) [doi:10.3389/fnhum.2018.00037]
Prestigious Science Journals Struggle to Reach Even Average Reliability


[10] H. Else, R. Van Noorden (2021)
The fight against fake-paper factories that churn out sham science


[11] L. Schneider (2021)
The Chinese Paper Mill Industry: Interview with Smut Clyde and Tiger BB8


[12] D. Aitkenhead (2013)
Peter Higgs: I wouldn’t be productive enough for today’s academic system