I realize I may be stirring up a hornet’s nest here, but I wanted to ask for some advice or thoughts on what is making me a bit cynical about research as a whole. As a researcher, I care a lot about “science,” but I’m becoming more skeptical that I can trust the results published in any journal (including mainstream journals with strong historical reputations).
On the surface, the scientific method seems like (and in my view, is) a reliable and trustworthy way of choosing between different theories and models. I often imagine different hypotheses as different buildings, each supported by their own “floors” of assumptions and previous work. The scientific method works by “shaking” each building (hypothesis), exposing it to stress and probing for weak points. Buildings that fall apart are obviously not as strong, while those that remain standing after repeated trials are more trustworthy. As David Hume said, “A wise man proportions his belief to the evidence,” and similarly, scientific theories that have stood up to repeated attempts at falsification deserve more trust.
As great as this methodology is on paper, it is not always practiced, even in respected scientific circles and journals. Funding for replication is often difficult to obtain (and researchers have a stronger incentive to do “original” work than to reproduce someone else’s), so theories may not be subjected to repeated tests as often as they should be. Furthermore, even when replication is attempted, it often fails, calling into question the credibility of theories built on results that cannot or have not been replicated (the reproducibility crisis). In some fields, replicating an experiment, or even auditing one, requires access to additional data or code (often unpublished), expensive equipment, and materials that are not easily obtained.
Double-blind peer review, another great idea in theory, can also be a mess in practice. In 2021, NeurIPS ran an experiment in which 10% of submissions were independently reviewed by a second committee. Roughly half of the papers accepted by one committee were rejected by the other (and vice versa). Fortunately (or unfortunately), this replicates a 2014 NeurIPS experiment, which found about 57% disagreement. This implies that the double-blind review process deciding which papers get accepted into arguably the top machine learning conference is close to arbitrary (the conclusion of both studies).
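To get a feel for what ~50% disagreement means, here is a minimal simulation sketch (my own toy model, not the NeurIPS methodology): each paper has a latent quality, each committee observes that quality plus independent reviewer noise, and each accepts its top 25%. The noise level is an assumed parameter chosen purely for illustration.

```python
import random

# Toy model (assumptions, not the NeurIPS setup): latent paper quality,
# plus independent per-committee reviewer noise, top-25% acceptance.
random.seed(0)
N = 10_000          # simulated papers
ACCEPT_RATE = 0.25  # roughly conference-like acceptance rate (assumed)
NOISE = 1.0         # reviewer noise relative to quality spread (assumed)

quality = [random.gauss(0, 1) for _ in range(N)]
score_a = [q + random.gauss(0, NOISE) for q in quality]  # committee A
score_b = [q + random.gauss(0, NOISE) for q in quality]  # committee B

def accept(scores):
    """Accept the top ACCEPT_RATE fraction of papers by score."""
    cutoff = sorted(scores, reverse=True)[int(ACCEPT_RATE * len(scores))]
    return [s >= cutoff for s in scores]

acc_a, acc_b = accept(score_a), accept(score_b)
accepted_by_a = [i for i in range(N) if acc_a[i]]
disagree = sum(1 for i in accepted_by_a if not acc_b[i]) / len(accepted_by_a)

print(f"Rejected by B among A's accepts: {disagree:.0%}")
```

With reviewer noise roughly matching the spread in paper quality, this lands near 50% disagreement among accepted papers. For calibration: pure coin-flip reviewing would give about 75% (one minus the acceptance rate), and perfectly consistent reviewing would give 0%. The observed numbers sit uncomfortably closer to noise than to signal.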
If we agree that the scientific method and double-blind review are useful ways of discovering truth about the world, we should ask whether current scientific institutions actually uphold these principles in a meaningful way. Taken together, the two problems above shouldn’t necessarily decrease our trust in the scientific method itself, but they should perhaps make us second-guess any theory pushed forward as “science,” or published in a mainstream scientific journal, until these issues are resolved. If results are not independently verified, and reviewers cannot reliably distinguish “good” work from “bad,” the scientific community cannot claim its results adhere to the very standards it claims to uphold.