Abstract
We study whether novel ideas in biomedical literature appear first in preprints or in traditional journals. We develop a Bayesian method to estimate the time at which a phrase first appears in the literature, and apply it to a number of phrases, both automatically extracted and suggested by experts. We find that at present most phrases appear first in traditional journals, but a number of phrases first appear on preprint servers. A comparison of the general composition of texts from bioRxiv and traditional journals shows a growing trend of bioRxiv being predictive of traditional journals. We discuss the application of the method to related problems.

- Anthology ID:
- 2020.sdp-1.6
- Volume:
- Proceedings of the First Workshop on Scholarly Document Processing
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Venue:
- sdp
- Publisher:
- Association for Computational Linguistics
- Pages:
- 42–55
- URL:
- https://aclanthology.org/2020.sdp-1.6
- DOI:
- 10.18653/v1/2020.sdp-1.6
- Cite (ACL):
- Swarup Satish, Zonghai Yao, Andrew Drozdov, and Boris Veytsman. 2020. The impact of preprint servers in the formation of novel ideas. In Proceedings of the First Workshop on Scholarly Document Processing, pages 42–55, Online. Association for Computational Linguistics.
- Cite (Informal):
- The impact of preprint servers in the formation of novel ideas (Satish et al., sdp 2020)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/2020.sdp-1.6.pdf
- Code:
- seasonyao/biorxivimpact