Abstract
With the help of online tools, unscrupulous authors can today generate a pseudo-scientific article and attempt to publish it. Some of these tools work by replacing or paraphrasing existing texts to produce new content, but they have a tendency to generate nonsensical expressions. A recent study introduced the concept of “tortured phrase”, an unexpected odd phrase that appears instead of the fixed expression, e.g. “counterfeit consciousness” instead of “artificial intelligence”. The present study aims at investigating how tortured phrases that are not yet listed can be detected automatically. We conducted several experiments, including non-neural binary classification, neural binary classification and cosine similarity comparison of the phrase tokens, yielding noticeable results.
- Anthology ID:
- 2022.sdp-1.4
- Volume:
- Proceedings of the Third Workshop on Scholarly Document Processing
- Month:
- October
- Year:
- 2022
- Address:
- Gyeongju, Republic of Korea
- Venue:
- sdp
- Publisher:
- Association for Computational Linguistics
- Pages:
- 32–36
- URL:
- https://aclanthology.org/2022.sdp-1.4
- Cite (ACL):
- Puthineath Lay, Martin Lentschat, and Cyril Labbé. 2022. Investigating the detection of Tortured Phrases in Scientific Literature. In Proceedings of the Third Workshop on Scholarly Document Processing, pages 32–36, Gyeongju, Republic of Korea. Association for Computational Linguistics.
- Cite (Informal):
- Investigating the detection of Tortured Phrases in Scientific Literature (Lay et al., sdp 2022)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2022.sdp-1.4.pdf