Christof Bless
2025
Analyzing the Evolution of Scientific Misconduct Based on the Language of Retracted Papers
Christof Bless | Andreas Waldis | Angelina Parfenova | Maria A. Rodriguez | Andreas Marfurt
Proceedings of the Fifth Workshop on Scholarly Document Processing (SDP 2025)
Amid rising numbers of organizations producing counterfeit scholarly articles, it is important to quantify the prevalence of scientific misconduct. We assess the feasibility of automated text-based methods to determine the rate of scientific misconduct by analyzing linguistic differences between retracted and non-retracted papers. We find that retracted works show distinct phrase patterns and higher word repetition. Motivated by this, we evaluate two misconduct detection methods, a mixture distribution approach and a Transformer-based one. The best models achieve high accuracy (>0.9 F1) on detection of paper mill articles and automatically generated content, making them viable tools for flagging papers for closer review. We apply the classifiers to more than 300,000 paper abstracts, to quantify misconduct over time and find that our estimation methods accurately reproduce trends observed in the real data.
2017
Exploring Properties of Intralingual and Interlingual Association Measures Visually
Johannes Graën | Christof Bless
Proceedings of the 21st Nordic Conference on Computational Linguistics