Beyond Black Box AI generated Plagiarism Detection: From Sentence to Document Level

Ali Quidwai; Chunhui Li; Parijat Dube

doi:10.18653/v1/2023.bea-1.58

Abstract

The increasing reliance on large language models (LLMs) in academic writing has led to a rise in plagiarism. Existing AI-generated text classifiers have limited accuracy and often produce false positives. We propose a novel approach using natural language processing (NLP) techniques, offering quantifiable metrics at both sentence and document levels for easier interpretation by human evaluators. Our method employs a multi-faceted approach, generating multiple paraphrased versions of a given question and inputting them into the LLM to generate answers. By using a contrastive loss function based on cosine similarity, we match generated sentences with those from the student’s response. Our approach achieves up to 94% accuracy in classifying human and AI text, providing a robust and adaptable solution for plagiarism detection in academic settings. This method improves with LLM advancements, reducing the need for new model training or reconfiguration, and offers a more transparent way of evaluating and detecting AI-generated text.
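The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the LLM answers to the paraphrased questions are already available as plain strings, it replaces learned sentence embeddings with simple bag-of-words counts, and it leaves out the contrastive loss entirely. All function names and the 0.8 threshold are hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def embed(sentence):
    # Stand-in for a sentence embedding: bag-of-words token counts (assumption).
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sentence_scores(student_text, generated_answers):
    # For each student sentence, keep its best similarity against every
    # sentence found in any of the LLM-generated answers.
    pool = [embed(s) for ans in generated_answers for s in split_sentences(ans)]
    scores = []
    for sentence in split_sentences(student_text):
        vec = embed(sentence)
        best = max((cosine(vec, p) for p in pool), default=0.0)
        scores.append((sentence, best))
    return scores


def document_score(scores, threshold=0.8):
    # Document-level metric: fraction of student sentences whose best match
    # reaches the (hypothetical) threshold.
    if not scores:
        return 0.0
    return sum(1 for _, sc in scores if sc >= threshold) / len(scores)


if __name__ == "__main__":
    answers = ["Photosynthesis converts light into chemical energy. "
               "Plants use chlorophyll."]
    student = "Photosynthesis converts light into chemical energy. I like my dog."
    per_sentence = sentence_scores(student, answers)
    for sentence, score in per_sentence:
        print(f"{score:.2f}  {sentence}")
    print("document score:", document_score(per_sentence))
```

The per-sentence scores give a human evaluator something concrete to inspect, and the document score aggregates them into a single figure, mirroring the sentence-level and document-level metrics the abstract mentions.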

Anthology ID: 2023.bea-1.58
Volume: Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Ekaterina Kochmar, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Nitin Madnani, Anaïs Tack, Victoria Yaneva, Zheng Yuan, Torsten Zesch
Venue: BEA
SIG: SIGEDU
Publisher: Association for Computational Linguistics
Pages: 727–735
URL: https://preview.aclanthology.org/jlcl-multiple-ingestion/2023.bea-1.58/
DOI: 10.18653/v1/2023.bea-1.58
Cite (ACL): Ali Quidwai, Chunhui Li, and Parijat Dube. 2023. Beyond Black Box AI generated Plagiarism Detection: From Sentence to Document Level. In Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023), pages 727–735, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Beyond Black Box AI generated Plagiarism Detection: From Sentence to Document Level (Quidwai et al., BEA 2023)
PDF: https://preview.aclanthology.org/jlcl-multiple-ingestion/2023.bea-1.58.pdf