Shreya Maurya




2024

Detecting AI-Generated Text with Pre-Trained Models Using Linguistic Features
Annepaka Yadagiri | Lavanya Shree | Suraiya Parween | Anushka Raj | Shreya Maurya | Partha Pakray
Proceedings of the 21st International Conference on Natural Language Processing (ICON)

The advent of sophisticated large language models, such as ChatGPT and other AI-driven platforms, has led to the generation of text that closely mimics human writing, making it increasingly difficult to discern whether content is human-generated or AI-generated. This poses significant challenges for content verification, academic integrity, and the detection of misleading information. To address these issues, we developed a classification system to differentiate between human-written and AI-generated texts using the diverse HC3-English dataset. Our approach leveraged linguistic analysis and structural features, including part-of-speech tags, vocabulary size, word density, active and passive voice usage, and readability metrics such as Flesch Reading Ease, perplexity, and burstiness. We employed transformer-based and deep-learning models for the classification task, including CNN_BiLSTM, RNN, BERT, GPT-2, and RoBERTa. Among these, the RoBERTa model demonstrated superior performance, achieving an impressive accuracy of 99.73%. These outcomes demonstrate how cutting-edge deep learning methods can help maintain information integrity in the digital realm.
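To make the feature set concrete, here is a minimal sketch of how a few of the structural features named in the abstract (vocabulary size, word density, burstiness) might be computed. This is an illustrative stand-in, not the authors' code: the function name and the exact definitions are assumptions. In particular, burstiness is computed here with the common (sigma - mu) / (sigma + mu) proxy over sentence lengths, and word density is taken as words per sentence.

```python
import re
import statistics

def extract_features(text: str) -> dict:
    """Hypothetical extractor for a few surface features (illustrative only)."""
    # Naive sentence split on terminal punctuation; real pipelines use a tokenizer.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    sent_lens = [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]

    mu = statistics.mean(sent_lens)
    sigma = statistics.pstdev(sent_lens)  # population std. dev. of sentence lengths

    return {
        # Number of distinct word types, case-folded.
        "vocabulary_size": len({w.lower() for w in words}),
        # Words per sentence (one possible reading of "word density").
        "word_density": len(words) / max(len(sentences), 1),
        # Burstiness proxy in [-1, 1]: regular text -> negative, bursty -> positive.
        "burstiness": (sigma - mu) / (sigma + mu) if (sigma + mu) else 0.0,
    }
```

Features like these would be fed alongside the raw text (or its embeddings) into the classifiers listed above; perplexity and Flesch Reading Ease would require a language model and a syllable counter, respectively, and are omitted here.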