Comprehensive Plagiarism Detection in Malayalam Texts Through Web and Database Integration
Meharuniza Nazeem, Parvathy Raj, Rajeev R. R, Anitha R, Navaneeth S
Abstract
Plagiarism detection techniques have become essential for recognizing instances of plagiarism, particularly in academia, where scientific papers and documents are of prime importance. We propose an application that offers a comprehensive solution for detecting plagiarism in scholarly articles written in Malayalam, enabling users to submit texts, analyze them for plagiarism, and review the results interactively. With the increasing accessibility of digital content, maintaining originality in academic writing has become more difficult. Our research addresses this challenge by providing a solution tailored to the Malayalam language. The application aids researchers and academic institutions in detecting potential plagiarism by combining access to web-based content with algorithmic text analysis. The study contributes to the field of plagiarism detection for low-resource languages such as Malayalam and offers a practical way to preserve the originality of Malayalam scholarly work. The performance of four algorithms (SequenceMatcher, N-Grams, Rabin-Karp, and Cosine Similarity) is evaluated. Cosine Similarity, with a 92.45% detection rate, outperformed the others, surpassing Rabin-Karp (65.3%), N-Grams (58.7%), and SequenceMatcher (51.4%). Building on this result, a user-friendly web application was developed that integrates web search and database comparison features with the Cosine Similarity algorithm.
- Anthology ID:
- 2024.icon-1.40
- Volume:
- Proceedings of the 21st International Conference on Natural Language Processing (ICON)
- Month:
- December
- Year:
- 2024
- Address:
- AU-KBC Research Centre, Chennai, India
- Editors:
- Sobha Lalitha Devi, Karunesh Arora
- Venue:
- ICON
- Publisher:
- NLP Association of India (NLPAI)
- Pages:
- 349–356
- URL:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.icon-1.40/
- Cite (ACL):
- Meharuniza Nazeem, Parvathy Raj, Rajeev R. R, Anitha R, and Navaneeth S. 2024. Comprehensive Plagiarism Detection in Malayalam Texts Through Web and Database Integration. In Proceedings of the 21st International Conference on Natural Language Processing (ICON), pages 349–356, AU-KBC Research Centre, Chennai, India. NLP Association of India (NLPAI).
- Cite (Informal):
- Comprehensive Plagiarism Detection in Malayalam Texts Through Web and Database Integration (Nazeem et al., ICON 2024)
- PDF:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.icon-1.40.pdf
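The abstract reports Cosine Similarity as the strongest of the four measures evaluated. As a rough illustration of what such a comparison computes, here is a minimal sketch. It assumes whitespace tokenization and raw term-frequency vectors, and the function name `cosine_similarity` is hypothetical. The abstract does not describe the paper's preprocessing, vectorization, or thresholds, so this should not be read as the authors' implementation.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between raw term-frequency vectors of two texts.

    Assumption: tokens are obtained by splitting on whitespace; the paper's
    actual tokenization and weighting are not given in the abstract.
    """
    freq_a = Counter(text_a.split())
    freq_b = Counter(text_b.split())

    # Dot product over tokens of the first text; Counter returns 0 for
    # tokens that are missing from the second text.
    dot = sum(count * freq_b[token] for token, count in freq_a.items())

    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))

    # An empty text has a zero vector; define its similarity as 0.0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

The score ranges from 0.0 (no shared tokens) to 1.0 (identical term-frequency profiles). Because it ignores word order, a submitted passage that reorders the sentences of a source still scores highly, which is one plausible reason a vector-based measure can catch cases that sequence-based matching misses.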