Abstract
The tested detection method uses models trained to differentiate machine-generated text in order to distinguish regular sequences from hallucinated ones. The hypothesis under investigation is that the patterns learned in pretraining transfer to this task. The rationale is as follows: the model's training data is human-written text, so deviations from the training set could be detected in this manner. A second method, added after the competition as a further exploration of the dataset, uses the loss of the generation as determined by a pretrained LLM.
- Anthology ID:
- 2024.semeval-1.169
- Volume:
- Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1160–1165
- URL:
- https://aclanthology.org/2024.semeval-1.169
- DOI:
- 10.18653/v1/2024.semeval-1.169
- Cite (ACL):
- Octavian Brodoceanu. 2024. OctavianB at SemEval-2024 Task 6: An exploration of humanlike qualities of hallucinated LLM texts. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1160–1165, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- OctavianB at SemEval-2024 Task 6: An exploration of humanlike qualities of hallucinated LLM texts (Brodoceanu, SemEval 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.semeval-1.169.pdf