OctavianB at SemEval-2024 Task 6: An exploration of humanlike qualities of hallucinated LLM texts

Octavian Brodoceanu


Abstract
The detection method tested uses models trained to differentiate machine-generated text in order to distinguish between regular and hallucinated sequences. The hypothesis under investigation is that the patterns learned in pretraining transfer to this task. The rationale is as follows: the training data of the model is human-written text, so deviations from the training set could be detected in this manner. A second method, added after the competition as a further exploration of the dataset, uses the loss of the generation as determined by a pretrained LLM.
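The loss-based idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the per-token probabilities have already been obtained from some pretrained LLM, and the function names, the threshold value, and the choice to flag high-loss sequences (rather than low-loss ones) are all hypothetical.

```python
import math
from typing import Sequence


def mean_negative_log_likelihood(token_probs: Sequence[float]) -> float:
    """Average per-token loss (in nats) of a generated sequence.

    token_probs holds the probability a pretrained LLM assigned to each
    generated token. How these are obtained is outside this sketch.
    """
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def flag_hallucination(token_probs: Sequence[float], threshold: float = 2.0) -> bool:
    """Flag a sequence whose mean loss exceeds a threshold.

    Assumption: both the threshold and the direction of the comparison are
    illustrative and would have to be tuned on labelled validation data.
    """
    return mean_negative_log_likelihood(token_probs) > threshold
```

In practice the threshold would be chosen on a labelled validation split, and the comparison might need to be inverted if hallucinated sequences turn out to have lower loss.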
Anthology ID:
2024.semeval-1.169
Volume:
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
1160–1165
URL:
https://aclanthology.org/2024.semeval-1.169
Cite (ACL):
Octavian Brodoceanu. 2024. OctavianB at SemEval-2024 Task 6: An exploration of humanlike qualities of hallucinated LLM texts. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1160–1165, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
OctavianB at SemEval-2024 Task 6: An exploration of humanlike qualities of hallucinated LLM texts (Brodoceanu, SemEval 2024)
PDF:
https://preview.aclanthology.org/ingestion-checklist/2024.semeval-1.169.pdf
Supplementary material:
 2024.semeval-1.169.SupplementaryMaterial.tex
Supplementary material:
 2024.semeval-1.169.SupplementaryMaterial.txt