MASSIVE Multilingual Abstract Meaning Representation: A Dataset and Baselines for Hallucination Detection

Michael Regan, Shira Wein, George Baker, Emilio Monti


Abstract
Abstract Meaning Representation (AMR) is a semantic formalism that captures the core meaning of an utterance. There has been substantial work developing AMR corpora in English and more recently across languages, though existing datasets remain limited in size and the cost of collecting more annotations is prohibitive. With both engineering and scientific questions in mind, we introduce MASSIVE-AMR, a dataset with more than 84,000 text-to-graph annotations, currently the largest and most diverse of its kind: AMR graphs for 1,685 information-seeking utterances mapped to 50+ typologically diverse languages. We describe how we built our resource and its unique features before reporting on experiments using large language models for multilingual AMR and SPARQL parsing as well as applying AMRs for hallucination detection in the context of knowledge base question answering, with results shedding light on persistent issues using LLMs for structured parsing.
Anthology ID:
2024.starsem-1.1
Volume:
Proceedings of the 13th Joint Conference on Lexical and Computational Semantics (*SEM 2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Danushka Bollegala, Vered Shwartz
Venue:
*SEM
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
1–17
URL:
https://aclanthology.org/2024.starsem-1.1
Cite (ACL):
Michael Regan, Shira Wein, George Baker, and Emilio Monti. 2024. MASSIVE Multilingual Abstract Meaning Representation: A Dataset and Baselines for Hallucination Detection. In Proceedings of the 13th Joint Conference on Lexical and Computational Semantics (*SEM 2024), pages 1–17, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
MASSIVE Multilingual Abstract Meaning Representation: A Dataset and Baselines for Hallucination Detection (Regan et al., *SEM 2024)
PDF:
https://preview.aclanthology.org/ingestion-checklist/2024.starsem-1.1.pdf