Abstract
While idioms are usually very rigid in their expression, they sometimes allow a certain level of freedom in their usage, with modifiers or complements splitting them or being syntactically attached to internal nodes rather than to the root (e.g., “take something with a big grain of salt”). This means that they cannot always be handled as ready-made strings in rule-based natural language generation systems. Having access to the internal syntactic structure of an idiom allows for more subtle processing. We propose a way to enumerate all possible language-independent n-node trees and to map particular idioms of a language onto these generic syntactic patterns. Using this method, we integrate the idioms from the LN-fr into GenDR, a multilingual realizer. Our implementation covers nearly 98% of LN-fr’s idioms with high precision, and can easily be extended or ported to other languages.
- Anthology ID:
- 2022.mwe-1.17
- Volume:
- Proceedings of the 18th Workshop on Multiword Expressions @LREC2022
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Editors:
- Archna Bhatia, Paul Cook, Shiva Taslimipoor, Marcos Garcia, Carlos Ramisch
- Venue:
- MWE
- SIG:
- SIGLEX
- Publisher:
- European Language Resources Association
- Pages:
- 118–126
- URL:
- https://aclanthology.org/2022.mwe-1.17
- Cite (ACL):
- Michaelle Dubé and François Lareau. 2022. Handling Idioms in Symbolic Multilingual Natural Language Generation. In Proceedings of the 18th Workshop on Multiword Expressions @LREC2022, pages 118–126, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Handling Idioms in Symbolic Multilingual Natural Language Generation (Dubé & Lareau, MWE 2022)
- PDF:
- https://preview.aclanthology.org/emnlp-22-attachments/2022.mwe-1.17.pdf