Choosing the correct paradigm for unknown words in rule-based machine translation systems
V. M. Sánchez-Cartagena, M. Esplà-Gomis, F. Sánchez-Martínez, J. A. Pérez-Ortiz
Abstract
Previous work on an interactive system aimed at helping non-expert users enlarge the monolingual dictionaries of rule-based machine translation (MT) systems worked by discarding those inflection paradigms that cannot generate a set of inflected word forms validated by the user. This method, however, cannot deal with the common case where several different paradigms generate exactly the same set of inflected word forms, although with different inflection information attached. In this paper, we propose the use of an n-gram-based model of lexical categories and inflection information to select a single paradigm in cases where more than one paradigm generates the same set of word forms. Results obtained with a Spanish monolingual dictionary show that the correct paradigm is chosen for around 75% of the unknown words. The resulting system, available under an open-source license, is therefore a valuable aid for enlarging the monolingual dictionaries used in MT when the work is done by non-expert users without technical linguistic knowledge.
- Anthology ID:
- 2012.freeopmt-1.4
- Volume:
- Proceedings of the Third International Workshop on Free/Open-Source Rule-Based Machine Translation
- Month:
- June 13-15
- Year:
- 2012
- Address:
- Gothenburg, Sweden
- Editors:
- Cristina España-Bonet, Aarne Ranta
- Venue:
- FreeOpMT
- Pages:
- 27–40
- URL:
- https://aclanthology.org/2012.freeopmt-1.4
- Cite (ACL):
- V. M. Sánchez-Cartagena, M. Esplà-Gomis, F. Sánchez-Martínez, and J. A. Pérez-Ortiz. 2012. Choosing the correct paradigm for unknown words in rule-based machine translation systems. In Proceedings of the Third International Workshop on Free/Open-Source Rule-Based Machine Translation, pages 27–40, Gothenburg, Sweden.
- Cite (Informal):
- Choosing the correct paradigm for unknown words in rule-based machine translation systems (Sánchez-Cartagena et al., FreeOpMT 2012)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/2012.freeopmt-1.4.pdf