Abstract
This paper presents the construction and evaluation of Japanese and English controlled bilingual terminologies that are particularly intended for controlled authoring and machine translation with special reference to the Japanese municipal domain. Our terminologies are constructed by extracting terms from municipal website texts, and the term variations are controlled by defining preferred and proscribed terms for both the source Japanese and the target English. To assess the coverage of the terms/concepts in the municipal domain and validate the quality of the control, we employ a quantitative extrapolation method that estimates the potential vocabulary size. Using Large-Number-of-Rare-Event (LNRE) modelling, we compare two parameters: (1) uncontrolled and controlled and (2) Japanese and English. The results show that our terminologies currently cover about 45–65% of the terms and 50–65% of the concepts in the municipal domain, and are well controlled. The detailed analysis of growth patterns of terminologies also provides insight into the extent to which we can enlarge the terminologies within the realistic range.
- Anthology ID:
- W16-4710
- Volume:
- Proceedings of the 5th International Workshop on Computational Terminology (Computerm2016)
- Month:
- December
- Year:
- 2016
- Address:
- Osaka, Japan
- Venue:
- CompuTerm
- Publisher:
- The COLING 2016 Organizing Committee
- Pages:
- 83–93
- URL:
- https://aclanthology.org/W16-4710
- Cite (ACL):
- Rei Miyata and Kyo Kageura. 2016. Constructing and Evaluating Controlled Bilingual Terminologies. In Proceedings of the 5th International Workshop on Computational Terminology (Computerm2016), pages 83–93, Osaka, Japan. The COLING 2016 Organizing Committee.
- Cite (Informal):
- Constructing and Evaluating Controlled Bilingual Terminologies (Miyata & Kageura, CompuTerm 2016)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/W16-4710.pdf