Constructing and Evaluating Controlled Bilingual Terminologies

Rei Miyata, Kyo Kageura


Abstract
This paper presents the construction and evaluation of Japanese and English controlled bilingual terminologies intended for controlled authoring and machine translation, with special reference to the Japanese municipal domain. Our terminologies are constructed by extracting terms from municipal website texts, and term variations are controlled by defining preferred and proscribed terms for both the source Japanese and the target English. To assess the coverage of the terms/concepts in the municipal domain and validate the quality of the control, we employ a quantitative extrapolation method that estimates the potential vocabulary size. Using Large-Number-of-Rare-Events (LNRE) modelling, we compare the terminologies along two dimensions: (1) uncontrolled versus controlled and (2) Japanese versus English. The results show that our terminologies currently cover about 45–65% of the terms and 50–65% of the concepts in the municipal domain, and are well controlled. The detailed analysis of the growth patterns of the terminologies also provides insight into the extent to which we can enlarge them within a realistic range.
Anthology ID:
W16-4710
Volume:
Proceedings of the 5th International Workshop on Computational Terminology (Computerm2016)
Month:
December
Year:
2016
Address:
Osaka, Japan
Venue:
CompuTerm
Publisher:
The COLING 2016 Organizing Committee
Pages:
83–93
URL:
https://aclanthology.org/W16-4710
Cite (ACL):
Rei Miyata and Kyo Kageura. 2016. Constructing and Evaluating Controlled Bilingual Terminologies. In Proceedings of the 5th International Workshop on Computational Terminology (Computerm2016), pages 83–93, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Constructing and Evaluating Controlled Bilingual Terminologies (Miyata & Kageura, CompuTerm 2016)
PDF:
https://preview.aclanthology.org/ingestion-script-update/W16-4710.pdf