Tracy Lin

2005

Learning Source-Target Surface Patterns for Web-based Terminology Translation
Jian-Cheng Wu | Tracy Lin | Jason S. Chang
Proceedings of the ACL Interactive Poster and Demonstration Sessions

2004


Extraction of name and transliteration in monolingual and parallel corpora
Tracy Lin | Jian-Cheng Wu | Jason S. Chang
Proceedings of the 6th Conference of the Association for Machine Translation in the Americas: Technical Papers

Named entities in free text pose a challenge to text analysis in Machine Translation and Cross-Language Information Retrieval. These phrases are often transliterated into another language with a different sound inventory and writing system, and they are often not listed in bilingual dictionaries. Although it is possible to identify and translate named entities on the fly without a list of proper names and transliterations, an extensive list of existing transliterations will certainly ensure a high precision rate. We use a seed list of proper names and transliterations to train a Machine Transliteration Model. With this model, it is possible to extract proper names and their transliterations from monolingual or parallel corpora with high precision and recall rates.