Extraction of Cross Language Term Correspondences

Hans Hjelm


Abstract
This paper describes a method for extracting translations of terms across languages using parallel corpora. The extracted term correspondences are useful for query expansion in cross-language information retrieval and for bilingual lexicon extraction. The method uses the mutual information measure and allows mapping from single-word to multi-word terms and vice versa. It is scalable (it accommodates the addition or removal of data) and produces high-quality results while keeping computational costs low enough to allow on-the-fly translation in, for example, cross-language information retrieval systems. The work was carried out in collaboration with Intrafind Software AG (Munich, Germany).
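As a rough illustration of how a mutual information measure can rank candidate translations in a parallel corpus, the sketch below computes pointwise mutual information (PMI) between source and target words from sentence-level co-occurrence counts. It is not the paper's implementation: the function name `pmi_table`, the toy German-English sentence pairs, and the restriction to single-word terms are all assumptions made for this example, and the abstract does not say which variant of mutual information the method uses.

```python
import math
from collections import Counter
from itertools import product


def pmi_table(aligned_pairs):
    """Return PMI scores for (source_word, target_word) pairs.

    aligned_pairs: list of (source_sentence, target_sentence) strings,
    assumed to be sentence-aligned and whitespace-tokenised (hypothetical
    input format chosen for this sketch).
    """
    n = len(aligned_pairs)
    src_count, tgt_count, joint_count = Counter(), Counter(), Counter()
    for src_sentence, tgt_sentence in aligned_pairs:
        # Count each word at most once per sentence pair.
        src_words = set(src_sentence.split())
        tgt_words = set(tgt_sentence.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        joint_count.update(product(src_words, tgt_words))
    # PMI(s, t) = log2( p(s, t) / (p(s) * p(t)) ), with probabilities
    # estimated as relative frequencies over the n sentence pairs.
    return {
        (s, t): math.log2(c * n / (src_count[s] * tgt_count[t]))
        for (s, t), c in joint_count.items()
    }


# Toy parallel corpus (invented for illustration only).
pairs = [
    ("das haus", "the house"),
    ("das auto", "the car"),
    ("ein haus", "a house"),
    ("ein auto", "a car"),
]
scores = pmi_table(pairs)

# Pick the highest-scoring target word for a given source word.
candidates = {t: v for (s, t), v in scores.items() if s == "haus"}
best = max(candidates, key=candidates.get)
```

Because the counts are simple sums, adding or removing sentence pairs only requires updating the counters, which is one plausible reading of the scalability claim above. Handling multi-word terms would need candidate phrases to be counted alongside single words, which this sketch leaves out.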
Anthology ID:
L06-1209
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Editors:
Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/356_pdf.pdf
Cite (ACL):
Hans Hjelm. 2006. Extraction of Cross Language Term Correspondences. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
Extraction of Cross Language Term Correspondences (Hjelm, LREC 2006)