Applying a Dynamic Bayesian Network Framework to Transliteration Identification

Peter Nabende


Abstract
Transliteration identification aims to enrich multilingual lexicons and improve performance in Natural Language Processing (NLP) applications such as Cross-Language Information Retrieval (CLIR) and Machine Translation (MT). This paper describes work on applying Dynamic Bayesian Networks (DBNs), a widely used graphical modelling approach, to transliteration identification. Estimating transliteration similarity is not very different from other identification tasks where DBNs have been applied successfully, so DBN models from those domains can be adapted to transliteration identification. In particular, we investigate the applicability of a DBN framework initially proposed by Filali and Bilmes (2005) to learn edit distance estimation parameters for use in pronunciation classification. The DBN framework enables the specification of a variety of models representing different factors that can affect string similarity estimation. Three DBN models associated with two of the DBN classes originally specified by Filali and Bilmes (2005) have been tested in an experimental setup for Russian-English transliteration identification. Two of the DBN models achieve high transliteration identification accuracy, and combining the models improves accuracy further.
Anthology ID:
L10-1622
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/906_Paper.pdf
Cite (ACL):
Peter Nabende. 2010. Applying a Dynamic Bayesian Network Framework to Transliteration Identification. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Applying a Dynamic Bayesian Network Framework to Transliteration Identification (Nabende, LREC 2010)
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/906_Paper.pdf