Monica Gavrila


2012

Same domain different discourse style - A case study on Language Resources for data-driven Machine Translation
Monica Gavrila | Walther v. Hahn | Cristina Vertan
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Data-driven machine translation (MT) approaches have become very popular in recent years, especially for language pairs for which it is difficult to find specialists to develop transfer rules. Statistical (SMT) or example-based (EBMT) systems can provide reasonable translation quality for assimilation purposes, as long as a large amount of training data is available. SMT systems in particular rely on parallel aligned corpora, which have to be statistically relevant for the given language pair. The construction of large domain-specific parallel corpora is time-consuming and costly; the current practice relies on one or two such large corpora per language pair. Recently developed strategies ensure a certain portability to other domains through specialized lexicons or small domain-specific corpora. In this paper we discuss the influence of different discourse styles on statistical machine translation systems. We investigate how a pure SMT system performs when training and test data belong to the same domain but the discourse style varies.

2011

Training Data in Statistical Machine Translation - the More, the Better?
Monica Gavrila | Cristina Vertan
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

Experiments with Small-size Corpora in CBMT
Monica Gavrila | Natalia Elita
Proceedings of the Second Student Research Workshop associated with RANLP 2011

Constrained Recombination in an Example-based Machine Translation System
Monica Gavrila
Proceedings of the 15th Annual conference of the European Association for Machine Translation

Using Manual and Parallel Aligned Corpora for Machine Translation Services within an On-line Content Management System
Cristina Vertan | Monica Gavrila
Proceedings of the Second Workshop on Annotation and Exploitation of Parallel Corpora

2009

Using JRC-ACQUIS in SMT Experiments for Romanian and German
Monica Gavrila
Proceedings of the Workshop Multilingual resources, technologies and evaluation for central and Eastern European languages

ProLiV - a Tool for Teaching by Viewing Computational Linguistics
Monica Gavrila | Cristina Vertan
Proceedings of the ACL-IJCNLP 2009 Software Demonstrations

2005

MANAGELEX and the Semantic Web
Monica Gavrila | Cristina Vertan
Proceedings of OntoLex 2005 - Ontologies and Lexical Resources