Daiga Deksne


Tilde’s Machine Translation Systems for WMT 2017
Mārcis Pinnis | Rihards Krišlauks | Toms Miks | Daiga Deksne | Valters Šics
Proceedings of the Second Conference on Machine Translation


Billions of Parallel Words for Free: Building and Using the EU Bookshop Corpus
Raivis Skadiņš | Jörg Tiedemann | Roberts Rozis | Daiga Deksne
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The European Union is a great source of high-quality documents with translations into several languages. Parallel corpora from its publications are frequently used in various tasks, machine translation in particular. A source that has not yet been systematically explored is the EU Bookshop, an online service and archive of publications from various European institutions. The service contains a large body of publications in the 24 official languages of the EU. This paper describes our efforts in collecting those publications and converting them to a format that is useful for natural language processing, in particular statistical machine translation. We report our procedure for crawling the website and the various pre-processing steps that were necessary to clean up the data after conversion from the original PDF files. Furthermore, we demonstrate the use of this dataset in training SMT models for English, French, German, Spanish, and Latvian.


Finite State Morphology Tool for Latvian
Daiga Deksne
Proceedings of the 11th International Conference on Finite State Methods and Natural Language Processing


CFG based grammar checker for Latvian
Daiga Deksne | Raivis Skadiņš
Proceedings of the 18th Nordic Conference of Computational Linguistics (NODALIDA 2011)


Dictionary of Multiword Expressions for Translation into highly Inflected Languages
Daiga Deksne | Raivis Skadiņš | Inguna Skadiņa
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Treatment of Multiword Expressions (MWEs) is one of the most complicated issues in natural language processing, especially in Machine Translation (MT). The paper presents a dictionary of MWEs for an English-Latvian MT system, demonstrating how MWEs could be handled for inflected languages with rich morphology and rather free word order. The proposed dictionary of MWEs consists of two constituents: a lexicon of phrases and a set of MWE rules. The lexicon of phrases is rather similar to the translation lexicon of the MT system, while the MWE rules describe the syntactic structure of the source and target sentences, allowing correct transformation of different MWE types into the target language and ensuring correct syntactic structure. The paper demonstrates this approach on different MWE types, starting from simple syntactic structures, followed by more complicated cases, and including fully idiomatic expressions. Automatic evaluation shows that the described approach increases the quality of translation by 0.6 BLEU points.


Comprehension Assistant for Languages of Baltic States
Inguna Skadiņa | Andrejs Vasiļjevs | Daiga Deksne | Raivis Skadiņš | Linda Goldberga
Proceedings of the 16th Nordic Conference of Computational Linguistics (NODALIDA 2007)