Maria Khokhlova
The aim of the study is to create a “documented” literary and theological history of German Catholic hymnography. The paper focuses on the creation of a corpus of liturgical texts in German and describes the first stage of annotation, which deals with the metatextual markup of Catholic hymns. The authors dwell in detail on the parameters of the multi-level classification of hymn texts they developed, which allows them to differentiate hymns on different grounds. The parameters include not only characteristics that describe hymns as a whole (the period and source of their origin, rubrics, musical accompaniment), but also ones that are inherent to individual strophes. Based on the created markup, it is possible to trace general trends in texts grouped according to certain meta-features. The developed annotation scheme is illustrated with the hymnbook Gotteslob (1975). The results present statistics on the different parameters used for hymn description.
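As a rough illustration of what such a multi-level record could look like, the sketch below models hymn-level and strophe-level parameters as Python dataclasses. The paper does not publish its schema, so all field names and values here are hypothetical and only mirror the parameter groups mentioned in the abstract.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical schema: the attributes only mirror the parameter groups the
# abstract mentions (period, source, rubric, musical accompaniment, strophes).

@dataclass
class Strophe:
    number: int
    text: str
    features: dict = field(default_factory=dict)  # strophe-level parameters

@dataclass
class Hymn:
    title: str
    period: str        # period of origin
    source: str        # source of origin, e.g. a hymnbook edition
    rubric: str        # liturgical rubric
    has_melody: bool   # musical accompaniment
    strophes: List[Strophe] = field(default_factory=list)

# Example record in the Gotteslob (1975) setting; the concrete values are invented.
hymn = Hymn(
    title="Example hymn",
    period="19th century",
    source="Gotteslob (1975)",
    rubric="Advent",
    has_melody=True,
    strophes=[Strophe(number=1, text="...", features={"refrain": False})],
)
print(hymn.title, len(hymn.strophes))
```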
The paper evaluates the possibilities of using the transformer architecture to create headlines for news texts in Finnish. The authors statistically analyse the original and generated headlines according to three criteria: informativeness, relevance and impact. The study also substantiates for the first time the effectiveness of a fine-tuned text-to-text transfer transformer model for the task of generating headlines for news articles in Finnish. The results show no statistically significant difference between the scores obtained by the original and generated headlines on these three criteria.
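A minimal sketch of how headline generation with a fine-tuned text-to-text model can be run via the Hugging Face transformers API is given below. The checkpoint path, the "summarize:" prefix and the generation settings are assumptions made for illustration; the paper does not disclose its exact configuration.

```python
# Sketch only: generate a headline for a Finnish news article with a
# fine-tuned seq2seq checkpoint (path is hypothetical).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "path/to/finetuned-finnish-t5"  # assumed fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

article = "Uutisartikkelin teksti ..."  # Finnish news article body
inputs = tokenizer("summarize: " + article, return_tensors="pt",
                   truncation=True, max_length=512)
outputs = model.generate(**inputs, max_length=32, num_beams=4)
headline = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(headline)
```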
The paper presents the issue of collocability and collocations in Russian and surveys a wide range of dictionaries, both printed and online, that describe collocations. Our project deals with building a database that will include dictionary and statistical collocations. The former can be described in various lexicographic resources, whereas the latter can be extracted automatically from corpora. Dictionaries differ among themselves and present information in various ways, which makes it hard for language learners and researchers to acquire data. A number of dictionaries were analyzed and processed to retrieve verified collocations; however, the overlap between the lists of collocations extracted from them is still rather small. This indicates the need for a unified resource that takes collocability into account and provides more examples. The proposed resource will also be useful for linguists and for studying Russian as a foreign language. The obtained results can be important for machine learning and for other NLP tasks, for instance automatic clustering of word combinations and disambiguation.
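To make the "statistical collocations" side concrete, the sketch below scores bigrams with pointwise mutual information using NLTK. The association measure, the frequency threshold and the toy token list are assumptions; the paper does not state which measures or corpora its database relies on.

```python
# Sketch: score bigram candidates from a (toy) token stream by PMI.
from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

tokens = ["принимать", "решение", "принимать", "участие",
          "новое", "решение"]  # toy corpus for illustration
finder = BigramCollocationFinder.from_words(tokens)
finder.apply_freq_filter(1)  # in practice a higher threshold would be used

bigram_measures = BigramAssocMeasures()
for pair, score in finder.score_ngrams(bigram_measures.pmi)[:10]:
    print(pair, round(score, 2))
```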
Corpora are undoubtedly vital tools for linguistic studies and for solving applied tasks. Although the opportunities corpora offer are very useful, further progress in linguistic research requires additional software, since it is impossible to process huge amounts of linguistic data manually. The Sketch Engine is a corpus tool that takes as input a corpus of any language together with corresponding grammar patterns. The paper describes the writing of a Sketch grammar for the Russian language as part of the Sketch Engine system. The system gives information about a word's collocability within concrete dependency models and generates lists of the most frequent phrases for a given word based on the appropriate models. The paper deals with two different approaches to writing rules for the grammar, based on morphological information, and also with applying word sketches to the Russian language. The data show that such results may find extensive use in various fields of linguistics, such as dictionary compilation, language learning and teaching, translation (including machine translation), phraseology, information retrieval, etc.
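Sketch grammars themselves are written in the Sketch Engine's own CQL-based formalism, which is not reproduced here. As a hedged analogue of gathering collocability on a concrete dependency model, the sketch below counts verb-object pairs from a dependency parse produced by spaCy's Russian pipeline; spaCy, the ru_core_news_sm model and the example sentence are all assumptions and are not the tooling used in the paper.

```python
# Sketch: collect verb-object collocation candidates from a dependency parse.
from collections import Counter
import spacy

nlp = spacy.load("ru_core_news_sm")  # assumes the model has been downloaded
doc = nlp("Студенты читают книги и пишут статьи.")

pairs = Counter()
for token in doc:
    # Universal Dependencies "obj" relation headed by a verb.
    if token.dep_ == "obj" and token.head.pos_ == "VERB":
        pairs[(token.head.lemma_, token.lemma_)] += 1

for (verb, noun), freq in pairs.most_common():
    print(verb, noun, freq)
```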