This is an internal, incomplete preview of a proposed change to the ACL Anthology.
For efficiency reasons, we don't generate MODS or Endnote formats, and the preview may be incomplete in other ways, or contain mistakes.
Do not treat this content as an official publication.
JensAllwood
Fixing paper assignments
Please select all papers that belong to the same person.
Indicate below which author they should be assigned to.
Evaluation of automatic language-independent methods for language technology resource creation is difficult, and confounded by a largely unknown quantity, viz. to what extent typological differences among languages are significant for results achieved for one language or language pair to be applicable across languages generally. In the work presented here, as a simplifying assumption, language-independence is taken as axiomatic within certain specified bounds. We evaluate the automatic translation of Roget’s “Thesaurus” from English into Swedish using an independently compiled Roget-style Swedish thesaurus, S.C. Bring’s “Swedish vocabulary arranged into conceptual classes” (1930). Our expectation is that this explicit evaluation of one of the thesaureses created in the MTRoget project will provide a good estimate of the quality of the other thesauruses created using similar methods.
The paper compares how feedback is expressed via speech and head movements in comparable corpora of first encounters in three Nordic languages: Danish, Finnish and Swedish. The three corpora have been collected following common guidelines, and they have been annotated according to the same scheme in the NOMCO project. The results of the comparison show that in this data the most frequent feedback-related head movement is Nod in all three languages. Two types of Nods were distinguished in all corpora: Down-nods and Up-nods; the participants from the three countries use Down- and Up-nods with different frequency. In particular, Danes use Down-nods more frequently than Finns and Swedes, while Swedes use Up-nods more frequently than Finns and Danes. Finally, Finns use more often single Nods than repeated Nods, differing from the Swedish and Danish participants. The differences in the frequency of both Down-nods and Up-Nods in the Danish, Finnish and Swedish interactions are interesting given that Nordic countries are not only geographically near, but are also considered to be very similar culturally. Finally, a comparison of feedback-related words in the Danish and Swedish corpora shows that Swedes and Danes use common feedback words corresponding to yes and no with similar frequency.
This paper presents the multimodal corpora that are being collected and annotated in the Nordic NOMCO project. The corpora will be used to study communicative phenomena such as feedback, turn management and sequencing. They already include video material for Swedish, Danish, Finnish and Estonian, and several social activities are represented. The data will make it possible to verify empirically how gestures (head movements, facial displays, hand gestures and body postures) and speech interact in all the three mentioned aspects of communication. The data are being annotated following the MUMIN annotation scheme, which provides attributes concerning the shape and the communicative functions of head movements, face expressions, body posture and hand gestures. After having described the corpora, the paper discusses how they will be used to study the way feedback is expressed in speech and gestures, and reports results from two pilot studies where we investigated the function of head gestures ― both single and repeated ― in combination with feedback expressions. The annotated corpora will be valuable sources for research on intercultural communication as well as for interaction in the individual languages.
This paper describes past, ongoing and planned work on the collection and transcription of spoken language samples for all the South African official languages and as part of this the training of researchers in corpus linguistic research skills. More specifically the work has involved (and still involves) establishing an international corpus linguistic network linked to a network hub at a UNISA website and the development of research tools, a corpus research guide and workbook for multimodal communication and spoken language corpus research. As an example of the work we are doing and hope to do more of in the future, we present a small pilot study of the influence of English and Afrikaans on the 100 most frequent words in spoken Xhosa as this is evidenced in the corpus of spoken interaction we have gathered so far. Other planned work, besides work on spoken language phenomena, involves comparison of spoken and written language and work on communicative body movements (gestures) and their relation to speech.