This is an internal, incomplete preview of a proposed change to the ACL Anthology.
For efficiency reasons, we generate only three BibTeX files per volume, and the preview may be incomplete in other ways, or contain mistakes.
Do not treat this content as an official publication.
Aspect, a linguistic category describing how actions and events unfold over time, is traditionally characterized by three semantic properties: stativity, durativity and telicity. In this study, we investigate whether and to what extent these properties are encoded in the verb token embeddings of the contextualized spaces of two English language models – BERT and GPT-2. First, we propose an experiment using semantic projections to examine whether the values of the vector dimensions of annotated verbs for stativity, durativity and telicity reflect human linguistic distinctions. Second, we use distributional similarity to replicate the notorious Imperfective Paradox described by Dowty (1977), and assess whether the embedding models are sensitive to capture contextual nuances of the verb telicity. Our results show that both models encode the semantic distinctions for the aspect properties of stativity and telicity in most of their layers, while durativity is the most challenging feature. As for the Imperfective Paradox, only the embedding similarities computed with the vectors from the early layers of the BERT model align with the expected pattern.
The field of Distributional Semantics has recently undergone important changes, with the contextual representations produced by Transformers taking the place of static word embeddings models. Noticeably, previous studies comparing the two types of vectors have only focused on the English language and a limited number of models. In our study, we present a comparative evaluation of static and contextualized distributional models for Mandarin Chinese, focusing on a range of intrinsic tasks. Our results reveal that static models remain stronger for some of the classical tasks that consider word meaning independent of context, while contextualized models excel in identifying semantic relations between word pairs and in the categorization of words into abstract semantic classes.
To provide effective support, it is essential for a skilled supporter to emotionally resonate with the help-seeker’s current emotional state. In conversational interactions, this emotional alignment is further influenced by the comforting strategies employed by the supporter. Different strategies guide the interlocutors to align their emotions in nuanced patterns. However, the incorporation of strategy into emotional alignment in the context of emotional support agents remains underexplored. To address this limitation, we propose an improved emotional support agent called Emstremo. Emstremo aims to achieve strategic control of emotional alignment by perceiving and responding to the user’s emotions. Our system’s state-of-the-art performance emphasizes the importance of integrating emotions and strategies in modeling conversations that provide emotional support.
The current study examines how proper names and common nouns in Chinese are cognitively processed during sentence comprehension. EEG data was recorded when participants were presented with neutral contexts followed by either a proper name or a common noun. Proper names in Chinese often consist of characters that can function independently as words or be combined with other characters to form words, potentially benefiting from the semantic features carried by each character. Using cluster-based permutation tests, we found a larger N400 for common nouns when compared to proper names. Our results suggest that the semantics of characters do play a role in facilitating the processing of proper names. This is consistent with previous behavioral findings on noun processing in Chinese, indicating that common nouns require more cognitive resources to process than proper names. Moreover, our results suggest that proper names are processed differently between alphabetic languages and Chinese language.
In psycholinguistics, semantic attraction is a sentence processing phenomenon in which a given argument violates the selectional requirements of a verb, but this violation is not perceived by comprehenders due to its attraction to another noun in the same sentence, which is syntactically unrelated but semantically sound. In our study, we use autoregressive language models to compute the sentence-level and the target phrase-level Surprisal scores of a psycholinguistic dataset on semantic attraction. Our results show that the models are sensitive to semantic attraction, leading to reduced Surprisal scores, although none of them perfectly matches the human behavioral pattern.
Eye-tracking data in Chinese languages present unique challenges due to the non-alphabetic and unspaced nature of the Chinese writing systems. This paper introduces the first deeply-annotated joint Mandarin-Cantonese eye-tracking dataset, from which we achieve a unified eye-tracking prediction system for both language varieties. In addition to the commonly studied first fixation duration and the total fixation duration, this dataset also includes the second fixation duration, expressing fixation patterns that are more relevant to higher-level, structural processing. A basic comparison of the features and measurements in our dataset revealed variation between Mandarin and Cantonese on fixation patterns related to word class and word position. The test of feature usefulness suggested that traditional features are less powerful in predicting the second-pass fixation, to which the linear distance to root makes a leading contribution in Mandarin. In contrast, Cantonese eye-movement behavior relies more on word position and part of speech.
Language researchers have long assumed that concepts can be represented by sets of semantic features, and have traditionally encountered challenges in identifying a feature set that could be sufficiently general to describe the human conceptual experience in its entirety. In the dataset of English norms presented by Binder et al. (2016), also known as Binder norms, the authors introduced a new set of neurobiologically motivated semantic features in which conceptual primitives were defined in terms of modalities of neural information processing. However, no comparable norms are currently available for other languages. In our work, we built the Mandarin Chinese norm by translating the stimuli used in the original study and developed a comparable collection of human ratings for Mandarin Chinese. We also conducted some experiments on the automatic prediction of the Chinese Binder Norms based on the word embeddings of the corresponding words to assess the feasibility of modeling experiential semantic features via corpus-based representations.
As neural language models (NLMs) based on Transformers are becoming increasingly dominant in natural language processing, several studies have proposed analyzing the semantic and pragmatic abilities of such models. In our study, we aimed at investigating the effect of discourse connectives on NLMs with regard to Transformer Surprisal scores by focusing on the English stimuli of an experimental dataset, in which the expectations about an event in a discourse fragment could be reversed by a concessive or a contrastive connective. By comparing the Surprisal scores of several NLMs, we found that bigger NLMs show patterns similar to humans’ behavioral data when a concessive connective is used, while connective-related effects tend to disappear with a contrastive one. We have additionally validated our findings with GPT-Neo using an extended dataset, and results mostly show a consistent pattern.
The paper presents a concise summary of our work for the ML-ESG-2 shared task, exclusively on the Chinese and English datasets. ML-ESG-2 aims to ascertain the influence of news articles on corporations, specifically from an ESG perspective. To this end, we generally explored the capability of key information for impact identification and experimented with various techniques at different levels. For instance, we attempted to incorporate important information at the word level with TF-IDF, at the sentence level with TextRank, and at the document level with summarization. The final results reveal that the one with GPT-4 for summarisation yields the best predictions.
Eye movement data are used in psycholinguistic studies to infer information regarding cognitive processes during reading. In this paper, we describe our proposed method for the Shared Task of Cognitive Modeling and Computational Linguistics (CMCL) 2022 - Subtask 1, which involves data from multiple datasets on 6 languages. We compared different regression models using features of the target word and its previous word, and target word surprisal as regression features. Our final system, using a gradient boosting regressor, achieved the lowest mean absolute error (MAE), resulting in the best system of the competition.
Inalienable possession differs from alienable possession in that, in the former – e.g., kinships and part-whole relations – there is an intrinsic semantic dependency between the possessor and possessum. This paper reports two studies that used acceptability-judgment tasks to investigate whether native Mandarin speakers experienced different levels of interpretational costs while resolving different types of possessive relations, i.e., inalienable possessions (kinship terms and body parts) and alienable ones, expressed within relative clauses. The results show that sentences received higher acceptability ratings when body parts were the possessum as compared to sentences with alienable possessum, indicating that the inherent semantic dependency facilitates the resolution. However, inalienable kinship terms received the lowest acceptability ratings. We argue that this was because the kinship terms, which had the [+human] feature and appeared at the beginning of the experimental sentences, tended to be interpreted as the subject in shallow processing; these features contradicted the semantic-syntactic requirements of the experimental sentences.
In this paper, we describe the system we presented at the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022) regarding the shared task on Lexical Simplification for English, Portuguese, and Spanish. We proposed an unsupervised approach in two steps: First, we used a masked language model with word masking for each language to extract possible candidates for the replacement of a difficult word; second, we ranked the candidates according to three different Transformer-based metrics. Finally, we determined our list of candidates based on the lowest average rank across different metrics.
With the rising popularity of Transformer-based language models, several studies have tried to exploit their masked language modeling capabilities to automatically extract relational linguistic knowledge, although this kind of research has rarely investigated semantic relations in specialized domains. The present study aims at testing a general-domain and a domain-adapted Transformer models on two datasets of financial term-hypernym pairs using the prompt methodology. Our results show that the differences of prompts impact critically on models’ performance, and that domain adaptation on financial text generally improves the capacity of the models to associate the target terms with the right hypernyms, although the more successful models are those retaining a general-domain vocabulary.
With the recent rise in popularity of Transformer models in Natural Language Processing, research efforts have been dedicated to the development of domain-adapted versions of BERT-like architectures. In this study, we focus on FinBERT, a Transformer model trained on text from the financial domain. By comparing its performances with the original BERT on a wide variety of financial text processing tasks, we found continual pretraining from the original model to be the more beneficial option. Domain-specific pretraining from scratch, conversely, seems to be less effective.