Yu-Yin Hsu


2022

HkAmsters at CMCL 2022 Shared Task: Predicting Eye-Tracking Data from a Gradient Boosting Framework with Linguistic Features
Lavinia Salicchi | Rong Xiang | Yu-Yin Hsu
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

Eye movement data are used in psycholinguistic studies to infer information about cognitive processes during reading. In this paper, we describe our method for Subtask 1 of the CMCL 2022 Shared Task on Cognitive Modeling and Computational Linguistics, which involves data from multiple datasets in six languages. We compared different regression models, using linguistic features of the target word and the preceding word, together with the target word's surprisal, as regression features. Our final system, a gradient boosting regressor, achieved the lowest mean absolute error (MAE) and was the best-performing system in the competition.
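
As a rough illustration of the kind of pipeline the abstract describes, the sketch below fits scikit-learn's GradientBoostingRegressor on synthetic word-level features and reports MAE; the feature set, weights, and data are hypothetical assumptions, not the authors' released system.

```python
# Illustrative sketch only: a gradient boosting regressor over simple
# word-level features, in the spirit of the abstract above. The feature
# names and the synthetic data are hypothetical, not the authors' system.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical features: length, log frequency, and surprisal of the
# target word, plus the same three features for the preceding word.
X = rng.normal(size=(1000, 6))
# Hypothetical eye-tracking measure, e.g. a scaled first-fixation duration.
y = X @ np.array([0.5, -0.3, 0.8, 0.2, -0.1, 0.4]) + rng.normal(scale=0.1, size=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05, max_depth=3)
model.fit(X_train, y_train)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
```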

Proceedings of the Workshop on Cognitive Aspects of the Lexicon
Michael Zock | Emmanuele Chersoni | Yu-Yin Hsu | Enrico Santus
Proceedings of the Workshop on Cognitive Aspects of the Lexicon

(In)Alienable Possession in Mandarin Relative Clauses
Deran Kong | Yu-Yin Hsu
Proceedings of the Workshop on Cognitive Aspects of the Lexicon

Inalienable possession differs from alienable possession in that in the former (e.g., kinship and part-whole relations) there is an intrinsic semantic dependency between the possessor and possessum. This paper reports two studies that used acceptability-judgment tasks to investigate whether native Mandarin speakers experienced different levels of interpretational cost when resolving different types of possessive relations, i.e., inalienable possession (kinship terms and body parts) and alienable possession, expressed within relative clauses. The results show that sentences received higher acceptability ratings when a body part was the possessum than when the possessum was alienable, indicating that the inherent semantic dependency facilitates resolution. However, inalienable kinship terms received the lowest acceptability ratings. We argue that this is because the kinship terms, which carried the [+human] feature and appeared at the beginning of the experimental sentences, tended to be interpreted as the subject under shallow processing, an interpretation that conflicted with the semantic and syntactic requirements of the experimental sentences.

Discovering Financial Hypernyms by Prompting Masked Language Models
Bo Peng | Emmanuele Chersoni | Yu-Yin Hsu | Chu-Ren Huang
Proceedings of the 4th Financial Narrative Processing Workshop @LREC2022

With the rising popularity of Transformer-based language models, several studies have tried to exploit their masked language modeling capabilities to automatically extract relational linguistic knowledge, although such research has rarely investigated semantic relations in specialized domains. The present study tests a general-domain and a domain-adapted Transformer model on two datasets of financial term-hypernym pairs using the prompt methodology. Our results show that the choice of prompt has a critical impact on model performance, and that domain adaptation on financial text generally improves the models' capacity to associate the target terms with the right hypernyms, although the most successful models are those retaining a general-domain vocabulary.
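
A minimal sketch of the prompting idea: query a masked language model with a hypernymy prompt and read off the top fill-mask predictions. The prompt template, the example term, and the bert-base-uncased checkpoint are illustrative assumptions, not the exact prompts or models evaluated in the paper.

```python
# Illustrative sketch: prompt a masked language model for hypernym candidates.
# The prompt wording, the example term, and the checkpoint are assumptions,
# not the exact prompts or models evaluated in the paper.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

term = "bond"  # hypothetical financial term
prompt = f"{term} is a kind of {fill_mask.tokenizer.mask_token}."

for candidate in fill_mask(prompt, top_k=5):
    print(candidate["token_str"], round(candidate["score"], 4))
```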

PolyU-CBS at TSAR-2022 Shared Task: A Simple, Rank-Based Method for Complex Word Substitution in Two Steps
Emmanuele Chersoni | Yu-Yin Hsu
Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022)

In this paper, we describe the system we presented at the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022) for the shared task on Lexical Simplification in English, Portuguese, and Spanish. We proposed an unsupervised two-step approach: first, we used a masked language model with word masking for each language to extract candidate replacements for a difficult word; second, we ranked the candidates according to three different Transformer-based metrics. Finally, we determined our list of candidates based on the lowest average rank across the metrics.
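
A minimal sketch of this two-step recipe under simplifying assumptions: candidates come from fill-mask over the masked complex word, and ranking here uses the single fill-mask probability rather than the three Transformer-based metrics averaged in the paper; the sentence, target word, and checkpoint are hypothetical.

```python
# Illustrative sketch of the two-step recipe: (1) generate substitution
# candidates by masking the complex word, (2) rank the candidates. Ranking
# here uses the single fill-mask probability instead of the three
# Transformer-based metrics averaged in the paper; names are hypothetical.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

sentence = "The committee will scrutinize the proposal."  # hypothetical input
complex_word = "scrutinize"

# Step 1: candidate generation via word masking.
masked = sentence.replace(complex_word, fill_mask.tokenizer.mask_token)
candidates = fill_mask(masked, top_k=10)

# Step 2: rank candidates and drop the original word if it reappears.
ranked = sorted(
    (c for c in candidates if c["token_str"].strip().lower() != complex_word),
    key=lambda c: c["score"],
    reverse=True,
)
print([c["token_str"] for c in ranked[:3]])
```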

2021

Modeling the Influence of Verb Aspect on the Activation of Typical Event Locations with BERT
Won Ik Cho | Emmanuele Chersoni | Yu-Yin Hsu | Chu-Ren Huang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Is Domain Adaptation Worth Your Investment? Comparing BERT and FinBERT on Financial Tasks
Bo Peng | Emmanuele Chersoni | Yu-Yin Hsu | Chu-Ren Huang
Proceedings of the Third Workshop on Economics and Natural Language Processing

With the recent rise in popularity of Transformer models in Natural Language Processing, research efforts have been dedicated to developing domain-adapted versions of BERT-like architectures. In this study, we focus on FinBERT, a Transformer model trained on text from the financial domain. Comparing its performance with that of the original BERT on a wide variety of financial text processing tasks, we find that continual pretraining from the original model is the more beneficial option, whereas domain-specific pretraining from scratch appears to be less effective.
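
As an illustration of this kind of comparison, one can load a general-domain and a domain-adapted checkpoint with the same classification head and fine-tune both under an identical protocol; the checkpoint names and the three-way sentiment task below are examples, not necessarily the models or datasets used in the paper.

```python
# Illustrative sketch of the comparison set-up: load a general-domain BERT
# and a FinBERT checkpoint with the same classification head, then fine-tune
# and evaluate both under an identical protocol. The checkpoint names and the
# three-way sentiment task are examples, not necessarily those in the paper.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

for checkpoint in ["bert-base-uncased", "ProsusAI/finbert"]:
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=3  # e.g. positive / neutral / negative
    )
    # ... fine-tune and evaluate with an identical training loop here,
    # then compare the resulting metrics across the two checkpoints.
    print(checkpoint, sum(p.numel() for p in model.parameters()), "parameters")
```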

2018

Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation
Stephen Politzer-Ahles | Yu-Yin Hsu | Chu-Ren Huang | Yao Yao
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation

Prosodic Organization and Focus Realization in Taiwan Mandarin
Yu-Yin Hsu | James German
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation

Whether and How Mandarin Sandhied Tone 3 and Underlying Tone 2 differ in Terms of Vowel Quality?
Yu-Jung Lin | Yu-Yin Hsu
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation

Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 25th Joint Workshop on Linguistics and Language Processing
Stephen Politzer-Ahles | Yu-Yin Hsu | Chu-Ren Huang | Yao Yao
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 25th Joint Workshop on Linguistics and Language Processing

Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 5th Workshop on Asian Translation
Stephen Politzer-Ahles | Yu-Yin Hsu | Chu-Ren Huang | Yao Yao
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 5th Workshop on Asian Translation

2012

UBIU for Multilingual Coreference Resolution in OntoNotes
Desislava Zhekova | Sandra Kübler | Joshua Bonner | Marwa Ragheb | Yu-Yin Hsu
Joint Conference on EMNLP and CoNLL - Shared Task