2023
BERT Goes Off-Topic: Investigating the Domain Transfer Challenge using Genre Classification
Dmitri Roussinov | Serge Sharoff
Findings of the Association for Computational Linguistics: EMNLP 2023
While the performance of many text classification tasks has recently improved thanks to Pretrained Language Models (PLMs), in this paper we show that they still suffer from a performance gap when the underlying distribution of topics changes. For example, a genre classifier trained on political topics often fails when tested on documents in the same genre but about sport or medicine. In this work, we quantify this phenomenon empirically with a large corpus and a large set of topics. Thus, we verify that domain transfer remains challenging both for classic PLMs, such as BERT, and for modern large language models (LLMs), such as GPT. We develop a data augmentation approach by generating texts in any desired genre and on any desired topic, even when there are no documents in the training corpus that are both in that particular genre and on that particular topic. When we augment the training dataset with the topically-controlled synthetic texts, F1 improves by up to 50% for some topics, approaching on-topic training, while showing little or no improvement for other topics. While our empirical results focus on genre classification, our methodology is applicable to other classification tasks such as gender, authorship, or sentiment classification.
2020
Recognizing Semantic Relations by Combining Transformers and Fully Connected Models
Dmitri Roussinov | Serge Sharoff | Nadezhda Puchnina
Proceedings of the Twelfth Language Resources and Evaluation Conference
Automatically recognizing an existing semantic relation (e.g. “is a”, “part of”, “property of”, “opposite of”, etc.) between two words (phrases, concepts, etc.) is an important task affecting many NLP applications and has been the subject of extensive experimentation and modeling. Current approaches to automatically telling if a relation exists between two given concepts X and Y can be grouped into two types: 1) those modeling word paths connecting X and Y in text, and 2) those modeling distributional properties of X and Y separately, not necessarily in proximity to each other. Here, we investigate how both types can be improved and combined. We suggest a distributional approach that is based on an attention-based transformer. We have also developed a novel word-path model that combines useful properties of a convolutional network with a fully connected language model. While our transformer-based approach works better, both our models significantly outperform the state of the art within their classes of approaches. We also demonstrate that combining the two approaches results in additional gains since they use somewhat different data sources.
2005
Discretization Based Learning for Information Retrieval
Dmitri Roussinov | Weiguo Fan
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing
Mining Context Specific Similarity Relationships Using The World Wide Web
Dmitri Roussinov | Leon J. Zhao | Weiguo Fan
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing