You Zhang


2024

pdf
YNU-HPCC at SIGHAN-2024 dimABSA Task: Using PLMs with a Joint Learning Strategy for Dimensional Intensity Prediction
Wangzehui@stu.ynu.edu.cn Wangzehui@stu.ynu.edu.cn | You Zhang | Jin Wang | Dan Xu | Xuejie Zhang
Proceedings of the 10th SIGHAN Workshop on Chinese Language Processing (SIGHAN-10)

The dimensional approach can represent more fine-grained emotional information than discrete affective states. In this paper, a pretrained language model (PLM) with a joint learning strategy is proposed for the SIGHAN-2024 shared task on Chinese dimensional aspect-based sentiment analysis (dimABSA), which requires submitted models to provide fine-grained multi-dimensional (Valance and Arousal) intensity predictions for given aspects of a review. The proposed model consists of three parts: an input layer that concatenates both given aspect terms and input sentences; a Chinese PLM encoder that generates aspect-specific review representation; and separate linear predictors that jointly predict Valence and Arousal sentiment intensities. Moreover, we merge simplified and traditional Chinese training data for data augmentation. Our systems ranked 2nd place out of 5 participants in subtask 1-intensity prediction. The code is publicly available at https://github.com/WZH5127/2024_subtask1_intensity_prediction.

pdf
Improving Personalized Sentiment Representation with Knowledge-enhanced and Parameter-efficient Layer Normalization
You Zhang | Jin Wang | Liang-Chih Yu | Dan Xu | Xuejie Zhang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Existing studies on personalized sentiment classification consider a document review as an overall text unit and incorporate backgrounds (i.e., user and product information) to learn sentiment representation. However, it is difficult when these methods meet the current pretrained language models (PLMs) owing to quadratic costs that increase with text length and heterogeneous mixes of randomly initialized background information and textual information initialized from well-pretrained checkpoints during information incorporation. To address these problems, we propose a knowledge-enhanced and parameter-efficient layer normalization (E2LN) for efficient and effective review modeling via leveraging LN in transformer structures. Initially, a knowledge base is introduced that stores well-pretrained checkpoints, structured text information, and background information. Based on such a knowledge base, the ability of LN can be magnified as being a crucial component of transformer structure and then improve the performance of PLMs in downstream tasks. Moreover, the proposed E2LN can make PLMs capable of modeling long document reviews and incorporating background information with parameter-efficient fine-tuning and knowledge injecting. Extensive experimental results were obtained for three document-level sentiment classification benchmark datasets. By comparing the results, the effectiveness and efficiency of the proposed model was demonstrated. Code and Data are released at https://github.com/yoyo-yun/E2LN.

2023

pdf
YNU-HPCC at SemEval-2023 Task 6: LEGAL-BERT Based Hierarchical BiLSTM with CRF for Rhetorical Roles Prediction
Yu Chen | You Zhang | Jin Wang | Xuejie Zhang
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

To understand a legal document for real-world applications, SemEval-2023 Task 6 proposes a shared Subtask A, rhetorical roles (RRs) prediction, which requires a system to automatically assign a RR label for each semantical segment in a legal text. In this paper, we propose a LEGAL-BERT based hierarchical BiLSTM model with conditional random field (CRF) for RR prediction, which primarily consists of two parts: word-level and sentence-level encoders. The word-level encoder first adopts a legal-domain pre-trained language model, LEGAL-BERT, initially word-embedding words in each sentence in a document and a word-level BiLSTM further encoding such sentence representation. The sentence-level encoder then uses an attentive pooling method for sentence embedding and a sentence-level BiLSTM for document modeling. Finally, a CRF is utilized to predict RRs for each sentence. The officially released results show that our method outperformed the baseline systems. Our team won 7th rank out of 27 participants in Subtask A.

pdf
Domain Generalization via Switch Knowledge Distillation for Robust Review Representation
You Zhang | Jin Wang | Liang-Chih Yu | Dan Xu | Xuejie Zhang
Findings of the Association for Computational Linguistics: ACL 2023

Applying neural models injected with in-domain user and product information to learn review representations of unseen or anonymous users incurs an obvious obstacle in content-based recommender systems. For the generalization of the in-domain classifier, most existing models train an extra plain-text model for the unseen domain. Without incorporating historical user and product information, such a schema makes unseen and anonymous users dissociate from the recommender system. To simultaneously learn the review representation of both existing and unseen users, this study proposed a switch knowledge distillation for domain generalization. A generalization-switch (GSwitch) model was initially applied to inject user and product information by flexibly encoding both domain-invariant and domain-specific features. By turning the status ON or OFF, the model introduced a switch knowledge distillation to learn a robust review representation that performed well for either existing or anonymous unseen users. The empirical experiments were conducted on IMDB, Yelp-2013, and Yelp-2014 by masking out users in test data as unseen and anonymous users. The comparative results indicate that the proposed method enhances the generalization capability of several existing baseline models. For reproducibility, the code for this paper is available at: https://github.com/yoyo-yun/DG_RRR.

pdf
YNU-HPCC at ROCLING 2023 MultiNER-Health Task: A transformer-based approach for Chinese healthcare NER
Chonglin Pang | You Zhang | Xiaobing Zhou
Proceedings of the 35th Conference on Computational Linguistics and Speech Processing (ROCLING 2023)

pdf
YNU-HPCC at WASSA 2023: Using Text-Mixed Data Augmentation for Emotion Classification on Code-Mixed Text Message
Xuqiao Ran | You Zhang | Jin Wang | Dan Xu | Xuejie Zhang
Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

Emotion classification on code-mixed texts has been widely used in real-world applications. In this paper, we build a system that participates in the WASSA 2023 Shared Task 2 for emotion classification on code-mixed text messages from Roman Urdu and English. The main goal of the proposed method is to adopt a text-mixed data augmentation for robust code-mixed text representation. We mix texts with both multi-label (track 1) and multi-class (track 2) annotations in a unified multilingual pre-trained model, i.e., XLM-RoBERTa, for both subtasks. Our results show that the proposed text-mixed method performs competitively, ranking first in both tracks, achieving an average Macro F1 score of 0.9782 on the multi-label track and of 0.9329 on the multi-class track.

2021

pdf
MA-BERT: Learning Representation by Incorporating Multi-Attribute Knowledge in Transformers
You Zhang | Jin Wang | Liang-Chih Yu | Xuejie Zhang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2018

pdf
YNU-HPCC at SemEval-2018 Task 1: BiLSTM with Attention based Sentiment Analysis for Affect in Tweets
You Zhang | Jin Wang | Xuejie Zhang
Proceedings of the 12th International Workshop on Semantic Evaluation

We implemented the sentiment system in all five subtasks for English and Spanish. All subtasks involve emotion or sentiment intensity prediction (regression and ordinal classification) and emotions determining (multi-labels classification). The useful BiLSTM (Bidirectional Long-Short Term Memory) model with attention mechanism was mainly applied for our system. We use BiLSTM in order to get word information extracted from both directions. The attention mechanism was used to find the contribution of each word for improving the scores. Furthermore, based on BiLSTMATT (BiLSTM with attention mechanism) a few deep-learning algorithms were employed for different subtasks. For regression and ordinal classification tasks we used domain adaptation and ensemble learning methods to leverage base model. While a single base model was used for multi-labels task.

2017

pdf
YNU-HPCC at IJCNLP-2017 Task 5: Multi-choice Question Answering in Exams Using an Attention-based LSTM Model
Hang Yuan | You Zhang | Jin Wang | Xuejie Zhang
Proceedings of the IJCNLP 2017, Shared Tasks

A shared task is a typical question answering task that aims to test how accurately the participants can answer the questions in exams. Typically, for each question, there are four candidate answers, and only one of the answers is correct. The existing methods for such a task usually implement a recurrent neural network (RNN) or long short-term memory (LSTM). However, both RNN and LSTM are biased models in which the words in the tail of a sentence are more dominant than the words in the header. In this paper, we propose the use of an attention-based LSTM (AT-LSTM) model for these tasks. By adding an attention mechanism to the standard LSTM, this model can more easily capture long contextual information.

pdf
YNU-HPCC at EmoInt-2017: Using a CNN-LSTM Model for Sentiment Intensity Prediction
You Zhang | Hang Yuan | Jin Wang | Xuejie Zhang
Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

In this paper, we present a system that uses a convolutional neural network with long short-term memory (CNN-LSTM) model to complete the task. The CNN-LSTM model has two combined parts: CNN extracts local n-gram features within tweets and LSTM composes the features to capture long-distance dependency across tweets. Additionally, we used other three models (CNN, LSTM, BiLSTM) as baseline algorithms. Our introduced model showed good performance in the experimental results.