Automatic Peer-review Aspect Score Prediction (PASP) of academic papers can be a helpful assistant tool for both reviewers and authors. Most existing work on PASP uses supervised learning techniques. However, the limited amount of peer-review data degrades PASP performance. This paper presents a novel semi-supervised learning (SSL) method that incorporates Transformer fine-tuning into the Γ-model, a variant of the Ladder network, to leverage contextual features from unlabeled data. Backpropagation simultaneously minimizes the sum of the supervised and unsupervised cost functions, avoiding the need for layer-wise pre-training. The experimental results show that our model outperforms the supervised and naive semi-supervised learning baselines. Our source code is available online.
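The idea of minimizing a single summed objective can be illustrated with a minimal sketch. It assumes the supervised term is a cross-entropy over labeled examples and the unsupervised term is a mean squared error between a clean top-layer representation and its denoised counterpart. The function names, the `unsup_weight` parameter, and the input layout are hypothetical and are not taken from the paper.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under predicted probabilities."""
    return -math.log(probs[label])


def denoising_error(clean, denoised):
    """Mean squared error between a clean vector and its denoised estimate."""
    return sum((c - d) ** 2 for c, d in zip(clean, denoised)) / len(clean)


def combined_cost(labeled, unlabeled, unsup_weight=1.0):
    """Sum of a supervised and an unsupervised cost (illustrative assumption).

    labeled:   list of (predicted_probs, true_label) pairs
    unlabeled: list of (clean_vector, denoised_vector) pairs
    """
    supervised = sum(cross_entropy(p, y) for p, y in labeled) / len(labeled)
    unsupervised = sum(denoising_error(c, d) for c, d in unlabeled) / len(unlabeled)
    # A single scalar objective: one backward pass can update all layers,
    # so no separate layer-wise pre-training stage is required.
    return supervised + unsup_weight * unsupervised
```

In a real model both terms would be computed by a differentiable framework; the sketch only shows how the two costs combine into one value.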
Scientific claim verification helps researchers find, for a given claim, the relevant scientific papers in a large corpus together with sentence-level evidence. Some existing works propose pipeline models for the three tasks of abstract retrieval, rationale selection, and stance prediction. Such models suffer from error propagation among the pipeline modules and from a lack of sharing of valuable information among them. We thus propose an approach, named ARSJoint, that jointly learns the modules for the three tasks within a machine reading comprehension framework by including claim information. In addition, we enhance the information exchange and constraints among tasks by proposing a regularization term between the sentence attention scores of abstract retrieval and the estimated outputs of rationale selection. The experimental results on the benchmark dataset SciFact show that our approach outperforms existing works.
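A regularization term that ties two per-sentence score vectors together could look like the sketch below. It assumes a symmetric KL divergence between the normalized attention scores and the normalized rationale probabilities; this choice of divergence, the function names, and the `eps` smoothing constant are illustrative assumptions, not details confirmed by the abstract.

```python
import math


def normalize(scores):
    """Rescale positive scores so that they sum to one."""
    total = sum(scores)
    return [s / total for s in scores]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) with a small constant to avoid log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_penalty(attention_scores, rationale_probs):
    """Symmetric divergence between two per-sentence score vectors.

    The penalty is zero when both modules agree on which sentences matter
    and grows as their distributions over sentences diverge.
    """
    p = normalize(attention_scores)
    q = normalize(rationale_probs)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
```

Adding such a penalty to the joint loss would push the two modules toward consistent sentence-level evidence. Both inputs must contain positive values with a non-zero sum.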
Automated Essay Scoring (AES) is a process that aims to alleviate the workload of graders and improve the feedback cycle in educational systems. Multi-task learning models, one of the deep learning techniques recently applied to many NLP tasks, demonstrate vast potential for AES. In this work, we present an approach that combines two tasks, sentiment analysis and AES, through multi-task learning. The model is based on a hierarchical neural network that learns to predict a holistic score at the document level along with sentiment classes at the word level and sentence level. The sentiment features extracted from opinion expressions can enhance vanilla holistic essay scoring, which mainly focuses on lexicon and text semantics. Our approach demonstrates that sentiment features are beneficial for some essay prompts, and its performance is competitive with other deep learning models on the Automated Student Assessment Prize (ASAP) benchmark. The Quadratic Weighted Kappa (QWK) is used to measure the agreement between the human grader's score and the model's prediction. Our model produces a QWK of 0.763.
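For readers unfamiliar with the metric, the following is a minimal pure-Python sketch of the standard Quadratic Weighted Kappa computation for integer ratings. The function name and arguments are illustrative, and the sketch assumes at least two rating levels and ratings that are not all identical (otherwise the denominator is zero).

```python
def quadratic_weighted_kappa(human, model, min_rating, max_rating):
    """Agreement between two lists of integer ratings (1.0 = perfect agreement)."""
    if len(human) != len(model):
        raise ValueError("both rating lists must have the same length")
    n_ratings = max_rating - min_rating + 1
    n_items = len(human)

    # Observed confusion matrix: rows are human ratings, columns are model ratings.
    observed = [[0] * n_ratings for _ in range(n_ratings)]
    for h, m in zip(human, model):
        observed[h - min_rating][m - min_rating] += 1

    # Marginal histograms give the expected matrix under chance agreement.
    hist_human = [sum(row) for row in observed]
    hist_model = [sum(observed[i][j] for i in range(n_ratings))
                  for j in range(n_ratings)]

    numerator = 0.0
    denominator = 0.0
    for i in range(n_ratings):
        for j in range(n_ratings):
            # Disagreements are penalized by the squared rating distance.
            weight = (i - j) ** 2 / (n_ratings - 1) ** 2
            expected = hist_human[i] * hist_model[j] / n_items
            numerator += weight * observed[i][j]
            denominator += weight * expected
    return 1.0 - numerator / denominator
```

A value of 1 indicates perfect agreement, 0 indicates chance-level agreement, and negative values indicate agreement worse than chance.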
Local coherence relations between two phrases or sentences, such as cause-effect and contrast, strongly influence whether a text is well-structured. This paper follows this assumption and presents a method for scoring text clarity by utilizing local coherence between adjacent sentences. We hypothesize that contextual features of coherence relations learned from data different from the target training data can also discriminate how well-structured the target text is, and thus help to score text clarity. We propose a text clarity scoring method that utilizes local coherence analysis in an out-domain setting, i.e., the training data for the source and target tasks are different from each other. Using the pre-trained language model BERT, the method first trains the local coherence model as an auxiliary task and then re-trains it together with the text clarity scoring model. The experimental results on the PeerRead benchmark dataset show an improvement over a single text clarity scoring model. Our source code is available online.
The data imbalance problem is a crucial issue in multi-label text classification. Some existing works tackle it by proposing imbalanced loss objectives instead of the vanilla cross-entropy loss, but their performance remains limited in cases of extremely imbalanced data. We propose a hybrid solution that adapts general networks for the head categories and few-shot techniques for the tail categories. Specifically, we propose a Hybrid-Siamese Convolutional Neural Network (HSCNN) with additional technical attributes, i.e., a multi-task architecture based on Single and Siamese networks, a category-specific similarity in the Siamese structure, and a specific sampling method for training HSCNN. The results using two benchmark datasets and three loss objectives show that our method can improve the performance of Single networks with diverse loss objectives on the tail or entire categories.
Machine metaphor understanding is one of the major topics in NLP. Most recent attempts treat it as a classification or sequence tagging task. However, few studies introduce rich linguistic information into the field of computational metaphor by leveraging powerful pre-trained language models. We focus on a novel reading comprehension paradigm for solving the token-level metaphor detection task, which provides a new type of solution for this task. We propose an end-to-end deep metaphor detection model named DeepMet based on this paradigm. The proposed approach encodes the global text context (whole sentence), local text context (sentence fragments), and question (query word) information, and incorporates two types of part-of-speech (POS) features, by making use of an advanced pre-trained language model. The experimental results on several metaphor datasets show that our model achieves competitive results in the second shared task on metaphor detection.
When a deep learning (machine learning) framework is used for emotion classification, one significant difficulty is the need to build a large emotion corpus in which each sentence is assigned emotion labels. Constructing such a corpus is costly in both time and money. Therefore, this paper proposes a method for semi-automatically constructing an emotion corpus. Sentences were mined from Twitter using emotional seed words selected from a dictionary in which the emotion words were well-defined. Tweets were retrieved by one emotional seed word, and the retrieved sentences were assigned emotion labels based on the emotion category of the seed word. We found that a deep learning-based emotion classification model could not achieve high accuracy because the semi-automatically constructed corpus contained many errors in the assigned emotion labels. We therefore propose and test an approach for improving the quality of the emotion labels by automatically correcting these errors. The experimental results showed that the proposed method worked well, improving classification accuracy from 44.9% to 55.1% on the Twitter emotion classification task.
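The seed-word labeling step can be sketched as follows. The seed dictionary, the emotion categories, and the rule of keeping only sentences whose seed words all belong to a single category are hypothetical choices made for illustration; the actual dictionary and retrieval procedure are not specified here.

```python
# Hypothetical seed dictionary: seed word -> emotion category.
SEED_WORDS = {
    "happy": "joy",
    "delighted": "joy",
    "furious": "anger",
    "terrified": "fear",
}


def label_by_seed(sentences, seed_words):
    """Assign each sentence the emotion category of the seed word it contains.

    Sentences with no seed word, or with seed words from more than one
    category, are skipped (an assumption of this sketch).
    """
    labeled = []
    for sentence in sentences:
        # Lowercase and strip common punctuation so "happy!" matches "happy".
        tokens = [t.strip(".,!?;:\"'").lower() for t in sentence.split()]
        categories = {seed_words[t] for t in tokens if t in seed_words}
        if len(categories) == 1:
            labeled.append((sentence, categories.pop()))
    return labeled
```

Labels produced this way are noisy, because a seed word can appear in a sentence that does not express its emotion; this is the kind of error that a later label-correction step is meant to reduce.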
Automatic prediction of the peer-review aspect scores of academic papers can be a useful assistant tool for both reviewers and authors. To handle the small size of published datasets for the target aspect score, we propose a multi-task approach that leverages additional information from other aspect scores to improve performance on the target. One of the problems in building multi-task models is how to select proper auxiliary task resources and proper shared structures. We therefore propose a multi-task shared structure encoding approach that automatically selects good shared network structures as well as good auxiliary resources. The experiments on peer-review datasets show that our approach is effective and performs better on the target scores than the single-task method and naive multi-task methods.
The target outputs of many NLP tasks are word sequences. For collecting data to train and evaluate models, the crowd is cheaper and easier to access than the oracle. To ensure the quality of crowdsourced data, one can assign multiple workers to one question and then aggregate the multiple answers of diverse quality into a golden one. How to aggregate multiple crowdsourced word sequences of diverse quality is an interesting and challenging problem, and addressing it requires a dataset. We thus create a dataset (CrowdWSA2019) that contains translated sentences generated by multiple workers. We provide three approaches as baselines for the task of extractive word sequence aggregation. In particular, one of them is an original approach we propose that models the reliability of workers. We also discuss some issues in ground truth creation for word sequences that can be addressed based on this dataset.
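Extractive aggregation means selecting one of the workers' answers rather than generating a new sequence. The sketch below shows one simple way to do this, by choosing the answer with the highest average token-level Jaccard similarity to the other answers. It is an illustrative assumption and is not claimed to be one of the three baselines described above; in particular, it does not model worker reliability.

```python
def jaccard(a, b):
    """Token-set Jaccard similarity between two whitespace-tokenized sentences."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def aggregate_extractive(answers):
    """Return the answer that is, on average, most similar to all the others."""
    if not answers:
        raise ValueError("at least one answer is required")
    if len(answers) == 1:
        return answers[0]

    def average_similarity(i):
        others = [jaccard(answers[i], answers[j])
                  for j in range(len(answers)) if j != i]
        return sum(others) / len(others)

    # Ties are resolved in favor of the earliest answer.
    best_index = max(range(len(answers)), key=average_similarity)
    return answers[best_index]
```

A reliability-aware variant could weight each similarity by an estimate of the corresponding worker's quality instead of averaging uniformly.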
Distributions of the senses of words are often highly skewed and are strongly influenced by the domain of a document. This paper follows this assumption and presents a method for text categorization that leverages the predominant sense of words depending on the domain, i.e., domain-specific senses. The key idea is that the features learned from predominant senses can discriminate the domain of the document and thus improve the overall performance of text categorization. We propose a multi-task learning framework based on the Transformer neural network model, which trains a model to simultaneously categorize documents and predict a predominant sense for each word. The experimental results using four benchmark datasets show that our method is comparable to the state-of-the-art categorization approach; in particular, our model works well for the categorization of multi-label documents.
We focus on the multi-label categorization task for short texts and explore the use of a hierarchical structure (HS) of categories. In contrast to existing work using a non-hierarchical flat model, the method leverages the hierarchical relations between the pre-defined categories to tackle the data sparsity problem. The lower the HS level, the worse the categorization performance, because the number of training data per category at a lower level is much smaller than that at an upper level. We propose an approach that effectively utilizes the data in the upper levels to contribute to categorization in the lower levels by applying a Convolutional Neural Network (CNN) with a fine-tuning technique. The results using two benchmark datasets show that the proposed method, Hierarchical Fine-Tuning based CNN (HFT-CNN), is competitive with the state-of-the-art CNN-based methods.
This paper proposes an annotation scheme for the focus of negation in Japanese text. Negation has a scope and a focus within that scope. The scope of negation is the part of the sentence that is negated; the focus is the part of the scope that is most prominently or explicitly negated. In natural language processing, correct interpretation of negated statements requires precise detection of the focus of negation in the statements. As a foundation for developing a negation focus detector for Japanese, we have annotated the text data of “Rakuten Travel: User review data” and the newspaper subcorpus of the “Balanced Corpus of Contemporary Written Japanese” with the labels proposed in our annotation scheme. We report 1,327 negation cues and their foci in the corpora, and present a classification of these foci based on syntactic types and semantic types. We also propose a system for detecting the focus of negation in Japanese using 16 heuristic rules and report the performance of the system.