This is an internal, incomplete preview of a proposed change to the ACL Anthology.
For efficiency reasons, we generate only three BibTeX files per volume, and the preview may be incomplete in other ways, or contain mistakes.
Do not treat this content as an official publication.
Prompt-based learning methods in semi-supervised learning (SSL) settings have been shown to be effective on multiple natural language understanding (NLU) datasets and tasks in the literature. However, manually designing multiple prompts and verbalizers requires domain knowledge and human effort, making it difficult and expensive to scale across different datasets. In this paper, we propose two methods to automatically design multiple prompts and integrate automatic verbalizer in SSL settings without sacrificing performance. The first method uses various demonstration examples with learnable continuous prompt tokens to create diverse prompt models. The second method uses a varying number of soft prompt tokens to encourage language models to learn different prompts. For the verbalizer, we use the prototypical verbalizer to replace the manual one. In summary, we obtained the best average accuracy of 71.5% (a relative improvement of 0.99% over even the previous state-of-the-art SSL method with manual prompts and verbalizers) in different few-shot learning settings.
In today’s interconnected and multilingual world, code-mixing of languages on social media is a common occurrence. While many Natural Language Processing (NLP) tasks like sentiment analysis are mature and well designed for monolingual text, techniques to apply these tasks to code-mixed text still warrant exploration. This paper describes our feature engineering approach to sentiment analysis in code-mixed social media text for SemEval-2020 Task 9: SentiMix. We tackle this problem by leveraging a set of hand-engineered lexical, sentiment, and metadata fea- tures to design a classifier that can disambiguate between “positive”, “negative” and “neutral” sentiment. With this model we are able to obtain a weighted F1 score of 0.65 for the “Hinglish” task and 0.63 for the “Spanglish” tasks.
An author’s way of presenting a story through his/her writing style has a great impact on whether the story will be liked by readers or not. In this paper, we learn representations for authors of literary texts together with representations for character n-grams annotated with their functional roles. We train a neural character n-gram based language model using an external corpus of literary texts and transfer learned representations for use in downstream tasks. We show that augmenting the knowledge from external works of authors produces results competitive with other style-based methods for book likability prediction, genre classification, and authorship attribution.
In this paper, we detail our work on comparing different word-level language identification systems for code-switched Hindi-English data and a standard Spanish-English dataset. In this regard, we build a new code-switched dataset for Hindi-English. To understand the code-switching patterns in these language pairs, we investigate different code-switching metrics. We find that the CRF model outperforms the neural network based models by a margin of 2-5 percentage points for Spanish-English and 3-5 percentage points for Hindi-English.
Folksonomy of movies covers a wide range of heterogeneous information about movies, like the genre, plot structure, visual experiences, soundtracks, metadata, and emotional experiences from watching a movie. Being able to automatically generate or predict tags for movies can help recommendation engines improve retrieval of similar movies, and help viewers know what to expect from a movie in advance. In this work, we explore the problem of creating tags for movies from plot synopses. We propose a novel neural network model that merges information from synopses and emotion flows throughout the plots to predict a set of tags for movies. We compare our system with multiple baselines and found that the addition of emotion flows boosts the performance of the network by learning ≈18% more tags than a traditional machine learning system.
Books have the power to make us feel happiness, sadness, pain, surprise, or sorrow. An author’s dexterity in the use of these emotions captivates readers and makes it difficult for them to put the book down. In this paper, we model the flow of emotions over a book using recurrent neural networks and quantify its usefulness in predicting success in books. We obtained the best weighted F1-score of 69% for predicting books’ success in a multitask setting (simultaneously predicting success and genre of books).
Likability prediction of books has many uses. Readers, writers, as well as the publishing industry, can all benefit from automatic book likability prediction systems. In order to make reliable decisions, these systems need to assimilate information from different aspects of a book in a sensible way. We propose a novel multimodal neural architecture that incorporates genre supervision to assign weights to individual feature types. Our proposed method is capable of dynamically tailoring weights given to feature types based on the characteristics of each book. Our architecture achieves competitive results and even outperforms state-of-the-art for this task.
Although social media has made it easy for people to connect on a virtually unlimited basis, it has also opened doors to people who misuse it to undermine, harass, humiliate, threaten and bully others. There is a lack of adequate resources to detect and hinder its occurrence. In this paper, we present our initial NLP approach to detect invective posts as a first step to eventually detect and deter cyberbullying. We crawl data containing profanities and then determine whether or not it contains invective. Annotations on this data are improved iteratively by in-lab annotations and crowdsourcing. We pursue different NLP approaches containing various typical and some newer techniques to distinguish the use of swear words in a neutral way from those instances in which they are used in an insulting way. We also show that this model not only works for our data set, but also can be successfully applied to different data sets.
Named Entity Recognition for social media data is challenging because of its inherent noisiness. In addition to improper grammatical structures, it contains spelling inconsistencies and numerous informal abbreviations. We propose a novel multi-task approach by employing a more general secondary task of Named Entity (NE) segmentation together with the primary task of fine-grained NE categorization. The multi-task neural network architecture learns higher order feature representations from word and character sequences along with basic Part-of-Speech tags and gazetteer information. This neural network acts as a feature extractor to feed a Conditional Random Fields classifier. We were able to obtain the first position in the 3rd Workshop on Noisy User-generated Text (WNUT-2017) with a 41.86% entity F1-score and a 40.24% surface F1-score.
We investigate the value of feature engineering and neural network models for predicting successful writing. Similar to previous work, we treat this as a binary classification task and explore new strategies to automatically learn representations from book contents. We evaluate our feature set on two different corpora created from Project Gutenberg books. The first presents a novel approach for generating the gold standard labels for the task and the other is based on prior research. Using a combination of hand-crafted and recurrent neural network learned representations in a dual learning setting, we obtain the best performance of 73.50% weighted F1-score.
In this paper, we present our systems for the “SemEval-2017 Task-5 on Fine-Grained Sentiment Analysis on Financial Microblogs and News”. In our system, we combined hand-engineered lexical, sentiment and metadata features, the representations learned from Convolutional Neural Networks (CNN) and Bidirectional Gated Recurrent Unit (Bi-GRU) with Attention model applied on top. With this architecture we obtained weighted cosine similarity scores of 0.72 and 0.74 for subtask-1 and subtask-2, respectively. Using the official scoring system, our system ranked the second place for subtask-2 and eighth place for the subtask-1. It ranked first for both of the subtasks by the scores achieved by an alternate scoring system.
This paper describes our system submission to the CogALex-2016 Shared Task on Corpus-Based Identification of Semantic Relations. Our system won first place for Task-1 and second place for Task-2. The evaluation results of our system on the test set is 88.1% (79.0% for TRUE only) f-measure for Task-1 on detecting semantic similarity, and 76.0% (42.3% when excluding RANDOM) for Task-2 on identifying finer-grained semantic relations. In our experiments, we try word analogy, linear regression, and multi-task Convolutional Neural Networks (CNNs) with word embeddings from publicly available word vectors. We found that linear regression performs better in the binary classification (Task-1), while CNNs have better performance in the multi-class semantic classification (Task-2). We assume that word analogy is more suited for deterministic answers rather than handling the ambiguity of one-to-many and many-to-many relationships. We also show that classifier performance could benefit from balancing the distribution of labels in the training data.