NLP models are vulnerable to data poisoning attacks. One type of attack plants a backdoor in a model by injecting poisoned examples into the training data, causing the victim model to misclassify test instances that include a specific pattern. Although defences exist to counter these attacks, they are specific to an attack type or pattern. In this paper, we propose a generic defence mechanism that makes the training process robust to poisoning attacks through gradient shaping methods, based on differentially private training. We show that our method is highly effective in mitigating, or even eliminating, poisoning attacks on text classification, with only a small cost in predictive accuracy.
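The abstract does not spell out the gradient shaping procedure, so the following is only a minimal sketch of the clip-and-noise step commonly used in differentially private training, not the paper's implementation. The function names, `max_norm`, and `noise_multiplier` are hypothetical and chosen for illustration.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale a gradient vector so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def shaped_gradient(per_example_grads, max_norm=1.0, noise_multiplier=0.5, rng=None):
    """Clip each example's gradient, sum them, add Gaussian noise, then average.

    Assumption: this mirrors the usual DP-SGD recipe; the values of max_norm
    and noise_multiplier are illustrative, not taken from the paper.
    """
    rng = rng or random.Random()
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        clipped = clip_gradient(grad, max_norm)
        total = [t + c for t, c in zip(total, clipped)]
    # Noise standard deviation is proportional to the clipping bound.
    sigma = noise_multiplier * max_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [n / len(per_example_grads) for n in noisy]
```

Clipping bounds how far any single example can move the update, and the added noise masks what remains. A plausible reading of the abstract is that this is what limits the influence of a small set of poisoned examples.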
Organisations are monitoring their Social License to Operate (SLO) with increasing regularity. SLO, the level of support organisations gain from the public, is typically assessed through surveys or focus groups, which require expensive manual effort and yield results that quickly become outdated. In this paper, we present SIRTA (Social Insight via Real-Time Text Analytics), a novel real-time text analytics system for assessing and monitoring organisations’ SLO levels by analysing the public discourse in social posts. To assess SLO levels, our insight is to extract people's stances towards an organisation and transform them into SLO levels. SIRTA achieves this by performing a chain of three text classification tasks: it identifies task-relevant social posts, discovers key SLO risks discussed in the posts, and infers stances specific to the SLO risks. We leverage recent language understanding techniques (e.g., BERT) for building our classifiers. To monitor SLO levels over time, SIRTA employs quality control mechanisms to reliably identify SLO trends and variations across multiple organisations in a market. These are derived from the time series of their SLO levels, smoothed with an exponentially weighted moving average (EWMA). Our experimental results show that SIRTA is highly effective in distilling stances from social posts for SLO level assessment, and that the continuous monitoring of SLO levels afforded by SIRTA enables the early detection of critical SLO changes.
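The EWMA smoothing mentioned above can be sketched as follows. This is the standard recurrence s_t = alpha * x_t + (1 - alpha) * s_{t-1}, not SIRTA's actual code; the function name `ewma`, the default `alpha`, and the choice to seed the series with the first observation are assumptions made for illustration.

```python
def ewma(values, alpha=0.3):
    """Smooth a series with an exponentially weighted moving average.

    Assumption: the first smoothed value equals the first observation;
    the abstract does not say how the series is initialised.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x in values:
        if not smoothed:
            smoothed.append(float(x))
        else:
            # Recent observations get weight alpha; history gets 1 - alpha.
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed
```

A larger `alpha` tracks recent changes in the SLO level more closely, while a smaller one suppresses short-lived fluctuations.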
We identify agreement and disagreement between utterances that express stances towards a topic of discussion. Existing methods focus mainly on conversational settings, where dialogic features are used for (dis)agreement inference. We extend this scope and seek to detect stance (dis)agreement in a broader setting, where independent stance-bearing utterances, which prevail in many stance corpora and real-world scenarios, are compared. To cope with such non-dialogic utterances, we find that the reasons uttered to back up a specific stance can help predict stance (dis)agreements. We propose a reason comparing network (RCN) to leverage reason information for stance comparison. Empirical results on a well-known stance corpus show that our method can discover useful reason information, enabling it to outperform several baselines in stance (dis)agreement detection.
In stance classification, the target on which the stance is made defines the boundary of the task, and a classifier is usually trained for prediction on the same target. In this work, we explore the potential for generalizing classifiers between different targets, and propose a neural model that can apply what has been learned from a source target to a destination target. We show that our model can find useful information shared between relevant targets, which improves generalization in certain scenarios.