Daniel Hardt

2025

Sparks of Pure Competence in LLMs: the Case of Syntactic Center Embedding in English
Daniel Hardt
Proceedings of the Society for Computation in Linguistics 2025

2023

Ellipsis-Dependent Reasoning: a New Challenge for Large Language Models
Daniel Hardt
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

We propose a novel challenge for large language models: ellipsis-dependent reasoning. We define several structures of paired examples, where an ellipsis example is matched to its non-ellipsis counterpart, and a question is posed which requires resolution of the ellipsis. Test results show that the best models perform well on non-elliptical examples but struggle with all but the simplest ellipsis structures.

2021

Ellipsis Resolution as Question Answering: An Evaluation
Rahul Aralikatte | Matthew Lamm | Daniel Hardt | Anders Søgaard
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Most, if not all, forms of ellipsis (e.g., so does Mary) are similar to reading comprehension questions (what does Mary do), in that in order to resolve them, we need to identify an appropriate text span in the preceding discourse. Following this observation, we present an alternative approach for English ellipsis resolution relying on architectures developed for question answering (QA). We present both single-task models and joint models trained on auxiliary QA and coreference resolution datasets, clearly outperforming the current state of the art for Sluice Ellipsis (from 70.00 to 86.01 F1) and Verb Phrase Ellipsis (from 72.89 to 78.66 F1).

Universal Joy A Data Set and Results for Classifying Emotions Across Languages
Sotiris Lamprinidis | Federico Bianchi | Daniel Hardt | Dirk Hovy
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

While emotions are universal aspects of human psychology, they are expressed differently across different languages and cultures. We introduce a new data set of over 530k anonymized public Facebook posts across 18 languages, labeled with five different emotions. Using multilingual BERT embeddings, we show that emotions can be reliably inferred both within and across languages. Zero-shot learning produces promising results for low-resource languages. Following established theories of basic emotions, we provide a detailed analysis of the possibilities and limits of cross-lingual emotion classification. We find that structural and typological similarity between languages facilitates cross-lingual learning, as does linguistic diversity of training data. Our results suggest that there are commonalities underlying the expression of emotion in different languages. We publicly release the anonymized data for future research.

2010

Incremental Re-training for Post-editing SMT
Daniel Hardt | Jakob Elming
Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers

A method is presented for incremental re-training of an SMT system, in which a local phrase table is created and incrementally updated as a file is translated and post-edited. It is shown that translation data from within the same file has higher value than other domain-specific data. In two technical domains, within-file data increases BLEU score by several full points. Furthermore, a strong recency effect is documented; nearby data within the file has greater value than more distant data. It is also shown that the value of translation data is strongly correlated with a metric defined over new occurrences of n-grams. Finally, it is argued that the incremental re-training prototype could serve as the basis for a practical system which could be interactively updated in real time in a post-editing setting. Based on the results here, such an interactive system has the potential to dramatically improve translation quality.
