Benjamin Han


Construction of Paired Knowledge Graph - Text Datasets Informed by Cyclic Evaluation
Ali Mousavi | Xin Zhan | He Bai | Peng Shi | Theodoros Rekatsinas | Benjamin Han | Yunyao Li | Jeffrey Pound | Joshua M. Susskind | Natalie Schluter | Ihab F. Ilyas | Navdeep Jaitly
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Datasets that pair Knowledge Graphs (KG) and text together (KG-T) can be used to train forward and reverse neural models that generate text from KG and vice versa. However, models trained on datasets where KG and text pairs are not equivalent can suffer from more hallucination and poorer recall. In this paper, we verify this empirically by generating datasets with different levels of noise and find that noisier datasets do indeed lead to more hallucination. We argue that the ability of forward and reverse models trained on a dataset to cyclically regenerate source KG or text is a proxy for the equivalence between the KG and the text in the dataset. Using cyclic evaluation we find that manually created WebNLG is much better than automatically created TeKGen and T-REx. Informed by these observations, we construct a new, improved dataset called LAGRANGE using heuristics meant to improve equivalence between KG and text and show the impact of each of the heuristics on cyclic evaluation. We also construct two synthetic datasets using large language models (LLMs), and observe that these are conducive to models that perform significantly well on cyclic generation of text, but less so on cyclic generation of KGs, probably because of a lack of a consistent underlying ontology.
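
The cyclic evaluation idea in this abstract can be illustrated with a minimal sketch. It assumes a KG is a set of (subject, relation, object) triples and that regeneration quality is scored by exact-match precision, recall, and F1 over triples; the abstract does not state the paper's actual metrics, so that scoring is an assumption. The names `triple_prf` and `cyclic_kg_score`, and the callables `kg_to_text` and `text_to_kg`, are hypothetical stand-ins for the trained forward and reverse models.

```python
from typing import Callable, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def triple_prf(source: Iterable[Triple],
               regenerated: Iterable[Triple]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 between two sets of triples.

    Low precision suggests hallucinated triples; low recall suggests
    triples from the source that were lost in the cycle.
    """
    src, regen = set(source), set(regenerated)
    overlap = len(src & regen)
    precision = overlap / len(regen) if regen else 0.0
    recall = overlap / len(src) if src else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def cyclic_kg_score(kg: Set[Triple],
                    kg_to_text: Callable[[Set[Triple]], str],
                    text_to_kg: Callable[[str], Set[Triple]]
                    ) -> Tuple[float, float, float]:
    """Run KG -> text -> KG and score the regenerated KG against the source.

    kg_to_text and text_to_kg are hypothetical placeholders for the
    forward and reverse models trained on the dataset under evaluation.
    """
    text = kg_to_text(kg)
    return triple_prf(kg, text_to_kg(text))
```

Under this sketch, a dataset whose KG and text sides are closer to equivalent would be expected to yield models that score higher on the cycle; the same pattern applies in the text -> KG -> text direction with a text similarity measure in place of triple overlap.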


FLEEK: Factual Error Detection and Correction with Evidence Retrieved from External Knowledge
Farima Fatahi Bayat | Kun Qian | Benjamin Han | Yisi Sang | Anton Belyy | Samira Khorshidi | Fei Wu | Ihab Ilyas | Yunyao Li
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Detecting factual errors of textual information, whether generated by large language models (LLMs) or curated by humans, is crucial for making informed decisions. LLMs’ inability to attribute their claims to external knowledge and their tendency to hallucinate make it difficult to rely on their responses. Humans, too, are prone to factual errors in their writing. Since manual detection and correction of factual errors is labor-intensive, developing an automatic approach can greatly reduce human effort. We present a prototype tool that automatically extracts factual claims from text, gathers evidence from external knowledge sources, evaluates the factuality of each claim, and suggests revisions for identified errors using the collected evidence. Initial empirical evaluation on fact error detection (77-85% F1) shows the potential of our tool.


Strategies to Improve Few-shot Learning for Intent Classification and Slot-Filling
Samyadeep Basu | Amr Sharaf | Karine Ip Kiun Chong | Alex Fischer | Vishal Rohra | Michael Amoake | Hazem El-Hammamy | Ehi Nosakhare | Vijay Ramani | Benjamin Han
Proceedings of the Workshop on Structured and Unstructured Knowledge Integration (SUKI)

Intent classification (IC) and slot filling (SF) are two fundamental tasks in modern Natural Language Understanding (NLU) systems. Collecting and annotating large amounts of data to train deep learning models for such systems is not scalable. This problem can be addressed by learning from few examples using fast supervised meta-learning techniques such as prototypical networks. In this work, we systematically investigate how contrastive learning and data augmentation methods can benefit these existing meta-learning pipelines for jointly modelled IC/SF tasks. Through extensive experiments across standard IC/SF benchmarks (SNIPS and ATIS), we show that our proposed approaches outperform standard meta-learning methods: contrastive losses as a regularizer in conjunction with prototypical networks consistently outperform the existing state-of-the-art for both IC and SF tasks, while data augmentation strategies primarily improve few-shot IC by a significant margin.


Understanding Temporal Expressions in Emails
Benjamin Han | Donna Gates | Lori Levin
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference


Domain Portability in Speech-to-Speech Translation
Alon Lavie | Lori Levin | Tanja Schultz | Chad Langley | Benjamin Han | Alicia Tribble | Donna Gates | Dorcas Wallace | Kay Peterson
Proceedings of the First International Conference on Human Language Technology Research