Rune Sætre

Also published as: Rune Saetre

2024

A Comprehensive Evaluation of Inductive Reasoning Capabilities and Problem Solving in Large Language Models
Chen Bowen | Rune Sætre | Yusuke Miyao
Findings of the Association for Computational Linguistics: EACL 2024

Inductive reasoning is fundamental to both human and artificial intelligence. The inductive reasoning abilities of current Large Language Models (LLMs) are evaluated in this research. We argue that only considering induction of rules is too narrow and unrealistic, since inductive reasoning is usually mixed with other abilities, like rules application, results/rules validation, and updated information integration. We probed the LLMs with a set of designed symbolic tasks and found that even state-of-the-art (SotA) LLMs fail significantly, showing the inability of LLMs to perform these intuitively simple tasks. Furthermore, we found that perfect accuracy in a small-size problem does not guarantee the same accuracy in a larger-size version of the same problem, provoking the question of how we can assess the LLMs’ actual problem-solving capabilities. We also argue that Chain-of-Thought prompts help the LLMs by decomposing the problem-solving process, but the LLMs still learn limitedly. Furthermore, we reveal that few-shot examples assist LLM generalization in out-of-domain (OOD) cases, albeit limited. The LLM starts to fail when the problem deviates from the provided few-shot examples.

An Approach to Co-reference Resolution and Formula Grounding for Mathematical Identifiers Using Large Language Models
Aamin Dev | Takuto Asakura | Rune Sætre
Proceedings of the 2nd Workshop on Mathematical Natural Language Processing @ LREC-COLING 2024

This paper outlines an automated approach to annotating mathematical identifiers in scientific papers, a process that has historically been laborious and costly. We employ state-of-the-art LLMs, including GPT-3.5 and GPT-4, and open-source alternatives to generate a dictionary for annotating mathematical identifiers, linking each identifier to its conceivable descriptions and then assigning these definitions to the respective identifier instances based on context. Evaluation metrics include the CoNLL score for co-reference cluster quality and semantic correctness of the annotations.

2017

NTNU-1@ScienceIE at SemEval-2017 Task 10: Identifying and Labelling Keyphrases with Conditional Random Fields
Erwin Marsi | Utpal Kumar Sikdar | Cristina Marco | Biswanath Barik | Rune Sætre
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

We present NTNU’s systems for Task A (prediction of keyphrases) and Task B (labelling as Material, Process or Task) at SemEval 2017 Task 10: Extracting Keyphrases and Relations from Scientific Publications (Augenstein et al., 2017). Our approach relies on supervised machine learning using Conditional Random Fields. Our system yields a micro F-score of 0.34 for Tasks A and B combined on the test data. For Task C (relation extraction), we relied on an independently developed system described in (Barik and Marsi, 2017). For the full Scenario 1 (including relations), our approach reaches a micro F-score of 0.33 (5th place). Here we describe our systems, report results and discuss errors.

2016

IDI@NTNU at SemEval-2016 Task 6: Detecting Stance in Tweets Using Shallow Features and GloVe Vectors for Word Representation
Henrik Bøhler | Petter Asla | Erwin Marsi | Rune Sætre
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

Many systems have been developed in the past few years to assist researchers in the discovery of knowledge published as English text, for example in the PubMed database. At the same time, higher level collective knowledge is often published using a graphical notation representing all the entities in a pathway and their interactions. We believe that these pathway visualizations could serve as an effective user interface for knowledge discovery if they can be linked to the text in publications. Since the graphical elements in a Pathway are of a very different nature than their corresponding descriptions in English text, we developed a prototype system called PathText. The goal of PathText is to serve as a bridge between these two different representations. In this paper, we first describe the overall architecture and the interfaces of the PathText system, and then provide some details about the core Text Mining components.

Task-oriented Evaluation of Syntactic Parsers and Their Representations
Yusuke Miyao | Rune Sætre | Kenji Sagae | Takuya Matsuzaki | Jun’ichi Tsujii
Proceedings of ACL-08: HLT

Evaluating the Effects of Treebank Size in a Practical Application for Parsing
Kenji Sagae | Yusuke Miyao | Rune Saetre | Jun’ichi Tsujii
Software Engineering, Testing, and Quality Assurance for Natural Language Processing

Raising the Compatibility of Heterogeneous Annotations: A Case Study on
Yue Wang | Kazuhiro Yoshida | Jin-Dong Kim | Rune Saetre | Jun’ichi Tsujii
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing