Isaiah Onando Mulang’

Also published as: Isaiah Onando Mulang’

2021

pdf abs
CHOLAN: A Modular Approach for Neural Entity Linking on Wikipedia and Wikidata
Manoj Prabhakar Kannan Ravi | Kuldeep Singh | Isaiah Onando Mulang’ | Saeedeh Shekarpour | Johannes Hoffart | Jens Lehmann
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

In this paper, we propose CHOLAN, a modular approach to target end-to-end entity linking (EL) over knowledge bases. CHOLAN consists of a pipeline of two transformer-based models integrated sequentially to accomplish the EL task. The first transformer model identifies surface forms (entity mentions) in a given text. For each mention, a second transformer model is employed to classify the target entity among a predefined candidates list. The latter transformer is fed by an enriched context captured from the sentence (i.e. local context), and entity description gained from Wikipedia. Such external contexts have not been used in state of the art EL approaches. Our empirical study was conducted on two well-known knowledge bases (i.e., Wikidata and Wikipedia). The empirical results suggest that CHOLAN outperforms state-of-the-art approaches on standard datasets such as CoNLL-AIDA, MSNBC, AQUAINT, ACE2004, and T-REx.

2020

pdf abs
Fine-tuning BERT with Focus Words for Explanation Regeneration
Isaiah Onando Mulang’ | Jennifer D’Souza | Sören Auer
Proceedings of the Ninth Joint Conference on Lexical and Computational Semantics

Explanation generation introduced as the world tree corpus (Jansen et al., 2018) is an emerging NLP task involving multi-hop inference for explaining the correct answer in multiple-choice QA. It is a challenging task evidenced by low state-of-the-art performances(below 60% in F-score) demonstrated on the task. Of the state-of-the-art approaches, fine-tuned transformer-based (Vaswani et al., 2017) BERT models have shown great promise toward continued system performance improvements compared with approaches relying on surface-level cues alone that demonstrate performance saturation. In this work, we take a novel direction by addressing a particular linguistic characteristic of the data — we introduce a novel and lightweight focus feature in the transformer-based model and examine task improvements. Our evaluations reveal a significantly positive impact of this lightweight focus feature achieving the highest scores, second only to a significantly computationally intensive system.

2019

pdf abs
Old is Gold: Linguistic Driven Approach for Entity and Relation Linking of Short Text
Ahmad Sakor | Isaiah Onando Mulang’ | Kuldeep Singh | Saeedeh Shekarpour | Maria Esther Vidal | Jens Lehmann | Sören Auer
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Short texts challenge NLP tasks such as named entity recognition, disambiguation, linking and relation inference because they do not provide sufficient context or are partially malformed (e.g. wrt. capitalization, long tail entities, implicit relations). In this work, we present the Falcon approach which effectively maps entities and relations within a short text to its mentions of a background knowledge graph. Falcon overcomes the challenges of short text using a light-weight linguistic approach relying on a background knowledge graph. Falcon performs joint entity and relation linking of a short text by leveraging several fundamental principles of English morphology (e.g. compounding, headword identification) and utilizes an extended knowledge graph created by merging entities and relations from various knowledge sources. It uses the context of entities for finding relations and does not require training data. Our empirical study using several standard benchmarks and datasets show that Falcon significantly outperforms state-of-the-art entity and relation linking for short text query inventories.

pdf abs
Team SVMrank: Leveraging Feature-rich Support Vector Machines for Ranking Explanations to Elementary Science Questions
Jennifer D’Souza | Isaiah Onando Mulang’ | Sören Auer
Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-13)

The TextGraphs 2019 Shared Task on Multi-Hop Inference for Explanation Regeneration (MIER-19) tackles explanation generation for answers to elementary science questions. It builds on the AI2 Reasoning Challenge 2018 (ARC-18) which was organized as an advanced question answering task on a dataset of elementary science questions. The ARC-18 questions were shown to be hard to answer with systems focusing on surface-level cues alone, instead requiring far more powerful knowledge and reasoning. To address MIER-19, we adopt a hybrid pipelined architecture comprising a featurerich learning-to-rank (LTR) machine learning model, followed by a rule-based system for reranking the LTR model predictions. Our system was ranked fourth in the official evaluation, scoring close to the second and third ranked teams, achieving 39.4% MAP.

Co-authors

Jennifer D’Souza 2

Ahmad Sakor 1

Maria Esther Vidal 1

Manoj Prabhakar Kannan Ravi 1

Abhishek Nadgeri 1

Anson Bastos 1

Vijay Saraswat 1