Tijl De Bie

Also published as: Tijl de Bie, Tijl De Bie


2025

Building Data-Driven Occupation Taxonomies: A Bottom-Up Multi-Stage Approach via Semantic Clustering and Multi-Agent Collaboration
Nan Li | Bo Kang | Tijl De Bie
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track

Creating robust occupation taxonomies, vital for applications ranging from job recommendation to labor market intelligence, is challenging. Manual curation is slow, while existing automated methods are either not adaptive to dynamic regional markets (top-down) or struggle to build coherent hierarchies from noisy data (bottom-up). We introduce CLIMB (CLusterIng-based Multi-agent taxonomy Builder), a framework that fully automates the creation of high-quality, data-driven taxonomies from raw job postings. CLIMB uses global semantic clustering to distill core occupations, then employs a reflection-based multi-agent system to iteratively build a coherent hierarchy. On three diverse, real-world datasets, we show that CLIMB produces taxonomies that are more coherent and scalable than existing methods and successfully capture unique regional characteristics. We release our code and datasets at https://github.com/aida-ugent/CLIMB.

Human-AI Moral Judgment Congruence on Real-World Scenarios: A Cross-Lingual Analysis
Nan Li | Bo Kang | Tijl De Bie
Proceedings of the 9th Widening NLP Workshop

As Large Language Models (LLMs) are deployed in every aspect of our lives, understanding how they reason about moral issues becomes critical for AI safety. We investigate this using a dataset we curated from Reddit’s r/AmItheAsshole, comprising real-world moral dilemmas with crowd-sourced verdicts. Through experiments on five state-of-the-art LLMs across 847 posts, we find a significant and systematic divergence where LLMs are more lenient than humans. Moreover, we find that translating the posts into another language changes LLMs’ verdicts, indicating their judgments lack cross-lingual stability.

2024

TopoLedgerBERT: Topological Learning of Ledger Description Embeddings using Siamese BERT-Networks.
Sander Noels | Sébastien Viaene | Tijl De Bie
Proceedings of the Eighth Financial Technology and Natural Language Processing and the 1st Agent AI for Scenario Planning

2009

Learning to translate: a statistical and computational analysis
Marco Turchi | Tijl de Bie | Nello Cristianini
Proceedings of the 13th Annual Conference of the European Association for Machine Translation

2008

Learning Performance of a Machine Translation System: a Statistical and Computational Analysis
Marco Turchi | Tijl De Bie | Nello Cristianini
Proceedings of the Third Workshop on Statistical Machine Translation