M. Dolores Jiménez-López

Also published as: Maria Dolores Jiménez-López, M. Dolores Jiménez López

2025

pdf bib abs
Beyond the Data: The Impact of Annotation Inconsistencies in UD Treebanks on Typological Universals and Complexity Assessment
Antoni Brosa Rodríguez | M. Dolores Jiménez López
Proceedings of the 7th Workshop on Research in Computational Linguistic Typology and Multilingual NLP

This study explores the impact of annotation inconsistencies in Universal Dependencies (UD) treebanks on typological research in computational linguistics. UD provides a standardized framework for cross-linguistic annotation, facilitating large-scale empirical studies on linguistic diversity and universals. However, despite rigorous guidelines, annotation inconsistencies persist across treebanks. The objective of this paper is to assess how these inconsistencies affect typological universals, linguistic descriptions, and complexity metrics. We analyze systematic annotation errors in multiple UD treebanks, focusing on morphological features. Case studies on Spanish and Dutch demonstrate how differing annotation decisions within the same language create contradictory typological profiles. We classify the errors into two main categories: overgeneration errors (features incorrectly annotated, since do not actually exist in a language) and data omission errors (inconsistent or incomplete annotation of features that do exist). Our results show that these inconsistencies significantly distort typological analyses, leading to false generalizations and miscalculations of linguistic complexity. We propose methodological safeguards for typological research using UD data. Our findings highlight the need for methodological improvements to ensure more reliable cross-linguistic generalizations in computational typology.

2018

pdf bib
Proceedings of the Workshop on Linguistic Complexity and Natural Language Processing
Leonor Becerra-Bonache | M. Dolores Jiménez-López | Carlos Martín-Vide | Adrià Torrens-Urrutia
Proceedings of the Workshop on Linguistic Complexity and Natural Language Processing

pdf bib abs
A Gold Standard to Measure Relative Linguistic Complexity with a Grounded Language Learning Model
Leonor Becerra-Bonache | Henning Christiansen | M. Dolores Jiménez-López
Proceedings of the Workshop on Linguistic Complexity and Natural Language Processing

This paper focuses on linguistic complexity from a relative perspective. It presents a grounded language learning system that can be used to study linguistic complexity from a developmental point of view and introduces a tool for generating a gold standard in order to evaluate the performance of the learning system. In general, researchers agree that it is more feasible to approach complexity from an objective or theory-oriented viewpoint than from a subjective or user-related point of view. Studies that have adopted a relative complexity approach have showed some preferences for L2 learners. In this paper, we try to show that computational models of the process of language acquisition may be an important tool to consider children and the process of first language acquisition as suitable candidates for evaluating the complexity of languages.

2016

pdf bib abs
Could Machine Learning Shed Light on Natural Language Complexity?
Maria Dolores Jiménez-López | Leonor Becerra-Bonache
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC)

In this paper, we propose to use a subfield of machine learning –grammatical inference– to measure linguistic complexity from a developmental point of view. We focus on relative complexity by considering a child learner in the process of first language acquisition. The relevance of grammatical inference models for measuring linguistic complexity from a developmental point of view is based on the fact that algorithms proposed in this area can be considered computational models for studying first language acquisition. Even though it will be possible to use different techniques from the field of machine learning as computational models for dealing with linguistic complexity -since in any model we have algorithms that can learn from data-, we claim that grammatical inference models offer some advantages over other tools.

2006

pdf bib abs
Ambiguous Turn-Taking Games in Conversations
Gemma Bel-Enguix | Maria Dolores Jiménez-López
Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. Posters

Human-computer interfaces require models of dialogue structure that capture the variability and unpredictability within dialogue. Semantic and pragmatic context are continuously evolving during conversation, especially by the distribution of turns that have a direct effect in dialogue exchanges. In this paper we use a formal language paradigm for modelling multi-agent system conversations. Our computational model combines pragmatic minimal units –speech acts– for constructing dialogues. In this framework, we show how turn-taking distribution can be ambiguous and propose an algorithm for solving it, considering turn coherence, trajectories and turn pairing. Finally, we suggest overlapping as one of the possible phenomena emerging from an unresolved turn-taking.

Co-authors

Adrià Torrens Urrutia 1

Venues

Fix author