Valle Ruíz-Fernández
Also published as: Valle Ruiz-Fernández
2024
BSC-LANGTECH at FIGNEWS 2024 Shared Task: Exploring Semi-Automatic Bias Annotation using Frame Analysis
Valle Ruiz-Fernández
|
José Saiz
|
Aitor Gonzalez-Agirre
Proceedings of The Second Arabic Natural Language Processing Conference
This paper introduces the methodology of BSC-LANGTECH team for the FIGNEWS 2024 Shared Task on News Media Narratives. Following the bias annotation subtask, we apply the theory and methods of framing analysis to develop guidelines to annotate bias in the corpus provided by the task organizators. The manual annotation of a subset, with which a moderate IAA agreement has been achieved, is further used in Deep Learning techniques to explore automatic annotation and test the reliability of our framework.
EsCoLA: Spanish Corpus of Linguistic Acceptability
Nuria Bel
|
Marta Punsola
|
Valle Ruíz-Fernández
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Acceptability is one of the General Language Understanding Evaluation Benchmark (GLUE) probing tasks proposed to assess the linguistic capabilities acquired by a deep-learning transformer-based language model (LM). In this paper, we introduce the Spanish Corpus of Linguistic Acceptability EsCoLA. EsCoLA has been developed following the example of other linguistic acceptability data sets for English, Italian, Norwegian or Russian, with the aim of having a complete GLUE benchmark for Spanish. EsCoLA consists of 11,174 sentences and their acceptability judgements as found in well-known Spanish reference grammars. Additionally, all sentences have been annotated with the class of linguistic phenomenon the sentence is an example of, also following previous practices. We also provide as task baselines the results of fine-tuning four different language models with this data set and the results of a human annotation experiment. Results are also analyzed and commented to guide future research. EsCoLA is released under a CC-BY 4.0 license and freely available at https://doi.org/10.34810/data1138.
Search