2023
pdf
bib
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)
Ekaterina Kochmar
|
Jill Burstein
|
Andrea Horbach
|
Ronja Laarmann-Quante
|
Nitin Madnani
|
Anaïs Tack
|
Victoria Yaneva
|
Zheng Yuan
|
Torsten Zesch
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)
pdf
abs
The BEA 2023 Shared Task on Generating AI Teacher Responses in Educational Dialogues
Anaïs Tack
|
Ekaterina Kochmar
|
Zheng Yuan
|
Serge Bibauw
|
Chris Piech
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)
This paper describes the results of the first shared task on generation of teacher responses in educational dialogues. The goal of the task was to benchmark the ability of generative language models to act as AI teachers, replying to a student in a teacher-student dialogue. Eight teams participated in the competition hosted on CodaLab and experimented with a wide variety of state-of-the-art models, including Alpaca, Bloom, DialoGPT, DistilGPT-2, Flan-T5, GPT- 2, GPT-3, GPT-4, LLaMA, OPT-2.7B, and T5- base. Their submissions were automatically scored using BERTScore and DialogRPT metrics, and the top three among them were further manually evaluated in terms of pedagogical ability based on Tack and Piech (2022). The NAISTeacher system, which ranked first in both automated and human evaluation, generated responses with GPT-3.5 Turbo using an ensemble of prompts and DialogRPT-based ranking of responses for given dialogue contexts. Despite promising achievements of the participating teams, the results also highlight the need for evaluation metrics better suited to educational contexts.
2022
pdf
bib
Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)
Ekaterina Kochmar
|
Jill Burstein
|
Andrea Horbach
|
Ronja Laarmann-Quante
|
Nitin Madnani
|
Anaïs Tack
|
Victoria Yaneva
|
Zheng Yuan
|
Torsten Zesch
Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)
pdf
abs
FABRA: French Aggregator-Based Readability Assessment toolkit
Rodrigo Wilkens
|
David Alfter
|
Xiaoou Wang
|
Alice Pintard
|
Anaïs Tack
|
Kevin P. Yancey
|
Thomas François
Proceedings of the Thirteenth Language Resources and Evaluation Conference
In this paper, we present the FABRA: readability toolkit based on the aggregation of a large number of readability predictor variables. The toolkit is implemented as a service-oriented architecture, which obviates the need for installation, and simplifies its integration into other projects. We also perform a set of experiments to show which features are most predictive on two different corpora, and how the use of aggregators improves performance over standard feature-based readability prediction. Our experiments show that, for the explored corpora, the most important predictors for native texts are measures of lexical diversity, dependency counts and text coherence, while the most important predictors for foreign texts are syntactic variables illustrating language development, as well as features linked to lexical sophistication. FABRA: have the potential to support new research on readability assessment for French.
2020
pdf
abs
Alector: A Parallel Corpus of Simplified French Texts with Alignments of Misreadings by Poor and Dyslexic Readers
Núria Gala
|
Anaïs Tack
|
Ludivine Javourey-Drevet
|
Thomas François
|
Johannes C. Ziegler
Proceedings of the Twelfth Language Resources and Evaluation Conference
In this paper, we present a new parallel corpus addressed to researchers, teachers, and speech therapists interested in text simplification as a means of alleviating difficulties in children learning to read. The corpus is composed of excerpts drawn from 79 authentic literary (tales, stories) and scientific (documentary) texts commonly used in French schools for children aged between 7 to 9 years old. The excerpts were manually simplified at the lexical, morpho-syntactic, and discourse levels in order to propose a parallel corpus for reading tests and for the development of automatic text simplification tools. A sample of 21 poor-reading and dyslexic children with an average reading delay of 2.5 years read a portion of the corpus. The transcripts of readings errors were integrated into the corpus with the goal of identifying lexical difficulty in the target population. By means of statistical testing, we provide evidence that the manual simplifications significantly reduced reading errors, highlighting that the words targeted for simplification were not only well-chosen but also substituted with substantially easier alternatives. The entire corpus is available for consultation through a web interface and available on demand for research purposes.
2018
pdf
abs
A Report on the Complex Word Identification Shared Task 2018
Seid Muhie Yimam
|
Chris Biemann
|
Shervin Malmasi
|
Gustavo Paetzold
|
Lucia Specia
|
Sanja Štajner
|
Anaïs Tack
|
Marcos Zampieri
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
We report the findings of the second Complex Word Identification (CWI) shared task organized as part of the BEA workshop co-located with NAACL-HLT’2018. The second CWI shared task featured multilingual and multi-genre datasets divided into four tracks: English monolingual, German monolingual, Spanish monolingual, and a multilingual track with a French test set, and two tasks: binary classification and probabilistic classification. A total of 12 teams submitted their results in different task/track combinations and 11 of them wrote system description papers that are referred to in this report and appear in the BEA workshop proceedings.
pdf
abs
NT2Lex: A CEFR-Graded Lexical Resource for Dutch as a Foreign Language Linked to Open Dutch WordNet
Anaïs Tack
|
Thomas François
|
Piet Desmet
|
Cédrick Fairon
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
In this paper, we introduce NT2Lex, a novel lexical resource for Dutch as a foreign language (NT2) which includes frequency distributions of 17,743 words and expressions attested in expert-written textbook texts and readers graded along the scale of the Common European Framework of Reference (CEFR). In essence, the lexicon informs us about what kind of vocabulary should be understood when reading Dutch as a non-native reader at a particular proficiency level. The main novelty of the resource with respect to the previously developed CEFR-graded lexicons concerns the introduction of corpus-based evidence for L2 word sense complexity through the linkage to Open Dutch WordNet (Postma et al., 2016). The resource thus contains, on top of the lemmatised and part-of-speech tagged lexical entries, a total of 11,999 unique word senses and 8,934 distinct synsets.
pdf
abs
Deep Learning Architecture for Complex Word Identification
Dirk De Hertog
|
Anaïs Tack
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
We describe a system for the CWI-task that includes information on 5 aspects of the (complex) lexical item, namely distributional information of the item itself, morphological structure, psychological measures, corpus-counts and topical information. We constructed a deep learning architecture that combines those features and apply it to the probabilistic and binary classification task for all English sets and Spanish. We achieved reasonable performance on all sets with best performances seen on the probabilistic task, particularly on the English news set (MAE 0.054 and F1-score of 0.872). An analysis of the results shows that reasonable performance can be achieved with a single architecture without any domain-specific tweaking of the parameter settings and that distributional features capture almost all of the information also found in hand-crafted features.
2017
pdf
abs
Human and Automated CEFR-based Grading of Short Answers
Anaïs Tack
|
Thomas François
|
Sophie Roekhaut
|
Cédrick Fairon
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications
This paper is concerned with the task of automatically assessing the written proficiency level of non-native (L2) learners of English. Drawing on previous research on automated L2 writing assessment following the Common European Framework of Reference for Languages (CEFR), we investigate the possibilities and difficulties of deriving the CEFR level from short answers to open-ended questions, which has not yet been subjected to numerous studies up to date. The object of our study is twofold: to examine the intricacy involved with both human and automated CEFR-based grading of short answers. On the one hand, we describe the compilation of a learner corpus of short answers graded with CEFR levels by three certified Cambridge examiners. We mainly observe that, although the shortness of the answers is reported as undermining a clear-cut evaluation, the length of the answer does not necessarily correlate with inter-examiner disagreement. On the other hand, we explore the development of a soft-voting system for the automated CEFR-based grading of short answers and draw tentative conclusions about its use in a computer-assisted testing (CAT) setting.
2016
pdf
abs
SVALex: a CEFR-graded Lexical Resource for Swedish Foreign and Second Language Learners
Thomas François
|
Elena Volodina
|
Ildikó Pilán
|
Anaïs Tack
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
The paper introduces SVALex, a lexical resource primarily aimed at learners and teachers of Swedish as a foreign and second language that describes the distribution of 15,681 words and expressions across the Common European Framework of Reference (CEFR). The resource is based on a corpus of coursebook texts, and thus describes receptive vocabulary learners are exposed to during reading activities, as opposed to productive vocabulary they use when speaking or writing. The paper describes the methodology applied to create the list and to estimate the frequency distribution. It also discusses some characteristics of the resulting resource and compares it to other lexical resources for Swedish. An interesting feature of this resource is the possibility to separate the wheat from the chaff, identifying the core vocabulary at each level, i.e. vocabulary shared by several coursebook writers at each level, from peripheral vocabulary which is used by the minority of the coursebook writers.
pdf
abs
Evaluating Lexical Simplification and Vocabulary Knowledge for Learners of French: Possibilities of Using the FLELex Resource
Anaïs Tack
|
Thomas François
|
Anne-Laure Ligozat
|
Cédrick Fairon
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
This study examines two possibilities of using the FLELex graded lexicon for the automated assessment of text complexity in French as a foreign language learning. From the lexical frequency distributions described in FLELex, we derive a single level of difficulty for each word in a parallel corpus of original and simplified texts. We then use this data to automatically address the lexical complexity of texts in two ways. On the one hand, we evaluate the degree of lexical simplification in manually simplified texts with respect to their original version. Our results show a significant simplification effect, both in the case of French narratives simplified for non-native readers and in the case of simplified Wikipedia texts. On the other hand, we define a predictive model which identifies the number of words in a text that are expected to be known at a particular learning level. We assess the accuracy with which these predictions are able to capture actual word knowledge as reported by Dutch-speaking learners of French. Our study shows that although the predictions seem relatively accurate in general (87.4% to 92.3%), they do not yet seem to cover the learners’ lack of knowledge very well.
pdf
abs
Modèles adaptatifs pour prédire automatiquement la compétence lexicale d’un apprenant de français langue étrangère (Adaptive models for automatically predicting the lexical competence of French as a foreign language learners)
Anaïs Tack
|
Thomas François
|
Anne-Laure Ligozat
|
Cédrick Fairon
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 2 : TALN (Articles longs)
Cette étude examine l’utilisation de méthodes d’apprentissage incrémental supervisé afin de prédire la compétence lexicale d’apprenants de français langue étrangère (FLE). Les apprenants ciblés sont des néerlandophones ayant un niveau A2/B1 selon le Cadre européen commun de référence pour les langues (CECR). À l’instar des travaux récents portant sur la prédiction de la maîtrise lexicale à l’aide d’indices de complexité, nous élaborons deux types de modèles qui s’adaptent en fonction d’un retour d’expérience, révélant les connaissances de l’apprenant. En particulier, nous définissons (i) un modèle qui prédit la compétence lexicale de tous les apprenants du même niveau de maîtrise et (ii) un modèle qui prédit la compétence lexicale d’un apprenant individuel. Les modèles obtenus sont ensuite évalués par rapport à un modèle de référence déterminant la compétence lexicale à partir d’un lexique spécialisé pour le FLE et s’avèrent gagner significativement en exactitude (9%-17%).