Maria Soledad Pera


2021

Spellchecking for Children in Web Search: a Natural Language Interface Case-study
Casey Kennington | Jerry Alan Fails | Katherine Landau Wright | Maria Soledad Pera
Proceedings of the First Workshop on Bridging Human–Computer Interaction and Natural Language Processing

Given the increasingly widespread nature of natural language interfaces, it is important to understand who is accessing those interfaces and how those interfaces are being used. In this paper, we explore spellchecking in the context of web search with children as the target audience. In particular, via a literature review we show that, while widely used, popular search tools are ill-designed for children. We then use spellcheckers as a case study to highlight the need for an interdisciplinary approach that brings together natural language processing, education, and human-computer interaction to address a known information retrieval problem: query misspelling. We conclude that it is imperative that those for whom the interfaces are designed have a voice in the design process.

2020

Hierarchical Mapping for Crosslingual Word Embedding Alignment
Ion Madrazo Azpiazu | Maria Soledad Pera
Transactions of the Association for Computational Linguistics, Volume 8

The alignment of word embedding spaces in different languages into a common crosslingual space has recently been in vogue. Strategies that do so compute pairwise alignments and then map multiple languages to a single pivot language (most often English). These strategies, however, are biased by the choice of the pivot language, given that language proximity and the linguistic characteristics of the target language can strongly impact the resultant crosslingual space, to the detriment of topologically distant languages. We present a strategy that eliminates the need for a pivot language by learning the mappings across languages in a hierarchical way. Experiments demonstrate that our strategy significantly improves vocabulary induction scores on all existing benchmarks, as well as on a new non-English–centered benchmark we built, which we make publicly available.

2019

Multiattentive Recurrent Neural Network Architecture for Multilingual Readability Assessment
Ion Madrazo Azpiazu | Maria Soledad Pera
Transactions of the Association for Computational Linguistics, Volume 7

We present a multiattentive recurrent neural network architecture for automatic multilingual readability assessment. This architecture considers raw words as its main input, but internally captures text structure and informs its word attention process using other syntax- and morphology-related data points, which are known to be of great importance to readability. This is achieved by a multiattentive strategy that allows the neural network to focus on specific parts of a text when predicting its reading level. We conducted an exhaustive evaluation using data sets targeting multiple languages and prediction task types to compare the proposed model with traditional, state-of-the-art, and other neural network strategies.