Tiago Timponi Torrent

Also published as: Tiago T. Torrent, Tiago Timponi Torrent, Tiago Torrent

2024

Semantic Permanence in Audiovisual Translation: a FrameNet approach to subtitling
Mairon Samagaio | Tiago Torrent | Ely Matos | Arthur Almeida
Proceedings of the 16th International Conference on Computational Processing of Portuguese

2023

Modeling Construction Grammar’s Way into NLP: Insights from negative results in automatically identifying schematic clausal constructions in Brazilian Portuguese
Arthur Lorenzi | Vânia Gomes de Almeida | Ely Edison Matos | Tiago Timponi Torrent
Proceedings of the First International Workshop on Construction Grammars and NLP (CxGs+NLP, GURT/SyntaxFest 2023)

This paper reports on negative results in a task of automatic identification of schematic clausal constructions and their elements in Brazilian Portuguese. The experiment was set up so as to test whether form and meaning properties of constructions, modeled in terms of Universal Dependencies and FrameNet Frames in a Constructicon, would improve the performance of transformer models in the task. Qualitative analysis of the results indicates that alternatives to the linearization of those properties, dataset size and a post-processing module should be explored in the future as a means to make use of information in Constructicons for NLP tasks.

2022

Domain Adaptation in Neural Machine Translation using a Qualia-Enriched FrameNet
Alexandre Diniz da Costa | Mateus Coutinho Marim | Ely Matos | Tiago Timponi Torrent
Proceedings of the Thirteenth Language Resources and Evaluation Conference

In this paper we present Scylla, a methodology for domain adaptation of Neural Machine Translation (NMT) systems that makes use of a multilingual FrameNet enriched with qualia relations as an external knowledge base. Domain adaptation techniques used in NMT usually require fine-tuning and in-domain training data, which may pose difficulties for those working with lesser-resourced languages and may also lead to performance decay of the NMT system for out-of-domain sentences. Scylla does not require fine-tuning of the NMT model, avoiding the risk of model over-fitting and consequent decrease in performance for out-of-domain translations. Two versions of Scylla are presented: one using the source sentence as input, and another one using the target sentence. We evaluate Scylla in comparison to a state-of-the-art commercial NMT system in an experiment in which 50 sentences from the Sports domain are translated from Brazilian Portuguese to English. The two versions of Scylla significantly outperform the baseline commercial system in HTER.

Frame Shift Prediction
Zheng Xin Yong | Patrick D. Watson | Tiago Timponi Torrent | Oliver Czulo | Collin Baker
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Frame shift is a cross-linguistic phenomenon in translation which results in corresponding pairs of linguistic material evoking different frames. The ability to predict frame shifts would enable (semi-)automatic creation of multilingual frame annotations and thus speed up FrameNet creation through annotation projection. Here, we first characterize how frame shifts result from other linguistic divergences such as translational divergences and construal differences. Our analysis also shows that many pairs of frames in frame shifts are multiple hops away from each other in Berkeley FrameNet’s net-like configuration. Then, we propose the Frame Shift Prediction task and demonstrate that our graph attention networks, combined with auxiliary training, can learn cross-linguistic frame-to-frame correspondence and predict frame shifts.
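
As an illustration of what a multi-hop distance between two frames means, the following is a minimal sketch that counts hops with a breadth-first search. It assumes the frame relations are available as an undirected adjacency mapping; the graph, the placeholder frame names and the function `hop_distance` are hypothetical and are not taken from the paper or from Berkeley FrameNet data.

```python
from collections import deque


def hop_distance(graph, source, target):
    """Return the number of relation hops between two frames, or None if unreachable.

    `graph` is a hypothetical adjacency mapping: frame name -> list of related frames.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


# Toy graph with placeholder names (an assumption, not real FrameNet relations).
toy_graph = {
    "Frame_A": ["Frame_B"],
    "Frame_B": ["Frame_A", "Frame_C"],
    "Frame_C": ["Frame_B"],
    "Frame_D": [],
}

distance = hop_distance(toy_graph, "Frame_A", "Frame_C")
```

In this toy graph, `Frame_A` and `Frame_C` are two hops apart, which is the kind of pair the abstract describes as multiple hops away.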

Lutma: A Frame-Making Tool for Collaborative FrameNet Development
Tiago Timponi Torrent | Arthur Lorenzi | Ely Edison Matos | Frederico Belcavello | Marcelo Viridiano | Maucha Andrade Gamonal
Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022

This paper presents Lutma, a collaborative, semi-constrained, tutorial-based tool for contributing frames and lexical units to the Global FrameNet initiative. The tool parameterizes the process of frame creation, avoiding consistency violations and promoting the integration of frames contributed by the community with existing frames. Lutma is structured in a wizard-like fashion so as to provide users with text and video tutorials relevant for each step in the frame creation process. We argue that this tool will allow for a sensible expansion of FrameNet coverage in terms of both languages and cultural perspectives encoded by them, positioning frames as a viable alternative for representing perspective in language models.

This paper argues in favor of the adoption of annotation practices for multimodal datasets that recognize and represent the inherently perspectivized nature of multimodal communication. To support our claim, we present a set of annotation experiments in which FrameNet annotation is applied to the Multi30k and the Flickr 30k Entities datasets. We assess the cosine similarity between the semantic representations derived from the annotation of both pictures and captions for frames. Our findings indicate that: (i) frame semantic similarity between captions of the same picture produced in different languages is sensitive to whether the caption is a translation of another caption or not, and (ii) picture annotation for semantic frames is sensitive to whether the image is annotated in presence of a caption or not.
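
The cosine similarity mentioned above can be sketched as follows. This is a minimal illustration that assumes each annotated caption or picture is reduced to a bag of frame labels. The abstract does not say how the semantic representations are built, so the bag-of-frames encoding, the function `frame_cosine` and the example labels are assumptions rather than the authors' method.

```python
from collections import Counter
from math import sqrt


def frame_cosine(frames_a, frames_b):
    """Cosine similarity between two bags of frame labels (hypothetical encoding)."""
    a, b = Counter(frames_a), Counter(frames_b)
    # Dot product over the frames that occur in both annotations.
    dot = sum(a[f] * b[f] for f in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty annotation shares nothing with anything
    return dot / (norm_a * norm_b)


# Illustrative labels only; they are not drawn from the annotated datasets.
caption_frames = ["People", "Self_motion", "Roadways"]
picture_frames = ["People", "Self_motion", "Clothing"]
similarity = frame_cosine(caption_frames, picture_frames)
```

Here two of the three frames overlap, so the similarity comes out at two thirds; identical annotations would score 1 and disjoint ones 0.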

Charon: A FrameNet Annotation Tool for Multimodal Corpora
Frederico Belcavello | Marcelo Viridiano | Ely Matos | Tiago Timponi Torrent
Proceedings of the 16th Linguistic Annotation Workshop (LAW-XVI) within LREC2022

This paper presents Charon, a web tool for annotating multimodal corpora with FrameNet categories. Annotation can be made for corpora containing both static images and video sequences paired – or not – with text sequences. The pipeline features, besides the annotation interface, corpus import and pre-processing tools.

2020

Proceedings of the International FrameNet Workshop 2020: Towards a Global, Multilingual FrameNet
Tiago T. Torrent | Collin F. Baker | Oliver Czulo | Kyoko Ohara | Miriam R. L. Petruck
Proceedings of the International FrameNet Workshop 2020: Towards a Global, Multilingual FrameNet

Beyond lexical semantics: notes on pragmatic frames
Oliver Czulo | Alexander Ziem | Tiago Timponi Torrent
Proceedings of the International FrameNet Workshop 2020: Towards a Global, Multilingual FrameNet

Framenets as an incarnation of frame semantics have been set up to deal with lexicographic issues (cf. Fillmore and Baker 2010, among others). They are thus concerned with lexical units (LUs) and the conceptual structure which categorizes these together. These lexically-evoked frames, however, do not reflect pragmatic properties of constructions (LUs and other types of constructions), such as expressing illocutions or being considered polite or very informal. From the viewpoint of a multilingual annotation effort, the Global FrameNet Shared Annotation Task, we discuss two phenomena, greetings and tag questions, which highlight the necessity both to investigate the relation between construction and frame annotation and to develop pragmatic frames describing social interactions which are not explicitly lexicalized.

Frame-Based Annotation of Multimodal Corpora: Tracking (A)Synchronies in Meaning Construction
Frederico Belcavello | Marcelo Viridiano | Alexandre Diniz da Costa | Ely Edison da Silva Matos | Tiago Timponi Torrent
Proceedings of the International FrameNet Workshop 2020: Towards a Global, Multilingual FrameNet

Multimodal aspects of human communication are key in several applications of Natural Language Processing, such as Machine Translation and Natural Language Generation. Despite recent advances in integrating multimodality into Computational Linguistics, the merge between NLP and Computer Vision techniques is still timid, especially when it comes to providing fine-grained accounts for meaning construction. This paper reports on research aiming to determine appropriate methodology and develop a computational tool to annotate multimodal corpora according to a principled structured semantic representation of events, relations and entities: FrameNet. Taking a Brazilian television travel show as corpus, a pilot study was conducted to annotate the frames that are evoked by the audio and the ones that are evoked by visual elements. We also implemented a Multimodal Annotation tool which allows annotators to choose frames and locate frame elements both in the text and in the images, while keeping track of the time span in which those elements are active in each modality. Results suggest that adding a multimodal domain to the linguistic layer of annotation and analysis contributes both to enriching the kind of information that can be tagged in a corpus and to enhancing FrameNet as a model of linguistic cognition.

(Re)construing Meaning in NLP
Sean Trott | Tiago Timponi Torrent | Nancy Chang | Nathan Schneider
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Human speakers have an extensive toolkit of ways to express themselves. In this paper, we engage with an idea largely absent from discussions of meaning in natural language understanding—namely, that the way something is expressed reflects different ways of conceptualizing or construing the information being conveyed. We first define this phenomenon more precisely, drawing on considerable prior work in theoretical cognitive semantics and psycholinguistics. We then survey some dimensions of construed meaning and show how insights from construal could inform theoretical and practical work in NLP.

Semi-supervised Deep Embedded Clustering with Anomaly Detection for Semantic Frame Induction
Zheng Xin Yong | Tiago Timponi Torrent
Proceedings of the Twelfth Language Resources and Evaluation Conference

Although FrameNet is recognized as one of the most fine-grained lexical databases, its coverage of lexical units is still limited. To tackle this issue, we propose a two-step frame induction process: for a set of lexical units not yet present in Berkeley FrameNet data release 1.7, first remove those that cannot fit into any existing semantic frame in FrameNet; then, assign the remaining lexical units to their correct frames. We also present the Semi-supervised Deep Embedded Clustering with Anomaly Detection (SDEC-AD) model—an algorithm that maps high-dimensional contextualized vector representations of lexical units to a low-dimensional latent space for better frame prediction and uses reconstruction error to identify lexical units that cannot evoke frames in FrameNet. SDEC-AD outperforms the state-of-the-art methods in both steps of the frame induction process. Empirical results also show that definitions provide contextual information for representing and characterizing the frame membership of lexical units.

2019

Designing a Frame-Semantic Machine Translation Evaluation Metric
Oliver Czulo | Tiago Timponi Torrent | Ely Edison da Silva Matos | Alexandre Diniz da Costa | Debanjana Kar
Proceedings of the Human-Informed Translation and Interpreting Technology Workshop (HiT-IT 2019)

We propose a metric for machine translation evaluation based on frame semantics which does not require the use of reference translations or human corrections, but is aimed at comparing original and translated output directly. The metric is described on the basis of an existing manual frame-semantic annotation of a parallel corpus with an English original and a Brazilian Portuguese and a German translation. We discuss implications of our metric design, including the potential of scaling it for multiple languages.