Cosmin Munteanu


A Taxonomical NLP Blueprint to Support Financial Decision Making through Information-Centred Interactions
Siavash Kazemian | Cosmin Munteanu | Gerald Penn
Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP)

Investment management professionals (IMPs) often make decisions after manual analysis of text transcripts of central banks’ conferences or companies’ earnings calls. Their current software tools, while interactive, largely leave users unassisted in using these transcripts. A key component to designing speech and NLP techniques for this community is to qualitatively characterize their perceptions of AI as well as their legitimate needs so as to (1) better apply existing NLP methods, (2) direct future research and (3) correct IMPs’ perceptions of what AI is capable of. This paper presents such a study, through a contextual inquiry with eleven IMPs, uncovering their information practices when using such transcripts. We then propose a taxonomy of user requirements and usability criteria to support IMP decision making, and validate the taxonomy through participatory design workshops with four IMPs. Our investigation suggests that: (1) IMPs view visualization methods and natural language processing algorithms primarily as time-saving tools that are incapable of enhancing either discovery or interpretation and (2) their existing software falls well short of the state of the art in both visualization and NLP.


FAB: The French Absolute Beginner Corpus for Pronunciation Training
Sean Robertson | Cosmin Munteanu | Gerald Penn
Proceedings of the Twelfth Language Resources and Evaluation Conference

We introduce the French Absolute Beginner (FAB) speech corpus. The corpus is intended for the development and study of Computer-Assisted Pronunciation Training (CAPT) tools for absolute beginner learners. Data were recorded during two experiments focusing on using a CAPT system in paired role-play tasks. The setting grants FAB three distinguishing features from other non-native corpora: the experimental setting is ecologically valid, closing the gap between training and deployment; it features a label set based on teacher feedback, allowing for context-sensitive CAPT; and data have been primarily collected from absolute beginners, a group often ignored. Participants did not read prompts, but instead recalled and modified dialogues that were modelled in videos. Unable to distinguish modelled words solely from viewing videos, speakers often uttered unintelligible or out-of-L2 words. The corpus is split into three partitions: one from an experiment with minimal feedback; another with explicit, word-level feedback; and a third with supplementary read-and-record data. A subset of words in the first partition has been labelled as more or less native, with inter-annotator agreement reported. In the explicit feedback partition, labels are derived from the experiment’s online feedback. The FAB corpus is scheduled to be made freely available by the end of 2020.


Ecological Validity and the Evaluation of Speech Summarization Quality
Anthony McCallum | Cosmin Munteanu | Gerald Penn | Xiaodan Zhu
Proceedings of Workshop on Evaluation Metrics and System Comparison for Automatic Summarization


Improving Automatic Speech Recognition for Lectures through Transformation-based Rules Learned from Minimal Data
Cosmin Munteanu | Gerald Penn | Xiaodan Zhu
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP


Optimizing Typed Feature Structure Grammar Parsing through Non-Statistical Indexing
Cosmin Munteanu | Gerald Penn
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)


A Tabulation-Based Parsing Method that Reduces Copying
Gerald Penn | Cosmin Munteanu
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics


Indexing methods for efficient parsing
Cosmin Munteanu
Proceedings of the HLT-NAACL 2003 Student Research Workshop


MDWOZ: A Wizard of Oz Environment for Dialog Systems Development
Cosmin Munteanu | Marian Boldea
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)