Abstract
Reliable tagging of Temporal Expressions (TEs, e.g., "Book a table at L’Osteria for Sunday evening") is a central requirement for Voice Assistants (VAs). However, there is a dearth of resources and systems for the VA domain, since publicly available temporal taggers are trained only on substantially different domains, such as news and clinical text. Since the cost of annotating large datasets is prohibitive, we investigate the trade-off between in-domain data and performance in DA-Time, a hybrid temporal tagger for the English VA domain that combines a neural architecture for robust TE recognition with a parser-based TE normalizer. We find that transfer learning goes a long way even with as little as 25 in-domain sentences: DA-Time performs at the state of the art on the news domain, and substantially outperforms it on the VA domain.
- Anthology ID:
- 2021.iwcs-1.14
- Volume:
- Proceedings of the 14th International Conference on Computational Semantics (IWCS)
- Month:
- June
- Year:
- 2021
- Address:
- Groningen, The Netherlands (online)
- Editors:
- Sina Zarrieß, Johan Bos, Rik van Noord, Lasha Abzianidze
- Venue:
- IWCS
- SIG:
- SIGSEM
- Publisher:
- Association for Computational Linguistics
- Pages:
- 144–154
- URL:
- https://aclanthology.org/2021.iwcs-1.14
- Cite (ACL):
- Touhidul Alam, Alessandra Zarcone, and Sebastian Padó. 2021. New Domain, Major Effort? How Much Data is Necessary to Adapt a Temporal Tagger to the Voice Assistant Domain. In Proceedings of the 14th International Conference on Computational Semantics (IWCS), pages 144–154, Groningen, The Netherlands (online). Association for Computational Linguistics.
- Cite (Informal):
- New Domain, Major Effort? How Much Data is Necessary to Adapt a Temporal Tagger to the Voice Assistant Domain (Alam et al., IWCS 2021)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/2021.iwcs-1.14.pdf
- Code:
- audiolabs/da-time
- Data:
- SNIPS