On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems
Pei-Hao Su, Milica Gašić, Nikola Mrkšić, Lina M. Rojas-Barahona, Stefan Ultes, David Vandyke, Tsung-Hsien Wen, Steve Young
- Anthology ID:
- P16-1230
- Volume:
- Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- August
- Year:
- 2016
- Address:
- Berlin, Germany
- Editors:
- Katrin Erk, Noah A. Smith
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2431–2441
- URL:
- https://aclanthology.org/P16-1230
- DOI:
- 10.18653/v1/P16-1230
- Cite (ACL):
- Pei-Hao Su, Milica Gašić, Nikola Mrkšić, Lina M. Rojas-Barahona, Stefan Ultes, David Vandyke, Tsung-Hsien Wen, and Steve Young. 2016. On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2431–2441, Berlin, Germany. Association for Computational Linguistics.
- Cite (Informal):
- On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems (Su et al., ACL 2016)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/P16-1230.pdf