Abstract
In this work, we study reinforcement learning (RL) for solving text-based games. We address the challenge of the combinatorial action space by proposing a confidence-based self-imitation model that generates action candidates for the RL agent. First, we leverage self-imitation learning to rank and exploit valuable past trajectories, adapting a pre-trained language model (LM) to a target game. Then, we devise a confidence-based strategy that measures the LM’s confidence with respect to a state and adaptively prunes the generated actions to yield a more compact set of action candidates. In multiple challenging games, our model demonstrates promising performance in comparison to the baselines.
- Anthology ID:
- 2023.eacl-main.50
- Volume:
- Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
- Month:
- May
- Year:
- 2023
- Address:
- Dubrovnik, Croatia
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 703–726
- URL:
- https://aclanthology.org/2023.eacl-main.50
- Cite (ACL):
- Zijing Shi, Yunqiu Xu, Meng Fang, and Ling Chen. 2023. Self-imitation Learning for Action Generation in Text-based Games. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 703–726, Dubrovnik, Croatia. Association for Computational Linguistics.
- Cite (Informal):
- Self-imitation Learning for Action Generation in Text-based Games (Shi et al., EACL 2023)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/2023.eacl-main.50.pdf