Improving Reinforcement Learning Agent Training using Text based Guidance: A study using Commands in Dravidian Languages

Nikhil Chowdary Paleti; Sai Aravind Vadlapudi; Sai Aashish Menta; Sai Akshay Menta; Vishnu Vardhan Gorantla V N S L; Janakiram Chandu; Soman K. P.; Sachin Kumar S

Abstract

Reinforcement learning (RL) agents have achieved remarkable success in domains such as game playing and protein structure prediction. However, most RL agents rely on exploration alone to find optimal solutions, without explicit guidance. This paper proposes a methodology for training RL agents with text-based instructions in the Dravidian languages Telugu, Tamil, and Malayalam, as well as in English. The agents are trained in a modified Lunar Lander environment, where they must follow specific paths to land the lander successfully. The methodology involves collecting a dataset of human demonstrations and textual instructions, encoding the instructions into numerical representations using text-based embeddings, and training RL agents with state-of-the-art algorithms. The results demonstrate that the trained Soft Actor-Critic (SAC) agent can effectively understand and generalize instructions in different languages, outperforming other RL algorithms such as Proximal Policy Optimization (PPO) and Deep Deterministic Policy Gradient (DDPG).
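The encoding step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the instruction embedding is simply concatenated with the environment state before being passed to the agent; the abstract does not confirm this design, and the paper's actual embedding model is not specified here. The function names `toy_embed` and `augment_observation`, the 16-dimensional embedding size, the hash-based stand-in for a real text embedding, and the sample commands are all hypothetical and chosen only for illustration. The 8-value state matches the standard Lunar Lander observation, not necessarily the modified environment.

```python
import hashlib
from typing import List

EMBED_DIM = 16  # hypothetical embedding size, chosen only for illustration
STATE_DIM = 8   # size of the standard Lunar Lander observation vector


def toy_embed(instruction: str, dim: int = EMBED_DIM) -> List[float]:
    """Map text to a fixed-length vector in [-1, 1].

    This is a deterministic stand-in for a real text embedding model;
    it carries no semantic meaning.
    """
    digest = hashlib.sha256(instruction.encode("utf-8")).digest()
    # sha256 yields 32 bytes, so dim must be <= 32.
    # Each byte (0..255) is rescaled linearly to [-1, 1].
    return [b / 127.5 - 1.0 for b in digest[:dim]]


def augment_observation(state: List[float], instruction: str) -> List[float]:
    """Concatenate the environment state with the instruction embedding."""
    if len(state) != STATE_DIM:
        raise ValueError(f"expected {STATE_DIM} state values, got {len(state)}")
    return list(state) + toy_embed(instruction)


# Hypothetical state and commands (one English, one Telugu).
state = [0.0, 1.4, 0.1, -0.3, 0.02, 0.0, 0.0, 0.0]
obs_en = augment_observation(state, "go left, then land")
obs_te = augment_observation(state, "ఎడమవైపు వెళ్లి దిగండి")
print(len(obs_en), len(obs_te))
```

Under this assumption, an agent such as SAC, PPO, or DDPG would receive the augmented vector as its observation, so the same policy network can condition its behaviour on instructions written in any of the four languages.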

Anthology ID: 2023.dravidianlangtech-1.5
Volume: Proceedings of the Third Workshop on Speech and Language Technologies for Dravidian Languages
Month: September
Year: 2023
Address: Varna, Bulgaria
Editors: Bharathi R. Chakravarthi, Ruba Priyadharshini, Anand Kumar M, Sajeetha Thavareesan, Elizabeth Sherly
Venues: DravidianLangTech | WS
Publisher: INCOMA Ltd., Shoumen, Bulgaria
Pages: 33–42
URL: https://aclanthology.org/2023.dravidianlangtech-1.5
Cite (ACL): Nikhil Chowdary Paleti, Sai Aravind Vadlapudi, Sai Aashish Menta, Sai Akshay Menta, Vishnu Vardhan Gorantla V N S L, Janakiram Chandu, Soman K P, and Sachin Kumar S. 2023. Improving Reinforcement Learning Agent Training using Text based Guidance: A study using Commands in Dravidian Languages. In Proceedings of the Third Workshop on Speech and Language Technologies for Dravidian Languages, pages 33–42, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal): Improving Reinforcement Learning Agent Training using Text based Guidance: A study using Commands in Dravidian Languages (Paleti et al., DravidianLangTech-WS 2023)
PDF: https://preview.aclanthology.org/proper-vol2-ingestion/2023.dravidianlangtech-1.5.pdf
