Natural Language Reinforcement Learning

Feng, Xidong; Wan, Ziyu; Fu, Haotian; Liu, Bo; Yang, Mengyue; Koushik, Girish A.; Hu, Zhiyuan; Wen, Ying; Wang, Jun

arXiv:2411.14251 (cs)

[Submitted on 21 Nov 2024]

Abstract: Reinforcement Learning (RL) mathematically formulates decision-making with the Markov Decision Process (MDP). With MDPs, researchers have achieved remarkable breakthroughs across various domains, including games, robotics, and language models. This paper seeks a new possibility, Natural Language Reinforcement Learning (NLRL), by extending the traditional MDP to a natural language-based representation space. Specifically, NLRL innovatively redefines RL principles, including task objectives, policy, value function, Bellman equation, and policy iteration, into their language counterparts. With recent advancements in large language models (LLMs), NLRL can be practically implemented to achieve RL-like policy and value improvement by either pure prompting or gradient-based training. Experiments over Maze, Breakthrough, and Tic-Tac-Toe games demonstrate the effectiveness, efficiency, and interpretability of the NLRL framework across diverse use cases. Our code will be released at this https URL.

Comments:	Extension of arXiv:2402.07157
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2411.14251 [cs.LG]
	(or arXiv:2411.14251v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2411.14251

Submission history

From: Xidong Feng
[v1] Thu, 21 Nov 2024 15:57:02 UTC (1,796 KB)
