PAC Reinforcement Learning Algorithm for General-Sum Markov Games

This paper presents a theoretical framework for probably approximately correct (PAC) multi-agent reinforcement learning (MARL) algorithms for Markov games. Using the idea of delayed Q-learning, the paper extends the well-known Nash Q-learning algorithm to buil... ...
