One-Shot Averaging for Distributed TD(λ) Under Markov Sampling

We consider a distributed setup for reinforcement learning in which every agent holds a copy of the same Markov Decision Process but samples transitions from the corresponding Markov chain independently. We show that in this setting, we can achiev... ...
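The setting described above can be illustrated with a minimal sketch. It assumes linear function approximation, a constant step size, and a single parameter average taken after all agents finish, as the title's "one-shot averaging" suggests. The chain `P`, rewards `r`, features `Phi`, the function names, and all hyperparameter values are hypothetical choices made for illustration only and are not taken from the paper.

```python
import numpy as np

def td_lambda(P, r, Phi, gamma, lam, alpha, steps, rng):
    """Run TD(lambda) with linear features along one sampled trajectory."""
    n_states, dim = Phi.shape
    theta = np.zeros(dim)
    z = np.zeros(dim)                  # eligibility trace
    s = rng.integers(n_states)         # arbitrary start state
    for _ in range(steps):
        # Markov sampling: the next state is drawn from the row of the current state.
        s_next = rng.choice(n_states, p=P[s])
        delta = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta
        z = gamma * lam * z + Phi[s]
        theta = theta + alpha * delta * z
        s = s_next
    return theta

def one_shot_average(P, r, Phi, n_agents, seed=0, **kw):
    """Agents run independently with no communication; average once at the end."""
    seeds = np.random.SeedSequence(seed).spawn(n_agents)
    thetas = np.array(
        [td_lambda(P, r, Phi, rng=np.random.default_rng(s), **kw) for s in seeds]
    )
    return thetas.mean(axis=0), thetas

# Hypothetical 3-state chain with tabular (identity) features.
P = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.1, 0.9]])
r = np.array([1.0, 0.0, -1.0])
Phi = np.eye(3)
gamma = 0.9

avg, thetas = one_shot_average(
    P, r, Phi, n_agents=4, seed=0, gamma=gamma, lam=0.5, alpha=0.05, steps=2000
)

# With tabular features the TD fixed point is the true value function.
V_true = np.linalg.solve(np.eye(3) - gamma * P, r)
errs = np.linalg.norm(thetas - V_true, axis=1)
err_avg = np.linalg.norm(avg - V_true)
print("per-agent errors:", errs)
print("averaged error:  ", err_avg)
```

Each agent draws its own trajectory from an independent random stream, so the only coordination is the final mean. By the triangle inequality, the error of the averaged parameter can never exceed the largest per-agent error; how much it improves beyond that is the question the abstract addresses.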
