One-Shot Averaging for Distributed TD(λ) Under Markov Sampling
We consider a distributed setup for reinforcement learning, where each agent has a copy of the same Markov Decision Process but transitions are sampled from the corresponding Markov chain independently by each agent. We show that in this setting, we can achiev... ...
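
To make the setup concrete, the following is a minimal sketch, not the authors' implementation, of what one-shot averaging for TD(λ) could look like: each agent runs TD(λ) on its own independently sampled trajectory of the same Markov chain, and the final parameter vectors are averaged once at the end. The linear features, the constant step size, the two-state chain, and the names `td_lambda` and `one_shot_average` are all assumptions introduced for illustration.

```python
import random


def td_lambda(P, r, phi, gamma, lam, alpha, steps, rng):
    """Run TD(lambda) with linear features along one Markov trajectory.

    P: transition matrix of the chain induced by a fixed policy (assumed known
       here only so that we can sample from it), r: per-state rewards,
    phi: per-state feature vectors. All of these are illustrative inputs.
    """
    d = len(phi[0])
    theta = [0.0] * d          # value-function parameters
    z = [0.0] * d              # eligibility trace
    s = rng.randrange(len(P))  # arbitrary start state; samples are Markovian
    for _ in range(steps):
        s_next = rng.choices(range(len(P)), weights=P[s])[0]
        v = sum(t * f for t, f in zip(theta, phi[s]))
        v_next = sum(t * f for t, f in zip(theta, phi[s_next]))
        delta = r[s] + gamma * v_next - v                      # TD error
        z = [gamma * lam * zi + f for zi, f in zip(z, phi[s])]
        theta = [t + alpha * delta * zi for t, zi in zip(theta, z)]
        s = s_next
    return theta


def one_shot_average(thetas):
    """Average the agents' final parameters once, with no earlier communication."""
    n = len(thetas)
    return [sum(col) / n for col in zip(*thetas)]


# Hypothetical two-state chain with tabular (one-hot) features.
P = [[0.9, 0.1], [0.2, 0.8]]
r = [1.0, 0.0]
phi = [[1.0, 0.0], [0.0, 1.0]]

# Each agent uses its own random generator, so trajectories are independent.
agents = [
    td_lambda(P, r, phi, gamma=0.9, lam=0.5, alpha=0.05, steps=2000,
              rng=random.Random(seed))
    for seed in range(8)
]
theta_bar = one_shot_average(agents)
print(theta_bar)
```

The agents never exchange information during learning in this sketch; the only communication is the single averaging step at the end, which is what "one-shot" refers to in the title.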