Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse

We develop a new class of model-free deep reinforcement learning algorithms for data-driven, learning-based control. Our Generalized Policy Improvement algorithms combine the policy improvement guarantees of on-policy methods with the efficiency of sample reus...
