首页 正文

Uncertainty and Exploration

{{output}}
In order to discover the most rewarding actions, agents must collect information about their environment, potentially foregoing reward. The optimal solution to this "explore-exploit" dilemma is often computationally challenging, but principled algorithmic appr... ...