
Discrete Event Dynamic Systems. 2003;13(1-2):111-148. doi: 10.1023/a:1022145020786

Approximate Gradient Methods in Policy-Space Optimization of Markov Reward Processes


Peter Marbach; John N. Tsitsiklis

DOI: 10.1023/a:1022145020786


Copyright © Discrete Event Dynamic Systems.

Journal: Discrete Event Dynamic Systems: Theory and Applications

Abbreviation: DISCRETE EVENT DYN S

ISSN: 0924-6703

e-ISSN: 1573-7594

Impact factor / quartile: 1.0 / Q3
