Learning parametric policies and transition probability models of Markov decision processes from data
Tingting Xu, Henghui Zhu, Ioannis Ch Paschalidis
Tingting Xu
We consider the problem of estimating the policy and transition probability model of a Markov Decision Process from data (state, action, next state tuples). The transition probability and policy are assumed to be parametric functions of a s...
Rajat Bhatnagar, Hana El-Samad
Rajat Bhatnagar
Biological systems must sense and adapt to changes in their environment. Molecular networks capable of such adaptation belong to two well-known classes, feed-forward and feedback structures, but the fundamental limitations and tradeoffs of ...