Theory
The theory for small, finite MDPs is quite mature. Both the asymptotic and the finite-sample behavior of most algorithms are well understood. As mentioned earlier, algorithms with provably good online performance (addressing the exploration issue) are known. The theory of large MDPs needs more work. Efficient exploration remains a largely open problem, except in the case of bandit problems. Although finite-time performance bounds have appeared for many algorithms in recent years, these bounds are expected to be rather loose, so more work is needed to better understand the relative advantages, as well as the limitations, of these algorithms. For incremental algorithms, asymptotic convergence issues have been settled. Recently, new incremental, temporal-difference-based algorithms have appeared that converge under a much wider set of conditions than was previously possible (for example, when used with arbitrary, smooth function approximation).