Reinforcement Learning - Theory

The theory for small, finite MDPs is quite mature. Both the asymptotic and the finite-sample behavior of most algorithms are well understood. As mentioned earlier, algorithms with provably good online performance (addressing the exploration issue) are known. The theory of large MDPs needs more work. Efficient exploration in this setting is largely unexplored, except in the case of bandit problems. Although finite-time performance bounds have appeared for many algorithms in recent years, these bounds are expected to be rather loose, so more work is needed to understand the relative advantages and the limitations of these algorithms. For incremental algorithms, asymptotic convergence issues have been settled. Recently, new incremental, temporal-difference-based algorithms have appeared that converge under a much wider set of conditions than was previously possible (for example, when used with arbitrary, smooth function approximation).
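
As a rough illustration of the small, finite case, the sketch below runs tabular TD(0), an incremental temporal-difference method, on a five-state random walk. The environment, the function name td0_random_walk, the episode count, and the step-size schedule are all assumptions chosen for this example and do not come from the text above. The step size for each state decays as n^(-0.75) in that state's visit count n, which satisfies the usual stochastic-approximation conditions (the steps sum to infinity while their squares have a finite sum), so the estimates should approach the true values (i + 1) / 6.

```python
import random


def td0_random_walk(num_episodes=20000, num_states=5, seed=0):
    """Tabular TD(0) value estimation on an undiscounted random walk.

    Hypothetical toy setup: states 0..num_states-1 in a line, each step
    moves left or right with probability 0.5. Leaving on the left ends
    the episode with reward 0; leaving on the right ends it with reward 1.
    """
    rng = random.Random(seed)
    values = [0.5] * num_states   # arbitrary initial estimates
    visits = [0] * num_states     # per-state visit counts for the step size

    for _ in range(num_episodes):
        state = num_states // 2   # every episode starts in the middle
        while True:
            next_state = state + (1 if rng.random() < 0.5 else -1)
            if next_state < 0:
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= num_states:
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False

            # Decaying step size: sum of steps diverges, sum of squares converges.
            visits[state] += 1
            step = 1.0 / (visits[state] ** 0.75)

            # TD(0) update toward the one-step bootstrapped target.
            values[state] += step * (reward + next_value - values[state])

            if done:
                break
            state = next_state
    return values


if __name__ == "__main__":
    estimates = td0_random_walk()
    true_values = [(i + 1) / 6 for i in range(5)]
    for i, (est, true) in enumerate(zip(estimates, true_values)):
        print(f"state {i}: estimate {est:.3f}, true value {true:.3f}")
```

This only shows asymptotic behavior on a tabular problem. It says nothing about exploration or about the finite-time bounds discussed above, and it does not use function approximation.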
