Markov Decision Process

Markov decision processes (MDPs), named after Andrey Markov, provide a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker. MDPs are useful for studying a wide range of optimization problems solved via dynamic programming and reinforcement learning. MDPs were known at least as early as the 1950s (cf. Bellman 1957). A core body of research on Markov decision processes resulted from Ronald A. Howard's 1960 book, Dynamic Programming and Markov Processes. MDPs are used in a wide range of disciplines, including robotics, automated control, economics, and manufacturing.

More precisely, a Markov decision process is a discrete-time stochastic control process. At each time step, the process is in some state $s$, and the decision maker may choose any action $a$ that is available in state $s$. The process responds at the next time step by randomly moving into a new state $s'$ and giving the decision maker a corresponding reward $R_a(s, s')$.
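The step-by-step dynamics described above can be sketched in a few lines of Python. This is a minimal illustration only: the two-state MDP, its action names, its probabilities, and its rewards are all hypothetical, and the helpers `available_actions` and `step` are names introduced here rather than part of any library.

```python
import random

# Hypothetical two-state MDP used only for illustration.
# P[s][a] maps each possible next state s' to its probability.
P = {
    "low": {
        "wait": {"low": 1.0},
        "charge": {"high": 0.9, "low": 0.1},
    },
    "high": {
        "wait": {"high": 0.7, "low": 0.3},
        "work": {"low": 0.6, "high": 0.4},
    },
}

# R[(s, a, s')] is the reward for that transition; missing entries count as 0.
R = {
    ("high", "work", "low"): 5.0,
    ("high", "work", "high"): 5.0,
    ("low", "charge", "high"): -1.0,
    ("low", "charge", "low"): -1.0,
}


def available_actions(state):
    """Return the actions the decision maker may choose in this state."""
    return list(P[state])


def step(state, action, rng):
    """Sample a next state for (state, action) and return (next_state, reward)."""
    next_states = list(P[state][action])
    weights = [P[state][action][s2] for s2 in next_states]
    next_state = rng.choices(next_states, weights=weights, k=1)[0]
    return next_state, R.get((state, action, next_state), 0.0)


rng = random.Random(0)  # seeded so that runs are repeatable
state = "high"
for t in range(5):
    action = rng.choice(available_actions(state))  # a uniformly random policy
    next_state, reward = step(state, action, rng)
    print(t, state, action, next_state, reward)
    state = next_state
```

The random policy is only a placeholder for the decision maker; any rule that picks one of the available actions in the current state could be substituted.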

The probability that the process moves into its new state $s'$ is influenced by the chosen action. Specifically, it is given by the state transition function $P_a(s, s')$. Thus, the next state $s'$ depends on the current state $s$ and the decision maker's action $a$. But given $s$ and $a$, it is conditionally independent of all previous states and actions; in other words, the state transitions of an MDP possess the Markov property.
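The conditional independence stated above can be written out explicitly. As a sketch, write $s_t$ and $a_t$ for the state and action at time step $t$, and $P_a(s, s')$ for the probability of moving from state $s$ to state $s'$ under action $a$; the time-indexed notation is an assumption introduced here for illustration.

```latex
\[
\Pr\bigl(s_{t+1} = s' \mid s_t = s,\ a_t = a,\ s_{t-1},\ a_{t-1},\ \dots,\ s_0,\ a_0\bigr)
  = \Pr\bigl(s_{t+1} = s' \mid s_t = s,\ a_t = a\bigr)
  = P_a(s, s')
\]
```

The first equality is the Markov property: once the current state and action are known, the earlier history adds no information about the next state. The second equality simply names that probability as the transition function.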

Markov decision processes are an extension of Markov chains; the difference is the addition of actions (allowing choice) and rewards (giving motivation). Conversely, if only one action exists for each state and all rewards are zero, a Markov decision process reduces to a Markov chain.
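The reduction to a Markov chain can be illustrated with a small Python sketch. The weather-style MDP below, its single action name `go`, and the helper `to_markov_chain` are hypothetical and introduced only for this example; all rewards are assumed to be zero and are therefore not stored at all.

```python
# Hypothetical MDP in which every state has exactly one action ("go")
# and every reward is zero, so no reward table is needed.
P = {
    "sunny": {"go": {"sunny": 0.8, "rainy": 0.2}},
    "rainy": {"go": {"sunny": 0.4, "rainy": 0.6}},
}


def to_markov_chain(mdp):
    """Drop the single action of each state and return a plain transition table."""
    chain = {}
    for state, actions in mdp.items():
        if len(actions) != 1:
            raise ValueError(f"state {state!r} does not have exactly one action")
        (transitions,) = actions.values()  # unpack the only entry
        chain[state] = dict(transitions)
    return chain


chain = to_markov_chain(P)
print(chain)
```

With no choice left to make, the table `chain[s][s2]` depends only on the current state, which is exactly the transition structure of a Markov chain. The helper raises an error as soon as a state offers more than one action, since the reduction no longer applies in that case.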
