Belief MDP
The policy maps the belief state space into the action space. The optimal policy can be understood as the solution of a continuous-space Markov decision process, the so-called belief MDP. It is defined as a tuple $(B, A, \tau, r)$ where
- $B$ is the set of belief states over the POMDP states,
- $A$ is the same finite set of actions as for the original POMDP,
- $\tau$ is the belief state transition function,
- $r: B \times A \to \mathbb{R}$ is the reward function on belief states. It is given by
$r(b, a) = \sum_{s \in S} b(s) R(s, a)$.
Note that this MDP is defined over a continuous state space.
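The belief-state reward above, and the Bayesian belief update that the transition function $\tau$ relies on, can be sketched numerically. The following is a minimal illustration, not a reference implementation: the tiny two-state POMDP (the tables `T`, `O`, `R` and all names) is a made-up example for demonstration only.

```python
# Hypothetical two-state POMDP used only to illustrate the formulas.
S = [0, 1]      # POMDP states
A = [0, 1]      # actions
Obs = [0, 1]    # observations

# T[s][a][s']: transition probabilities Pr(s' | s, a)
T = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.3, 0.7], [0.6, 0.4]]]
# O[a][s'][o]: observation probabilities Pr(o | s', a)
O = [[[0.8, 0.2], [0.3, 0.7]],
     [[0.5, 0.5], [0.1, 0.9]]]
# R[s][a]: reward for taking action a in state s
R = [[1.0, 0.0],
     [0.0, 2.0]]

def belief_reward(b, a):
    """r(b, a) = sum over states s of b(s) * R(s, a)."""
    return sum(b[s] * R[s][a] for s in S)

def belief_update(b, a, o):
    """Bayes update: b'(s') is proportional to
    O(o | s', a) * sum_s T(s' | s, a) * b(s), normalized over s'."""
    unnorm = [O[a][sp][o] * sum(T[s][a][sp] * b[s] for s in S)
              for sp in S]
    z = sum(unnorm)  # Pr(o | b, a), the normalizing constant
    return [u / z for u in unnorm]

b = [0.5, 0.5]                    # uniform initial belief
print(belief_reward(b, 0))        # expected immediate reward of action 0
print(belief_update(b, 0, 1))     # posterior belief after action 0, observation 1
```

Note how the updated belief is again a probability distribution over the two POMDP states; the belief MDP's transition function $\tau$ is built from exactly this update, weighted by the probability of each observation.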