Partially Observable Markov Decision Process - Belief MDP

Belief MDP

The policy maps the belief state space into the action space. The optimal policy can be understood as the solution of a continuous-space Markov decision process, the so-called belief MDP. It is defined as a tuple (B, A, τ, r) where

  • B is the set of belief states over the POMDP states,
  • A is the same finite set of actions as for the original POMDP,
  • τ is the belief state transition function,
  • r : B × A → ℝ is the reward function on belief states. It writes:

r(b, a) = Σ_{s∈S} b(s) R(s, a).

Note that this MDP is defined over a continuous state space.
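The belief-state reward and the Bayes-filter belief update underlying τ can be sketched for a toy two-state, one-action POMDP. The model arrays T, O, and R below are illustrative assumptions, not taken from the text; only the formulas r(b, a) = Σ_s b(s) R(s, a) and b'(s') ∝ O(o | s', a) Σ_s T(s' | s, a) b(s) come from the standard belief MDP construction.

```python
import numpy as np

# Illustrative toy model (assumed, not from the text):
# T[a][s, s'] : state transition probabilities P(s' | s, a)
# O[a][s', o] : observation probabilities P(o | s', a)
# R[a][s]     : reward R(s, a)
T = {0: np.array([[0.9, 0.1],
                  [0.2, 0.8]])}
O = {0: np.array([[0.8, 0.2],
                  [0.3, 0.7]])}
R = {0: np.array([1.0, 0.0])}

def belief_reward(b, a):
    """Reward on belief states: r(b, a) = sum_s b(s) R(s, a)."""
    return float(b @ R[a])

def belief_update(b, a, o):
    """Bayes filter: b'(s') is proportional to
    O(o | s', a) * sum_s T(s' | s, a) b(s), normalized to sum to 1."""
    unnormalized = O[a][:, o] * (b @ T[a])
    return unnormalized / unnormalized.sum()

b = np.array([0.5, 0.5])          # uniform initial belief
print(belief_reward(b, 0))        # expected immediate reward under b
print(belief_update(b, 0, 0))     # posterior belief after action 0, observation 0
```

Note that the updated belief is again a point in the continuous simplex B, which is why the belief MDP has a continuous state space even when the underlying POMDP is finite.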

