Partially Observable Markov Decision Process - Belief Update

Belief Update

An agent needs to update its belief upon taking the action a and observing o. Since the state is Markovian, maintaining a belief over the states solely requires knowledge of the previous belief state, the action taken, and the current observation. The operation is denoted b' = \tau(b, a, o). Below we describe how this belief update is computed.

Upon reaching s', the agent observes o with probability \Omega(o\mid s',a). Let b be a probability distribution over the state space S: b(s) denotes the probability that the environment is in state s. Given b(s), then after taking action a and observing o,


b'(s') = \eta \Omega(o\mid s',a) \sum_{s\in S} T(s'\mid s,a)b(s)

where \eta = 1/\Pr(o\mid b,a) is a normalizing constant with \Pr(o\mid b,a) = \sum_{s'\in S}\Omega(o\mid s',a)\sum_{s\in S}T(s'\mid s,a)b(s).
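The update above is a short linear-algebra computation: predict the next-state distribution under the transition model, reweight by the observation likelihood, then normalize. A minimal NumPy sketch follows; the array layout for T and Omega is an assumption chosen for illustration, not a convention fixed by the text.

```python
import numpy as np

def belief_update(b, a, o, T, Omega):
    """Compute the updated belief b' = tau(b, a, o).

    Assumed (hypothetical) array encoding:
      b:     shape (|S|,),        current belief over states
      T:     shape (|A|, |S|, |S|), T[a, s, s2] = T(s2 | s, a)
      Omega: shape (|A|, |S|, |O|), Omega[a, s2, o] = Omega(o | s2, a)
    """
    # Prediction step: sum_s T(s' | s, a) b(s) for every s'
    predicted = b @ T[a]                     # shape (|S|,)
    # Correction step: weight each s' by the observation likelihood Omega(o | s', a)
    unnormalized = Omega[a, :, o] * predicted
    # Normalize: eta = 1 / Pr(o | b, a), where Pr(o | b, a) is the sum below
    return unnormalized / unnormalized.sum()
```

Because the normalizer \eta is just the total unnormalized mass, the code never computes \Pr(o\mid b,a) separately; it divides by the sum, which is the same quantity.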

