Partially Observable Markov Decision Process - Belief Update

Belief Update

An agent needs to update its belief upon taking the action a and observing o. Since the state is Markovian, maintaining a belief over the states solely requires knowledge of the previous belief state, the action taken, and the current observation. The operation is denoted b' = \tau(b, a, o). Below we describe how this belief update is computed.

Upon reaching s', the agent observes o with probability \Omega(o\mid s',a). Let b be a probability distribution over the state space S: b(s) denotes the probability that the environment is in state s. Given b(s), then after taking action a and observing o,


b'(s') = \eta \Omega(o\mid s',a) \sum_{s\in S} T(s'\mid s,a)b(s)

where \eta = 1/\Pr(o\mid b,a) is a normalizing constant with \Pr(o\mid b,a) = \sum_{s'\in S}\Omega(o\mid s',a)\sum_{s\in S}T(s'\mid s,a)b(s).
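The update above is a short linear-algebra computation: predict the next-state distribution under the transition model, reweight by the observation likelihood, then normalize. A minimal NumPy sketch follows; the array layout for T and Omega is an assumption chosen for illustration, not a convention fixed by the text.

```python
import numpy as np

def belief_update(b, a, o, T, Omega):
    """Compute the updated belief b' = tau(b, a, o).

    Assumed (hypothetical) array encoding:
      b:     shape (|S|,),        current belief over states
      T:     shape (|A|, |S|, |S|), T[a, s, s2] = T(s2 | s, a)
      Omega: shape (|A|, |S|, |O|), Omega[a, s2, o] = Omega(o | s2, a)
    """
    # Prediction step: sum_s T(s' | s, a) b(s) for every s'
    predicted = b @ T[a]                     # shape (|S|,)
    # Correction step: weight each s' by the observation likelihood Omega(o | s', a)
    unnormalized = Omega[a, :, o] * predicted
    # Normalize: eta = 1 / Pr(o | b, a), where Pr(o | b, a) is the sum below
    return unnormalized / unnormalized.sum()
```

Because the normalizer \eta is just the total unnormalized mass, the code never computes \Pr(o\mid b,a) separately; it divides by the sum, which is the same quantity.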

