state & evidence variables
state: the "cause"
evidence: the "effect" (the observation)
In terms of the stock market example, explain what probabilities are computed by prediction, filtering and smoothing. You do not need to give the formulas.
Filtering: the probability of the current state (e.g. the investor being optimistic or pessimistic today), given all stock observations so far. Prediction: the probability of the state on a future day, given the observations so far. Smoothing: the probability of the state on a past day, given all observations up to today.
When is a hidden Markov model stationary?
If the (transition and sensor) probabilities do not depend on the day d.
When is a hidden Markov model of order 1?
If the probabilities for day d+1 depend only on the values at day d.
recursive filtering equation
f_{1:d+1} = α · O_{d+1} · T^T · f_{1:d}
(T: transition matrix; O_{d+1}: diagonal sensor matrix for the evidence on day d+1; α: normalization factor)
recursive filtering equation: application
Give O1 and O2: on the diagonal are the sensor values for the evidence observed on day d, one entry per state (all off-diagonal entries are 0).
f_{1:0} = <0, 0, 1>, since the 1 marks the state the model was in on day 0.
“your client was optimistic two weeks ago with a probability of 0.6”
→ <0.6, 0.4>
α is a constant factor that normalizes the distribution so that it sums to 1.
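A minimal Python sketch of one filtering step under these definitions; the two-state (optimistic/pessimistic) model, the transition matrix T and the sensor likelihoods are made-up numbers for illustration:

```python
import numpy as np

# Hypothetical two-state model: states = (optimistic, pessimistic).
# T[i, j] = P(X_{d+1} = j | X_d = i); all numbers are made up.
T = np.array([[0.8, 0.2],
              [0.3, 0.7]])

def filter_step(f, sensor_likelihoods):
    """One step of f_{1:d+1} = alpha * O_{d+1} * T^T * f_{1:d}."""
    O = np.diag(sensor_likelihoods)           # diagonal sensor matrix O_{d+1}
    unnormalized = O @ T.T @ f
    return unnormalized / unnormalized.sum()  # alpha: normalize to sum 1

# Prior from the example: optimistic two weeks ago with probability 0.6.
f = np.array([0.6, 0.4])

# Two made-up observations, given as P(e_d | state) per state.
for likelihoods in ([0.9, 0.2], [0.4, 0.6]):
    f = filter_step(f, np.array(likelihoods))
    print(f)
```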
Bellman equation
U(s) = R(s) + γ · max_a Σ_{s'} P(s' | s, a) · U(s')
Bellman equation: application
Markov decision process
An MDP consists of a set of states S, a set of actions A, a transition model P(s' | s, a) and a reward function R(s).
The algorithm keeps a table U(s) for s ∈ S, that is initialized with arbitrary values, e.g. all 0 or the rewards.
In each iteration, it uses the Bellman equation in order to update U(s).
U(s) will converge to the expected utility of s.
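This is the value iteration algorithm; a short Python sketch on a tiny hypothetical MDP (made-up states, rewards and transition model):

```python
import numpy as np

gamma = 0.9                          # discount factor
R = np.array([0.0, -0.1, 1.0])       # made-up rewards R(s) for 3 states

# P[a, s, s'] = P(s' | s, a): made-up transition model (rows sum to 1).
P = np.array([
    [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.1, 0.9]],   # action 0
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]],   # action 1
])

U = np.zeros(3)   # initialize U(s) with arbitrary values, here all 0

for _ in range(100):
    # Bellman update: U(s) = R(s) + gamma * max_a sum_s' P(s'|s,a) * U(s')
    U = R + gamma * (P @ U).max(axis=0)

print(U)          # converges to the expected utility of each state
```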
Let w be a 3-dimensional vector whose coefficients sum to 1. What is the intuitive meaning of the property T · w = w?
w is a probability distribution over the weather that is a fixed point of the transition model, i.e. the distribution stays the same when predicting the future.
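A numerical sketch of the fixed-point property, with a made-up column-stochastic weather matrix T: repeatedly predicting the next day drives any starting distribution to the w with T · w = w:

```python
import numpy as np

# Made-up 3-state weather transition matrix, column-stochastic so that
# predicting one day ahead is w_next = T @ w.
T = np.array([[0.7, 0.2, 0.1],
              [0.2, 0.6, 0.3],
              [0.1, 0.2, 0.6]])

w = np.array([1.0, 0.0, 0.0])   # start from an arbitrary distribution
for _ in range(200):
    w = T @ w                   # predict the next day

print(w)        # the fixed point of the transition model
print(T @ w)    # same vector: the prediction no longer changes
```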
Markov Decision Process:
Now assume the agent is unable to tell whether an action resulted in a move or not. Explain informally how that would change the modeling.
We would need a POMDP (partially observable Markov decision process). A state in the POMDP is a so-called belief state: a probability distribution over the MDP states the agent might be in.
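A minimal sketch of the belief-state update, with made-up numbers: after acting, the agent cannot observe the new state directly, so it folds the action's transition model and the evidence likelihood into a new distribution b:

```python
import numpy as np

def belief_update(b, P_a, likelihood):
    """b'(s') = alpha * P(e | s') * sum_s P(s' | s, a) * b(s).

    b          -- current belief, a distribution over MDP states
    P_a        -- transition model of the chosen action, P_a[s, s'] = P(s'|s,a)
    likelihood -- P(e | s') for the evidence e actually observed
    """
    predicted = P_a.T @ b                     # where the action may have led
    unnormalized = likelihood * predicted     # weight by the observation
    return unnormalized / unnormalized.sum()  # alpha: renormalize

# Made-up example: 2 states, an action that "moves" with probability 0.8.
P_move = np.array([[0.2, 0.8],
                   [0.8, 0.2]])
b = np.array([1.0, 0.0])   # agent knows it starts in state 0
b = belief_update(b, P_move, np.array([0.5, 0.5]))  # uninformative evidence
print(b)   # [0.2 0.8]: the agent only believes it probably moved
```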