So now that we've looked at decision making for episodic environments, i.e. environments where you essentially only optimize your actions based on their immediate outcomes, we would like to progress to sequential environments, environments where you actually have to reason about sequences of actions. And for that we have to do something new: we have to directly and explicitly model time and uncertainty. This is exactly what we're going to look at in this chapter, and we'll start out with the introduction, which is essentially modeling time and uncertainty for sequential environments. Later we'll go into hidden Markov models, implement them, if you will, in dynamic Bayesian networks, and look at algorithms for that. But for now we only want to look at time and uncertainty and how to model them in a Bayesian-network-like framework for uncertainty. So the thing is, in sequential environments the world changes, and we need to track, predict, and work with these changes. And the difference is essentially the following.
Take vehicle diagnosis: when your car is broken, you put it on the ramp and then you can look at it. Unless something is burning, of course, you can take all the time you want; nothing is going to change while you're diagnosing the vehicle. If, on the other hand, you have fast-acting conditions in human health, diabetes for instance, where a patient can go into diabetic shock within ten minutes or so, then you cannot take all the time you want. You have to decide under explicit time constraints, otherwise your patient is dead by the time you've decided.
So we're going to do things like diabetes management now. We're going to define a temporal probability model to be a probability model where the possible worlds are indexed by some kind of time structure. In the beginning we want to be very general here, so this is just some kind of preordered set. In practice, of course, we're going to restrict ourselves to linear, discrete time structures: essentially our time structure is always going to be the natural numbers with the less-or-equal ordering. In theory the step size is irrelevant, but of course how fine-grained you want your time steps to be really depends on the domain and the problem in practice.
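As a compact way of writing this down (a sketch in my own notation; the lecture gives the definition only in words):

```latex
% Sketch: possible worlds are indexed by a time structure, in general
% just a preordered set (T, \preceq); in practice we restrict to
% linear discrete time, i.e. the natural numbers with \leq.
\[
  (T, \preceq) \;=\; (\mathbb{N}, \leq)
\]
```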
So the basic setup we want to look at is the following. We divide the random variables, which are indexed by the natural numbers, into two sets: a set of state variables X_t, indexed for t = 0 onwards. For instance, in the diabetes case you could have blood sugar, stomach contents, and all those kinds of state variables. They describe the state of the world, and they are typically unobservable, right? Unless you actually measure the blood sugar in the diabetes case, there's no way of seeing it. And then you have the evidence variables, which might be the measured blood sugar, the pulse rate, the food that you can see has been eaten, and so on. These are the two kinds of things we're going to reason about, and by convention we're always going to call the state variables X and the evidence variables E. One more thing: these X's are now, and that's the new thing, indexed by little t for the time steps, and we're going to use the notation X_{a:b} for the set of state variables between time step a and time step b. We're going to use a running example where you're a security guard in
a secret underground facility. You want to know whether it's raining outside and your
only source of information is whether the director, who actually lives at home outside the facility, comes in with an umbrella in the morning. So in this case, the state variable is whether it rains, and the observation is whether there's an umbrella or not. In this example, we're not going to allow you to actually speak to the director and ask him whether it's raining outside or not. Okay.
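To make the umbrella world concrete, here is a minimal Python sketch of the two variable families. The probability values (0.7 for rain persisting, 0.9 and 0.2 for the umbrella sensor) are the usual illustrative numbers for this example, assumed here rather than taken from the lecture:

```python
import random

# Minimal sketch of the umbrella world (assumed illustrative probabilities).
# State variables X_t:    Rain_t     -- is it raining outside? (hidden)
# Evidence variables E_t: Umbrella_t -- does the director carry an umbrella?

P_RAIN_GIVEN_RAIN = 0.7   # P(Rain_t = true | Rain_{t-1} = true)
P_RAIN_GIVEN_DRY  = 0.3   # P(Rain_t = true | Rain_{t-1} = false)
P_UMB_GIVEN_RAIN  = 0.9   # P(Umbrella_t = true | Rain_t = true)
P_UMB_GIVEN_DRY   = 0.2   # P(Umbrella_t = true | Rain_t = false)

def simulate(steps, seed=0):
    """Sample a trajectory of hidden states and their observations."""
    rng = random.Random(seed)
    rain = [rng.random() < 0.5]   # uniform prior on Rain_0
    umbrella = []
    for t in range(steps):
        if t > 0:  # transition model: Rain_t depends only on Rain_{t-1}
            p = P_RAIN_GIVEN_RAIN if rain[t - 1] else P_RAIN_GIVEN_DRY
            rain.append(rng.random() < p)
        # sensor model: Umbrella_t depends only on Rain_t
        p_umb = P_UMB_GIVEN_RAIN if rain[t] else P_UMB_GIVEN_DRY
        umbrella.append(rng.random() < p_umb)
    return rain, umbrella

rain, umbrella = simulate(5)
# Python slices are half-open, so rain[0:5] plays the role of X_{0:4}.
print("Rain_0:4     =", rain[0:5])
print("Umbrella_0:4 =", umbrella[0:5])
```

In the lecture's terms, only the umbrella list would be visible to the security guard; recovering the rain list from it is exactly the kind of inference problem the later sections on hidden Markov models deal with.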
So that brings us to a very important concept here, which is the concept of Markov processes. And the idea is that we want to
somehow, in the end, build some kind of Bayesian network from these variables, and we have to look at what that would look like. So we're going to say that a process has the Markov property, which is something we want because it makes modeling easier, if X_t only depends on a bounded subset of X_{0:t-1}. Remember, the X's are the state variables.
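Written as formulas (a standard formulation; the lecture states the property in words), a first-order Markov process satisfies

```latex
% First-order Markov property: the current state depends only on the
% immediately preceding state, not on the whole history.
\[
  \mathbf{P}(X_t \mid X_{0:t-1}) = \mathbf{P}(X_t \mid X_{t-1})
\]
% n-th order: the bounded subset is the window of the last n states.
\[
  \mathbf{P}(X_t \mid X_{0:t-1}) = \mathbf{P}(X_t \mid X_{t-n:t-1})
\]
```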