Up to here, we've looked at utilities.
Okay? So, if there are no questions, we basically put that into a little box, tie a nice bow around it, and move on. We are assuming we have a utility function, and that it is simple enough that we can actually work with it.
So how do we actually make decisions? The idea here is that we use our Bayesian
networks, which we have lying around anyway as our world model, our belief-state
model, and that we just add a few bells and whistles to them, and a little bit to our
algorithms, to make progress on this.
So here is our utility-based agent again. We have talked about the world models; the first line
is essentially "how does the world evolve?". Then we have to model something about the
conditions: what are the consequences of my actions, and so on. Remember, last semester in
planning we did exactly that kind of thing. We talked about utility. And now the
question is, of course: how do I decide in practice what to do? Best of all would be if we
could implement it all together easily. And it turns out that the following works quite well.
We have a Bayesian network; that is the boxed part. We add actions to that network,
essentially as random variables, as new nodes. And these actions have dependencies, or
influences: where the airport is influences the cost, together with litigation and construction.
It may very well be that we also have more dependencies or influences. But essentially these
are new nodes for which we do not assess prior probabilities, but over which we have
control. And the idea is that we just try out sufficiently many of those, maybe all the ones
we are not discarding for common-sense reasons. Then we can compute, given evidence,
situations where we simply set the airport to be in the center of Erlangen, in the
Schlosspark. That might be a possible site; a small airport, maybe. We can treat that
just like evidence: we know the value of the variable "airport site". And then we can
compute the probabilities and the distributions of the other nodes given that.
We have the techniques for that.
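To make the "action as evidence" step concrete, here is a minimal hand-rolled sketch in Python. All variable names, values, and probabilities are invented for illustration (the actual airport network is on the slides): the action node AirportSite is fixed like an observed variable, and the remaining chance nodes are summed out.

```python
# Toy version of the airport network; every name and number here is
# made up for illustration.

# Priors for the ordinary chance nodes.
P_litigation   = {"low": 0.7, "high": 0.3}
P_construction = {"cheap": 0.6, "expensive": 0.4}

# CPT: P(Cost | AirportSite, Litigation, Construction).
P_cost = {
    ("schlosspark", "low",  "cheap"):     {"low": 0.5, "mid": 0.4, "high": 0.1},
    ("schlosspark", "low",  "expensive"): {"low": 0.2, "mid": 0.5, "high": 0.3},
    ("schlosspark", "high", "cheap"):     {"low": 0.2, "mid": 0.4, "high": 0.4},
    ("schlosspark", "high", "expensive"): {"low": 0.1, "mid": 0.3, "high": 0.6},
    ("outskirts",   "low",  "cheap"):     {"low": 0.7, "mid": 0.2, "high": 0.1},
    ("outskirts",   "low",  "expensive"): {"low": 0.4, "mid": 0.4, "high": 0.2},
    ("outskirts",   "high", "cheap"):     {"low": 0.5, "mid": 0.3, "high": 0.2},
    ("outskirts",   "high", "expensive"): {"low": 0.3, "mid": 0.4, "high": 0.3},
}

def cost_distribution(site):
    """P(Cost | AirportSite = site): fix the action value like evidence,
    then sum out the remaining chance nodes."""
    dist = {"low": 0.0, "mid": 0.0, "high": 0.0}
    for lit, p_lit in P_litigation.items():
        for con, p_con in P_construction.items():
            for cost, p in P_cost[(site, lit, con)].items():
                dist[cost] += p_lit * p_con * p
    return dist

print(cost_distribution("schlosspark"))  # ~ {'low': 0.31, 'mid': 0.42, 'high': 0.27}
```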
Now the only thing we still have to do is compute the utility of those outcomes. And we can
again put the utility as a node into our network. That takes a little bit of bending, of
course, because so far we have only had random variables that were discrete, meaning with
finitely many values, and the utility is a real-valued function. That does not quite fit our
maths. But we know what to do. What would you do? We have a node with an uncountable range,
so the machinery does not directly apply. Well, we just make the range finite. How would you
do that? The situation is: we have the real numbers. Let's just make them finite: cut them
into intervals and call those one, two, three, four, five; five values. Wonderful. We took
out the big stick and hit the problem until it stopped fighting back.
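As a concrete (if crude) version of that interval trick, here is a small sketch that chops an assumed utility range, here [0, 100], into five buckets. The range and bucket count are arbitrary choices for illustration.

```python
def discretize_utility(u, lo=0.0, hi=100.0, buckets=5):
    """Map a real utility u in [lo, hi] onto interval labels 1..buckets."""
    if not lo <= u <= hi:
        raise ValueError("utility outside the modeled range")
    width = (hi - lo) / buckets
    # min() keeps u == hi inside the last bucket instead of overflowing.
    return min(buckets, int((u - lo) // width) + 1)

print([discretize_utility(u) for u in (0.0, 19.9, 50.0, 99.0, 100.0)])  # [1, 1, 3, 5, 5]
```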
Of course we would like to have better methods, and one can actually build Bayesian networks
with infinite or non-discrete ranges, but we have not, so in this class we have to do
something like this. And the slide is even slightly wrong; does anybody see the problem?
I think we had utilities that were positive, or non-negative actually. But we do not need
that here, and it does not change the problem. Okay, good. So we add these two kinds of
nodes. One of them you can think of as input nodes.
The other ones can be thought of as output nodes. And then we basically have the algorithm:
you cycle over the action nodes, give values to all of them, push everything through the
network, get the probability distribution for the utility, and then compute and maximize the
expected utility. You remember, the maximum expected utility was essentially a big maximum
over all the actions of a sum over all the states that can be reached by my action: the
probability of the result being that state, given the action and the evidence, times the
utility of that state, $EU(\alpha \mid e) = \sum_{s} P(\mathrm{Result}(\alpha) = s \mid \alpha, e)\, U(s)$.
So once we have driven everything through the Bayesian network, we have that probability
distribution, and we can actually compute the sum. And then we essentially want to maximize
this thing, which means we just put an argmax over $\alpha$ in front:
$\alpha^{*} = \operatorname{argmax}_{\alpha} EU(\alpha \mid e)$.
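Putting the pieces together, here is a sketch of that decision loop over the toy network from above. The posterior cost distributions are the (rounded) outputs of the earlier query, and the utility table is invented for illustration.

```python
# Cycle over the values of the action node, push each through the
# network, compute the expected utility of the resulting outcome
# distribution, and take the argmax.

# P(Cost = s | AirportSite = a), rounded from the toy query above.
P_cost_given_site = {
    "schlosspark": {"low": 0.31, "mid": 0.42, "high": 0.27},
    "outskirts":   {"low": 0.53, "mid": 0.30, "high": 0.17},
}

# Utility of each (already discretized) outcome state; numbers invented.
U = {"low": 10.0, "mid": 4.0, "high": 0.0}

def expected_utility(action):
    """EU(a | e) = sum over states s of P(Result(a) = s | a, e) * U(s)."""
    return sum(p * U[s] for s, p in P_cost_given_site[action].items())

# argmax over the action values.
best = max(P_cost_given_site, key=expected_utility)
print({a: round(expected_utility(a), 2) for a in P_cost_given_site}, "->", best)
```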
Recap: Decision Networks
The main video on this topic is in chapter 5, clip 7.