27 - Recap Clip 5.7: Decision Networks [ID:30429]

Up to here, we've looked at utilities. Okay? So, if there are no questions, we basically put that into a little box, tie a nice bow around it, and move on. From now on we assume that we have a utility function, and that it is easy enough that we can actually deal with it.

So how do we actually make decisions? The idea here is that we use our Bayesian networks, which we have lying around anyway as our world model, our belief-state model, add a few bells and whistles to them, and extend our algorithms a little to make progress on this.

So here's our utility-based agent again. We've talked about the world models; the first line is essentially "how does the world evolve?". Then we have to model something about the actions: what are their consequences, and so on? Remember, last semester in planning we did things like that. We've talked about utility. And now the question is, of course: how do I decide in practice what to do? Best of all, we'd like to implement it all together easily.

It turns out that the following works quite well. We have a Bayesian network; that's the square part. We add the actions, essentially as random variables, to that network: new nodes in the network. And these action nodes have dependencies, or influences: where the airport is influences the cost, together with litigation and construction. It may very well be that we also have more dependencies or influences. But essentially these are new nodes for which we do not assess prior probabilities, because we have control over them. And the idea is that we just try out sufficiently many of their values, maybe all the ones we're not ruling out for common-sense reasons.

Then we can compute, given the evidence, situations where we, say, set the airport to be in the center of Erlangen, in the Schlosspark. That might be a possible site, for a small airport maybe. We can treat that just like evidence: we know the value of the variable "airport site". And then we can compute the probabilities and distributions of the other nodes given that; we have the techniques for that, as sketched below.
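Here is a minimal sketch of that idea in Python. The network, the variable names, and all the numbers are invented for illustration; they are not the lecture's actual model. The action node AirportSite gets no prior; we clamp it like evidence and push the distribution through the network by enumeration.

```python
# CPT: P(Litigation = yes | AirportSite)  -- made-up numbers
P_LIT = {"schlosspark": 0.7, "outskirts": 0.2}

# CPT: P(Cost = high | AirportSite, Litigation)  -- made-up numbers
P_HIGH = {
    ("schlosspark", True): 0.9, ("schlosspark", False): 0.6,
    ("outskirts",   True): 0.5, ("outskirts",   False): 0.2,
}

def cost_distribution(site):
    """P(Cost | AirportSite = site): the action node is clamped like
    evidence, and Litigation is summed out by enumeration."""
    p_high = sum(
        (P_LIT[site] if lit else 1.0 - P_LIT[site]) * P_HIGH[(site, lit)]
        for lit in (True, False)
    )
    return {"high": p_high, "low": 1.0 - p_high}

print(cost_distribution("schlosspark"))  # roughly {'high': 0.81, 'low': 0.19}
```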

Now the only thing we still have to do is compute the utility of those outcomes. And we can again put the utility as a node into our network. That takes a little bit of bending, of course, because so far we've only had random variables that were discrete, meaning they have finitely many values, and the utility is a real-valued function. That doesn't quite fit our maths. But then, what do we do? We have a node that has an uncountable range, so the machinery doesn't really apply. Well, we just make the range finite. How would you do that? The situation is, we have the real numbers; so let's just make them finite by cutting them into intervals and calling those one, two, three, four, five: five values (see the little sketch after this paragraph). Wonderful. We took out the big stick and hit the problem until it stopped fighting back. Of course we would like to have better methods, and one can actually build Bayesian networks with infinite or non-discrete ranges, but we haven't covered those, so in this class we have to do something like this. And this picture is even slightly wrong; does anybody see the problem? I think we had utilities that were positive, or non-negative actually, but we don't need that. It doesn't change the problem, though.
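As a tiny sketch of that interval trick, with made-up cut points:

```python
import bisect

# Made-up cut points partitioning the real utility line into five
# intervals, which become the discrete values 1..5.
CUTS = [-100.0, -10.0, 0.0, 10.0]

def discretize(u):
    """Map a real-valued utility to one of the five interval labels."""
    return bisect.bisect(CUTS, u) + 1

print(discretize(-42.0))  # -> 2, i.e. the interval [-100, -10)
```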

Okay, good. So we add these two kinds of nodes: one of them you can think of as input nodes, the others as output nodes. And then we basically have the algorithm: you cycle over the action nodes, give values to all of them, push everything through the network, get the probability distribution for the utility, compute the expected utility, and then maximize it. You remember, the maximum expected utility was essentially, for each action, a big sum over all the states that are reached by that action: the probability of the result being that state, given the evidence, times the utility of that state. And there was something else in there which I forget.
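For reference, this is presumably the standard expected-utility formula; the forgotten piece would be the conditioning on the action itself alongside the evidence:

```latex
EU(a \mid \mathbf{e})
  = \sum_{s'} P\bigl(\mathrm{Result}(a) = s' \mid a, \mathbf{e}\bigr)\, U(s')
\qquad
a^* = \operatorname*{argmax}_{a} EU(a \mid \mathbf{e})
```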

So once we've driven everything through the Bayesian network, we have the probability distribution here, so we can actually compute that sum. And then we essentially want to maximize this thing, which basically means putting an argmax over the actions α.
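Putting it all together, here is the whole loop as a self-contained Python sketch over the same made-up toy model as above: clamp each admissible action value as evidence, compute the expected utility, and keep the argmax.

```python
P_LIT = {"schlosspark": 0.7, "outskirts": 0.2}           # P(lit | site)
P_HIGH = {("schlosspark", True): 0.9, ("schlosspark", False): 0.6,
          ("outskirts",   True): 0.5, ("outskirts",   False): 0.2}
U = {"high": -10.0, "low": -1.0}                          # made-up utilities

def expected_utility(site):
    """EU(site): sum over cost outcomes of P(cost | site) * U(cost)."""
    p_high = sum((P_LIT[site] if lit else 1.0 - P_LIT[site])
                 * P_HIGH[(site, lit)] for lit in (True, False))
    return p_high * U["high"] + (1.0 - p_high) * U["low"]

ACTIONS = ["schlosspark", "outskirts"]     # the action values we try out
best = max(ACTIONS, key=expected_utility)  # the argmax over the actions
print(best, expected_utility(best))        # -> outskirts, EU roughly -3.34
```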


Recap: Decision Networks

The main video on this topic is chapter 5, clip 7.
