Good. But we don't do that here.
What we do instead is we look at decision theory.
Because just having clever agents is not enough.
We have agents that actually act on the world.
Because from the outside you can't really look into the box.
We can look inside only because we're the ones synthesizing them.
But when you run across an agent in the world,
be it a self-driving car or another computer science student,
or a fox running over the street or something like this,
there's no way to look inside them.
So we do look inside the agents we build,
but essentially, unless these agents act on the world,
we have no way of understanding what they do
or whether they might be intelligent.
That's what I'm trying to say. Good.
Okay. So we're going to do decision theory.
Basically, we're going to look at
the case of an agent in a non-deterministic world,
non-deterministic to various degrees.
We're going to make certain assumptions here,
but we're going to use Bayesian networks as
our world model to build on.
We're going to look at utility-based agents,
instead of reflex agents or something like this.
The realization is that this actually gives us what we've been
looking for for some time now,
namely rationality:
rationality defined as optimizing the expected outcome
over the long term.
We have actions, and for every action,
we can compute the expected utility.
Whatever the utility is,
for now we will just assume it is a function into the positive reals.
We want the expected utility of an action,
given some evidence about the world,
which is exactly the problem
the choice module needs to solve.
Given a world model,
how can I choose the best action,
where "best" is defined in terms of utility?
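In symbols, this choice is commonly written as the maximum-expected-utility rule. The notation here is an assumption borrowed from standard textbook treatments rather than something stated in the lecture: a ranges over the available actions, e is the evidence about the world, and EU(a | e) is the expected utility of taking action a given e.

```latex
a^{*} = \operatorname*{argmax}_{a} \; EU(a \mid e)
```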
That is something that's very simple in principle.
You just sum over all the states,
all the possible states that can be successor states.
You look at the probability of the result of your action
(remember, actions don't need to be deterministic)
being the state we're summing over,
given that we take the action
and given the same evidence we take as input.
Each of those probabilities
we weight with the utility of the state we reach.
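The weighted sum just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the action names, states, probabilities, and utilities are made up, and the outcome distribution for each action is assumed to have been computed already from the world model and the evidence (for instance by inference in a Bayesian network, which is not shown here).

```python
from typing import Dict

# Hypothetical utility function U(s') over successor states (positive reals).
utility: Dict[str, float] = {"on_time": 10.0, "late": 2.0, "stuck": 0.5}

# Hypothetical outcome model: for each action a, the distribution
# P(Result(a) = s' | a, e), assumed to be already conditioned on the evidence e.
outcome_model: Dict[str, Dict[str, float]] = {
    "highway": {"on_time": 0.7, "late": 0.1, "stuck": 0.2},
    "side_streets": {"on_time": 0.4, "late": 0.6, "stuck": 0.0},
}


def expected_utility(outcome_probs: Dict[str, float],
                     utility: Dict[str, float]) -> float:
    """Sum over successor states s' of P(s' | a, e) * U(s')."""
    return sum(p * utility[state] for state, p in outcome_probs.items())


# Expected utility of every action, then the action that maximizes it.
eu = {action: expected_utility(dist, utility)
      for action, dist in outcome_model.items()}
best_action = max(eu, key=eu.get)

print(eu)
print(best_action)
```

With these made-up numbers the highway has the higher expected utility even though it carries a risk of getting stuck, because its likely good outcome outweighs that risk in the weighted sum.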
Accessible via
Open access
Duration
00:05:08 min
Recording date
2021-03-30
Uploaded on
2021-03-31 10:48:11
Language
en-US
Recap: Introduction
Main video on the topic in chapter 5 clip 1.