So we've heard that we need utilities so that we can actually maximize the expected utility.
So you can ask yourselves: where do the utilities come from?
And that's what we're going to talk about in the next two video nuggets.
So the problem we face is that we can't really directly measure utility, that is,
my satisfaction with a state or my happiness in a state as an agent.
For instance, if I have to decide whether I want to go to class today or sleep in, what's
the utility of this lecture?
Well apart from the fact that it's obviously 42, what does that even mean?
But we already see: I have to decide whether I go to class or sleep in.
And that's something we usually make decisions about, whether or not we can actually tell that it's
the best decision.
But we usually decide those things.
So we can actually observe what agents, people, actually choose.
And so even if we can't define or even assess utilities, we can actually find out preferences.
Another example.
Somebody tells you: give me your phone or I will give you a bloody nose.
So you have to make a decision about that.
Do I prefer a state without the phone to one where I have a bloody nose?
And in a deterministic environment, that actually works.
I can take an action, typically hand over the phone or not.
So given two states, which we'll call prizes when we're talking about preferences, we
can express preferences of the form A ≻ B (A curly-greater B), which means I prefer, or the agent
prefers, A to B. Then A ∼ B (A squiggly B) means the agent is indifferent between A and B.
And of course we have the non-strict ordering A ⪰ B, in which case B is not preferred over A.
But in non-deterministic environments,
we often don't have the full information about the states we choose between.
For instance, because my actions are non-deterministic and I don't know whether the action will really
bring about the choice I make.
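The three relations just described (strict preference, indifference, and the non-strict ordering) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `rank` table and the helper name `make_relations` are hypothetical, and encoding observed choices as integer ranks is a convenience for the example, not something the lecture prescribes.

```python
# Minimal sketch: derive the three preference relations from observed choices.
# ASSUMPTION: observed choices are summarized as a hypothetical rank table,
# where a lower rank means "more preferred" and equal ranks mean indifference.

def make_relations(rank):
    """Return (strictly_prefers, indifferent, weakly_prefers) for a rank table."""

    def strictly_prefers(a, b):
        # A is strictly preferred to B.
        return rank[a] < rank[b]

    def indifferent(a, b):
        # The agent is indifferent between A and B.
        return rank[a] == rank[b]

    def weakly_prefers(a, b):
        # Non-strict ordering: B is not preferred over A.
        return not strictly_prefers(b, a)

    return strictly_prefers, indifferent, weakly_prefers


# Hypothetical prizes and ranks, made up purely for illustration.
rank = {"A": 0, "B": 1, "C": 1}
strictly_prefers, indifferent, weakly_prefers = make_relations(rank)
```

With this table, `strictly_prefers("A", "B")` holds, `indifferent("B", "C")` holds, and `weakly_prefers("B", "C")` holds because C is not strictly preferred over B.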
A typical thing is you get food in an airplane, and then you're typically being asked: do
you want chicken or pasta?
We can't see through the foil and we don't know whether we'll like what we're getting.
So we basically take a random shot.
So in a non-deterministic setting, we extend the preferences between choices
to include what we call lotteries. Lotteries are something where we have n possible outcomes,
which occur with probabilities P1 to Pn, where the sum of all these is one, so that it's a full
distribution.
And so that might be the result of a non-deterministic action, which can have A1 to An as
outcomes, each Ai with probability Pi.
Very often we can get by with a binary case, where with probability P I get
outcome A, and with probability one minus P outcome B.
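As a rough sketch of the lottery notion just described, here is one way to represent it in Python: a list of outcomes with probabilities that must form a full distribution, plus the binary special case. The names `Lottery`, `binary`, and `sample` are hypothetical, and the 0.5 used for the airplane meal is an arbitrary illustrative value, not a figure from the lecture.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Lottery:
    """Outcomes A1..An that occur with probabilities P1..Pn (hypothetical class)."""
    probs: tuple
    outcomes: tuple

    def __post_init__(self):
        if len(self.probs) != len(self.outcomes):
            raise ValueError("need exactly one probability per outcome")
        if any(p < 0 for p in self.probs):
            raise ValueError("probabilities must be non-negative")
        # The probabilities must sum to one, i.e. form a full distribution.
        if not math.isclose(sum(self.probs), 1.0):
            raise ValueError("probabilities must sum to 1")

    def sample(self, rng=random):
        """Draw one outcome, as a non-deterministic action would produce it."""
        return rng.choices(self.outcomes, weights=self.probs, k=1)[0]


def binary(p, a, b):
    """Binary case: outcome a with probability p, outcome b with 1 - p."""
    return Lottery((p, 1.0 - p), (a, b))


# ASSUMPTION: 0.5 is an arbitrary illustrative probability.
meal = binary(0.5, "chicken", "pasta")
```

Calling `meal.sample()` returns either `"chicken"` or `"pasta"`. Constructing a lottery whose probabilities do not sum to one raises a `ValueError`, which mirrors the requirement that the Pi form a full distribution.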
So we need to extend the preferences to include lotteries in non-deterministic environments.
So you need to make a decision when somebody tells you: give me your phone or I'll give
you a bloody nose.
Then you kind of weigh the chances, and you can decide to either hand over
the phone or kind of go in self-defense, and with a probability of P you actually win
the fight, and with a probability of one minus P you get a bloody nose anyway.
So that's the typical thing: you want to decide between a prize and a lottery, or even,
in the general case, between two lotteries.
Now if you look at preferences, you'll realize that for the preferences to be rational,
as we say, they have to obey a certain number of constraints.
Accessible via: Open access
Duration: 00:12:36 min
Recording date: 2021-05-09
Uploaded on: 2021-05-09 20:18:22
Language: en-US