So, and to understand that I would like to go over kind of the forms of learning, what
learning actually means.
So basically if it's true that the world is a POMDP, then we have to kind of learn transition
models and sensor models.
And it's also, as I already said, very nice to be able to learn because then you can have
agent that adapts to a changing environment, right?
The sensors are getting worse and worse.
You have to do something about that.
And of course, if you can learn, you can improve your performance just by being in the world.
And that's something that's been a very important kind of strand of AI philosophy is that whereas
early AI basically had the idea, well, let's write down an agent and then it'll do great
things.
Whereas now the idea is that you can't even approach and make AI without exposing an agent
to reality and giving it the chance to learn.
And as very often in AI, kind of the current fashion is to say, yeah, but if the agents
can learn, why would they actually know anything?
Right?
They can learn everything.
And so much of other stuff, which is about knowledge and so on, is becoming forgotten.
And to me, it seems that neither the kind of learning paradigm or the writing down knowledge
paradigm has won yet.
We don't have AI.
And my sense is that if we're ever going to reach AI or something like AI, then we should
really try to combining those things, which is why I would like to show you different
ways of doing stuff.
So the agent model, I've briefly shown it to you last semester, is I would like to see
is still the same overall shape.
We have the environment and we have an agent that gets percepts via the sensors and acts
on the world via some effectors.
But we have learning functionality added to it.
And here I have this new concept of a performance element.
Think about the old agents.
Put them into a little box.
And then add these three kind of things.
One is a learning element.
And we're going to mostly look at pure learning elements here.
What that does, it essentially changes the performance element.
Say we have a POMDP based performance element.
What could the learning element change?
Yes?
For instance, yes.
What else?
The sensor model.
What else?
Utilities, for instance, preferences.
Some people learn, oh, it's not so nice to eat meat after all because it actually meat
comes from these cute little animals that have big eyes, so I'm not going to like it
anymore.
Right?
So, these preferences, transition models, sensor models, reward models.
Presenters
Zugänglich über
Offener Zugang
Dauer
00:20:26 Min
Aufnahmedatum
2021-03-30
Hochgeladen am
2021-03-30 15:26:31
Sprache
en-US
How a learning agent works and the different forms of learning.