Let's get started for today. Welcome back. So last time we talked about recurrent networks,
which are networks that have a memory. We also talked about word vector
representations and how to do calculations with the meanings of words. And what I'm displaying
right here is, so to speak, the table of contents of this lecture series so far. We started
with generic neural networks that had no special structure, but which we could already use, say,
for image classification. Then we moved on to convolutional networks, which have a particular
structure in order to exploit translational invariance. Then we moved to autoencoders,
which are a way of doing representation learning, also very interesting in its own right. Then came
visualization, and finally what we discussed last time: recurrent networks and word vectors.

So what I want to start discussing today is another big branch of machine learning,
which goes under the name of reinforcement learning. Overall, people like to subdivide
machine learning into three branches. The first really big one is supervised learning.
That's the thing that we have been discussing most of the time. You have many training
examples, and you know the input and the correct output, like the image and the label.
That is supervised learning, and it makes up maybe 80% or so of machine learning. Then
there's this other part, which we only discussed relatively briefly, unsupervised learning, which
we had in the form of autoencoders. There you don't have any labels, you just have images,
and still you want to somehow compress the information or extract the essence of the
information. Sometimes you also want to cluster things, as in the t-SNE representation, so that
would be unsupervised learning. And then the third big branch is this reinforcement learning
that we are now going to study. Reinforcement learning is for all those cases where you do not
know the correct solution in advance; rather, you want to find a solution that optimizes something.
And it's the task of a neural network to find this optimal solution. We will see in a moment
what this really means.

Okay. So this will be a lecture about reinforcement learning, and probably
we will also use part of next time to discuss this. What you see in the background is already kind of
setting the stage, because that is a board game. One of the things you can do using reinforcement
learning is to discover good strategies to play games. But of course it can do much more than that.

So this tells the same story in slightly different pictures. Supervised learning, which again covers
most of machine learning, is really like having a teacher who is super smart and who tries to
teach a student by always giving examples of questions together with the correct answer. And
after a while, maybe the student starts to memorize the answers, and after a while maybe even
generalizes these answers slightly. So even if the student now sees an image of a dog that doesn't exactly
look like any of the dogs that it has seen before, it can still give the correct label. The
problem with this is that the level the student finally reaches is obviously limited, because the student
will probably never really get better than the teacher at doing these tasks.

Now, if you are a really ambitious student and want to become better than your teacher, or if you are a scientist
and want to discover new things, the question is: what do you do? There is really no very
deep answer to this. When you are faced with the unknown, so to speak, and want to figure out
smart strategies to solve problems that no one has solved before, the only thing that you can do
at first is just some trial and error. You try out this, you try out that, and most of the time it
doesn't work. Once in a while you stumble across something that works at least a little bit, and
then you should probably keep it and build on it by modifying it, by trying out
different versions of it. And so step by step you can actually become better. That's the basic
idea of reinforcement learning: you are reinforcing the few things that work, you are keeping them in
your repertoire, and you're building on those. And there you don't need a teacher, obviously;
no one needs to tell you what a good strategy for solving a problem is. The only thing that you need
to know is what constitutes a good solution. So if you are given a strategy that someone applies,
then you should be able to tell: yes, this is pretty good, yes, you are winning this game. At least
this information should be there. But other than that, you don't need to know anything. And so
you are not limited by the level of any teacher or of any training database or so,
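
The trial-and-error idea described above (try something, keep what works, build on it by modifying it) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the method used in the lecture: the `score` function, its hidden `target`, and the name `trial_and_error` are hypothetical, and the loop is plain random hill climbing rather than a full reinforcement learning algorithm. The point is that the learner is never told the correct solution; it can only ask how good a given attempt is.

```python
import random


def score(strategy):
    # Hypothetical "how good is this solution?" signal: higher is better.
    # The target is hidden inside the scorer; the learner never sees it.
    target = [0.3, -0.7, 0.5]
    return -sum((s - t) ** 2 for s, t in zip(strategy, target))


def trial_and_error(n_trials=2000, step=0.1, seed=0):
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    best = [0.0, 0.0, 0.0]             # arbitrary starting strategy
    best_score = score(best)
    for _ in range(n_trials):
        # Try a slightly modified version of the best strategy so far.
        candidate = [x + rng.gauss(0.0, step) for x in best]
        candidate_score = score(candidate)
        # Keep ("reinforce") the modification only if it works better.
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


best, best_score = trial_and_error()
print("best strategy:", best)
print("best score:", best_score)
```

With these toy settings the starting strategy scores roughly -0.83, and the kept modifications push the score step by step toward 0, without any teacher supplying the answer.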
Access: Open access
Duration: 01:33:36
Recording date: 2024-06-27
Uploaded on: 2024-06-28 11:09:03
Language: en-US