4 - Machine Learning for Physicists [ID:7836]

The following content has been provided by the University of Erlangen-Nürnberg.

OK, hello, good evening. This is the fourth lecture on machine learning. First, some organizational points. At the end, in principle, you can take an exam. The question is: how many of you would like to get a grade for this lecture, roughly? OK, I think that is enough people that I will replace the oral exams with an actual written exam. I hope that is OK with you. We still have to discuss when that will be. The other point is that this Thursday there will be a kind of tutorial. Regardless of whether you have already done the homework or not, this Thursday, here in this lecture hall at 6, Thomas Fersel will give a tutorial covering all the different homeworks that we did. So maybe you want to have a look at them again, and then you can discuss these things, including all the nitty-gritty details of the programming.

OK, so let's start. Last time, we finally made it to the mountaintop: we really understood, or at least I told you about, backpropagation, which is the algorithm that you use to train neural networks. Once you know backpropagation, you can basically do anything you want with neural networks. So today I want to do the following. I want to go through backpropagation once more, in a slightly different way, to tell you how it works and to remind you of these things. And then we want to apply it, for example, to compressing an image that we present to the network instead of a function.

OK, so what I'm going to do now is take the large-scale overview. I will not care about all the tiny indices; rather, you should get an overview of what backpropagation really does. So remember, what we wanted to do is calculate the gradient of the cost function with respect to some weight. All these network connections have their weights, and this is what defines the network. The cost function tells me the distance between what the network should do and what the network currently actually does, and I want to minimize this distance. In order to do that, I have to calculate the gradient.
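Just as a reminder of what the gradient is then used for: in plain gradient descent, each weight is shifted by a small step against this gradient, with some small learning rate, written here as eta. Schematically (the exact notation and step size are a matter of convention):

    w_new = w_old - \eta \, \frac{\partial C}{\partial w}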

So I want to represent this pictorially, which I have tried to do here. This is my network, with all the neurons connected by the connections that have their associated weights. And at the final output neurons, I placed a little symbol to denote the cost function. Because in order to calculate the cost function, what you do is take the output values of the network, compare them to the ideal, correct output values for this particular input that you sent through the network, and then, for example, take this difference squared to get a number that is never below zero and that will decrease as you do better in training your network. That then gives you the cost function. The cost function is a single value; it is a scalar function.
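Written out for such a quadratic cost, here is a minimal numpy sketch. The 1/2 prefactor and the example numbers are just conventional, illustrative choices:

    import numpy as np

    # y_out: output values of the network for one input sample (illustrative numbers)
    # y_target: the ideal, correct output values for that same input
    y_out = np.array([0.2, 0.7, 0.1])
    y_target = np.array([0.0, 1.0, 0.0])

    # quadratic cost: a single non-negative number (a scalar)
    C = 0.5 * np.sum((y_out - y_target) ** 2)

    # derivative of the cost with respect to the output values:
    # this difference is exactly the "first factor" discussed next
    dC_dy = y_out - y_target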

So now, if I want to find out how little changes in the cost function are connected to little changes in a weight, I have to find my way through the network, and there will be a path through this network. Let me do this. First, when we take the actual derivative, we know that what we get is the difference between the output of the network and the ideal, correct output that I would like to have. So what I will now do is go through this network, follow a single path, and show you all the factors that occur when you calculate this derivative. And because this is about memorizing things and not about looking at each and every single index, I will omit the indices. This is OK because, in principle, you could deduce the indices from this graphic. For example, when I write down this factor, the y should in principle come with an index, which stands for this particular neuron. If I had placed this thick black line at the first neuron, then the y would carry a different index, relating to the first neuron. So that's the way to read this.

OK, so this is the first factor, but it occurs only once, when I take the derivative of my cost function. Then you want to see how any change in the input to this neuron affects the output value of the neuron. We already know that this means taking the derivative of the nonlinear function: this neuron gets as its input the weighted sum of several neurons in the layer below, and it spits out a nonlinear function f of z. So if I am only interested in the tiny changes, I have to look at the derivative. And then I proceed. In calculating this derivative, we have seen that any connection contributes a factor given by the weight of this connection. And then here it is the same game as before again, so I would have to calculate the derivative of f.
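To make this chain of factors concrete, here is a small numpy sketch for a network with one hidden layer and a quadratic cost, written in the same spirit: one difference term from the cost function, a derivative of f at each neuron along the path, and the weight of each connection that is traversed. The specific sigmoid nonlinearity, the layer sizes, and the names (W1, W2, delta1, delta2, and so on) are illustrative assumptions, not the lecture's own code:

    import numpy as np

    def f(z):             # nonlinear activation, here a sigmoid as an example
        return 1.0 / (1.0 + np.exp(-z))

    def f_prime(z):       # its derivative
        s = f(z)
        return s * (1.0 - s)

    # small example network: 3 inputs -> 4 hidden neurons -> 2 outputs
    rng = np.random.default_rng(0)
    W1 = rng.normal(size=(4, 3)); b1 = np.zeros(4)
    W2 = rng.normal(size=(2, 4)); b2 = np.zeros(2)

    y_in = np.array([0.5, -0.1, 0.8])       # input values
    y_target = np.array([1.0, 0.0])         # desired output for this input

    # forward pass: each neuron gets the weighted sum z and outputs f(z)
    z1 = W1 @ y_in + b1;  y1 = f(z1)
    z2 = W2 @ y1 + b2;    y_out = f(z2)

    # backward pass: multiply the factors along the paths
    delta2 = (y_out - y_target) * f_prime(z2)   # first factor, times f'(z) at the output neuron
    delta1 = (W2.T @ delta2) * f_prime(z1)      # each connection contributes its weight, then f'(z) again

    # gradients of the cost with respect to the weights and biases
    dC_dW2 = np.outer(delta2, y1)
    dC_dW1 = np.outer(delta1, y_in)
    dC_db2 = delta2
    dC_db1 = delta1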

Part of a video series
Access: open access
Duration: 01:15:02 min
Recording date: 2017-05-29
Uploaded on: 2017-05-30 14:09:25
Language: en-US
