Good evening everyone.
I want to start with a short reminder of what we actually learned at the end of last lecture.
We were looking into image recognition, and the particular test case is a very famous
and important one: recognizing handwritten digits.
And so we discussed how this is done in principle: you construct a neural network
that takes an image as input and has several output neurons, each of which corresponds
to one particular digit.
So if it recognizes the digit three, then the neuron that belongs to the digit
three will fire.
And so we learned several different things like one-hot encoding and categorical cross-entropy
as a cost function.
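To make this concrete, here is a minimal sketch of such a network in Keras, assuming flattened 28x28 grayscale images; the layer sizes and the use of Keras are my own illustrative choices, not necessarily the lecture's exact setup.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# One output neuron per digit 0..9; the softmax makes the outputs behave
# like probabilities, so the neuron of the recognized digit "fires".
model = keras.Sequential([
    layers.Input(shape=(784,)),                # 28x28 image, flattened
    layers.Dense(100, activation="relu"),      # hidden layer (size chosen arbitrarily)
    layers.Dense(10, activation="softmax"),    # output: one neuron per digit
])

# Categorical cross-entropy as the cost function, which expects
# one-hot encoded labels: digit 3 -> [0,0,0,1,0,0,0,0,0,0]
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# One-hot encoding of integer labels:
labels = np.array([3, 1, 4])
labels_one_hot = keras.utils.to_categorical(labels, num_classes=10)
```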
But in the end, when we implemented it, we stumbled upon some interesting behavior.
So these are some images of digits that the network actually misclassifies.
And if we count, for the numerical experiment that we have been doing here, how
many are misclassified, it was about 77 percent.
And that was in spite of the fact that the accuracy during training seemed
very, very good, with apparently only three percent error on the training samples.
And so that was a little bit of a mystery, which is resolved by recognizing that you
have to be very honest in assessing how well you are doing.
If you only assess the quality on the training examples that you have already trained
the network on, then this does not give you a fair assessment of the quality of the network.
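As a small illustration of this point (a sketch with hypothetical array names, building on the model above): evaluate the same trained model once on the training examples and once on examples it has never seen, and compare the two numbers.

```python
# Hypothetical arrays: (x_train, y_train) were used for training,
# (x_unseen, y_unseen) were never shown to the network.
train_loss, train_acc = model.evaluate(x_train, y_train, verbose=0)
unseen_loss, unseen_acc = model.evaluate(x_unseen, y_unseen, verbose=0)

print(f"accuracy on training examples: {train_acc:.3f}")
print(f"accuracy on unseen examples:   {unseen_acc:.3f}")
# Only the second number is an honest assessment of the network's quality.
```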
So what people do (again, we discussed this last time and it is summarized again here)
is the following: they have a training set on which they train the network, and then they
have a validation set which the network never sees for the training, it is never trained on,
but which you can always use to assess the accuracy of the network during training and
watch it getting better and better.
And then, independently of that, there are the images to which the network will finally
be applied, which form the test set, so that is completely independent of everything.
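In code, this three-way split could look as follows; this is a sketch, with the split sizes and the Keras call my own illustrative assumptions, and with x and y denoting the full arrays of images and one-hot labels.

```python
# Illustrative split sizes for a dataset of roughly MNIST's size (70000 images).
n_train, n_val = 50000, 10000

x_train, y_train = x[:n_train], y[:n_train]
x_val,   y_val   = x[n_train:n_train + n_val], y[n_train:n_train + n_val]
x_test,  y_test  = x[n_train + n_val:], y[n_train + n_val:]   # kept untouched until the very end

# The validation set is only evaluated after each epoch, never trained on.
history = model.fit(x_train, y_train,
                    epochs=30, batch_size=128,
                    validation_data=(x_val, y_val))
```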
Okay, and so if you adopt this, if you have a validation set on which you do not train,
then you see the interesting behavior that is shown here.
The accuracy on the training data may keep increasing over time as you train more and
more on these training samples, but the accuracy on the validation data may level off
at a much lower value, and it might even decrease again. It is not so easily visible
here, but it is indicated by the arrow, and that is of course bad: you are training more
and more, yet the accuracy on the validation data even decreases.
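The curves just described can be plotted directly from the training history; this is a sketch, and the metric key names are those of recent Keras versions (older ones use "acc"/"val_acc").

```python
import matplotlib.pyplot as plt

# 'history' is the object returned by model.fit above.
plt.plot(history.history["accuracy"], label="training accuracy")
plt.plot(history.history["val_accuracy"], label="validation accuracy")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.legend()
plt.show()
# Typically the training curve keeps rising while the validation curve
# levels off at a lower value and may eventually turn down again.
```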
So the question is what's going on.
And what's going on is known as the phenomenon of overfitting.
So what really happens is that the network basically memorizes these training examples.
You have shown these training examples again and again and again to the network, and so
at some point it really knows: oh, this picture, where the pixels are arranged in exactly
this order, that's the three, because someone told me many, many times that this picture
must be labeled three.
But that doesn't mean it can generalize to other pictures of a three that look a little
bit different, because the pixels are arranged in a slightly different shape.
So that's really bad; it's like a student who doesn't understand what they are doing and just
memorizes something, and then, if you ask a new question, they cannot answer.
So the solution is, first of all, to always have this honest assessment of the accuracy
by measuring it against the validation data, which the network is not being trained on.
And then you stop the training after the validation accuracy has reached its maximum,
because after that things are only getting worse: the network just keeps overfitting.
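One common way to automate this "stop at the maximum of the validation accuracy" rule is an early-stopping callback; the following is a sketch of my own, not necessarily the lecture's implementation, reusing the model and split from the earlier sketches.

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop once the validation accuracy has not improved for 5 epochs,
# and roll back to the weights of the best epoch seen so far.
early_stop = EarlyStopping(monitor="val_accuracy",
                           patience=5,
                           restore_best_weights=True)

model.fit(x_train, y_train,
          epochs=100, batch_size=128,
          validation_data=(x_val, y_val),
          callbacks=[early_stop])
```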