Welcome back to Deep Learning. We want to continue our analysis of regularization
methods, and today I want to talk about classical techniques.
The field has kind of stabilized to the point where some core ideas from the 1980s are still
used today. In fact, my 1987 diploma thesis was all about that.
So here is a typical example of a loss curve over the iterations on the training set, and
what I show here on the right-hand side is the loss curve on the test set.
And you see that although the training loss goes down, the test loss goes up.
So at some point the training data set is overfitted, and the training no longer produces a model
that is representative of the data.
By the way, always keep in mind that the test set must never be used for training.
If you train on your test set, then you will get very good results, but they are very
likely a complete overestimate of the performance.
So there's the typical situation that somebody runs into my office and says: yes, I have
a 99% recognition rate.
The first thing that somebody in pattern recognition or machine learning asks when they read
"99% recognition rate" is: did you train on your test data?
This is the very first thing you make sure has not happened.
Usually it turns out there was some silly mistake, some data set pointer that was not pointing
to the right data set, and suddenly your recognition rate breaks down.
So be careful.
If you have very good results, always scrutinize whether they are really appropriate and whether
they really generalize.
You have to be very careful about this, because training on the test set simply does not give
you a valid estimate of the performance.
So instead, if you want to produce curves like the ones that I'm showing here, you may
want to use a validation set that you split off from the training data set: you never use it in
training, but you can use it to get an estimate of how much your model is overfitting.
So if you do that, then we can already use the first trick: with the validation set, we
observe at what point we have the minimum error on the validation set.
If we are at this point, we can use that as a stopping criterion and take that model
for our test evaluation.
So it's very typical to use the parameters with the minimum validation loss.
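To make this concrete, here is a minimal early-stopping sketch in Python. The names `train_one_epoch` and `validation_loss` are placeholders for your own training and evaluation routines, not functions from the lecture or from any particular library; the loop simply keeps the parameters with the minimum validation loss and stops once the validation loss has not improved for a while.

```python
import copy

def train_with_early_stopping(model, train_one_epoch, validation_loss,
                              max_epochs=100, patience=10):
    """Keep the parameters that achieve the minimum validation loss."""
    best_loss = float("inf")
    best_model = copy.deepcopy(model)
    epochs_without_improvement = 0

    for epoch in range(max_epochs):
        train_one_epoch(model)             # one pass over the training set
        val_loss = validation_loss(model)  # loss on the held-out validation set

        if val_loss < best_loss:
            best_loss = val_loss
            best_model = copy.deepcopy(model)  # remember the best parameters so far
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop: no improvement for `patience` epochs

    return best_model, best_loss  # the model at the validation minimum
```

In practice you would checkpoint `best_model` to disk and use exactly those parameters for the final test evaluation.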
Another very useful technique is data augmentation.
So the idea here is to artificially enlarge the data set.
Now you may ask: but how?
Well, the idea is that there are transformations under which the label should be invariant.
Let's say you have the image of a cat and you rotate it by 90 degrees, it still shows
a cat.
Obviously, those augmentation techniques have to be done carefully.
So in the right hand example, you can see that a rotation by 180 degrees is probably
not a good way of augmenting because it may switch the label.
There are very common transformations here: random spatial transforms like affine or elastic
transforms, and pixel transforms like changing the resolution, adding noise, or changing pixel
distributions such as color, brightness, and so on.
So these are typical augmentation techniques in image processing.
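As a sketch of how such a pipeline might look in code, assuming you work with torchvision (the specific transforms and ranges below are illustrative choices, not prescribed by the lecture), you could compose spatial and pixel transforms like this; the ranges should be chosen so that no transform can switch the label, as in the 180-degree rotation example above.

```python
import torchvision.transforms as T

# Illustrative augmentation pipeline: small spatial and pixel perturbations
# that keep the class label intact.
augment = T.Compose([
    T.RandomAffine(degrees=15,               # small random rotations
                   translate=(0.05, 0.05),   # small random shifts
                   scale=(0.9, 1.1)),        # small random zoom
    T.ColorJitter(brightness=0.2,            # pixel-level changes:
                  contrast=0.2,              # brightness, contrast,
                  saturation=0.2),           # and color distribution
    T.RandomHorizontalFlip(p=0.5),           # only if a mirrored image keeps its label
    T.ToTensor(),
])

# Applied on the fly during training, e.g.:
# train_set = torchvision.datasets.ImageFolder("data/train", transform=augment)
```

Because the transforms are sampled randomly in every epoch, the network effectively sees a much larger data set without any new images being collected.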
What else?
We can regularize in the loss function.
And here we can see that this essentially leads to maximum a posteriori estimation.
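To sketch why this is the case, take an L2 (weight decay) penalty as the concrete example; the notation below is mine, not taken from the slides. If we read the training loss as a negative log-likelihood and put a Gaussian prior on the weights, then maximizing the posterior is the same as minimizing the regularized loss:

```latex
\hat{\boldsymbol{\theta}}_{\mathrm{MAP}}
  = \arg\max_{\boldsymbol{\theta}} \; p(\boldsymbol{\theta} \mid \mathcal{D})
  = \arg\max_{\boldsymbol{\theta}} \; \log p(\mathcal{D} \mid \boldsymbol{\theta})
      + \log p(\boldsymbol{\theta})
  = \arg\min_{\boldsymbol{\theta}} \;
      \underbrace{-\log p(\mathcal{D} \mid \boldsymbol{\theta})}_{\text{data loss}}
      + \frac{\lambda}{2}\,\|\boldsymbol{\theta}\|_2^2 ,
\quad\text{for } p(\boldsymbol{\theta}) \propto
      \exp\!\left(-\tfrac{\lambda}{2}\|\boldsymbol{\theta}\|_2^2\right).
```

So adding a quadratic regularization term to the loss corresponds to maximum a posteriori estimation with a Gaussian prior on the weights; an L1 term would correspond to a Laplacian prior in the same way.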