5 - Deep Learning [ID:11461]

Good morning everybody.

Great to see everybody assembled here.

And this means that we can continue with our lecture on deep learning.

And today we want to talk about a couple of methods that aim at regularization, so that we can somehow restrict the learning and get rid of a couple of problems that we already identified earlier.

So okay, so let's see what we have here.

So we have a short introduction to regularization, then some of the classical techniques, in particular regularization in the loss function, then some normalization techniques, then an important technique called dropout that we will introduce towards the second part of this lecture, then things about initialization, how to choose the proper distributions for the weights for initialization, some transfer learning, and also multitask learning, which is a popular choice right now, and there are plenty of papers that show the effectiveness of multitask learning.
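As a quick preview of the "regularization in the loss function" item just mentioned, here is a minimal sketch of adding an L2 weight penalty to a data loss. This is not the lecture's own notation or code; the NumPy setting, the function name l2_regularized_loss, and the value of the weighting factor lam are illustrative assumptions.

    import numpy as np

    def l2_regularized_loss(data_loss, weights, lam=1e-4):
        """Data term plus lam times the squared L2 norm of all weight arrays."""
        penalty = sum(np.sum(w ** 2) for w in weights)
        return data_loss + lam * penalty

    # Hypothetical usage: the penalty grows with the magnitude of the weights,
    # which discourages overly flexible fits to the training data.
    weights = [np.random.randn(3, 3), np.random.randn(3)]
    print(l2_regularized_loss(data_loss=0.7, weights=weights, lam=1e-4))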

Okay, so let's introduce regularization a little. Essentially, what we are doing here is that we observe some discrete data that comes from some underlying, unknown distribution of classes.

And if we have a case like here, that's actually quite nice because this one is even linearly separable, and we would be able to find a classifier that is able to differentiate the two classes quite easily.

But now what often happens is that your data doesn't reveal the structure of the classes that nicely, and you get cases where you can't just find a decision boundary that easily.

So the fitting may be difficult because the distribution isn't that clear.

So this gives rise to a couple of solutions.

So for example, you could choose a boundary that is very rigid, like a line. You can still separate those two classes by just picking a line and then, let's say, making some sacrifices where you introduce some error.

But the big advantage of this kind of approach is that you do not overfit to your data.

Well, but this model class may just be too rigid, right?

It may not be an ideal decision boundary.

So the other approach would be to take a very flexible decision boundary.

So let's take a decision boundary that allows us to go back and forth and that allows a lot of curves.

Now if you do this, then you run into the problem of overfitting.

So in the simplest case, you could just memorize your entire training data, and if you memorize the entire training data, then you also get a zero loss because you simply assign the correct class to all of the observations.
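To make the memorization point concrete, here is a minimal sketch, assuming scikit-learn and a toy two-moons dataset (neither comes from the lecture): a 1-nearest-neighbour classifier effectively memorizes the training set and therefore reaches zero training error, while its accuracy on freshly sampled data is noticeably lower.

    from sklearn.datasets import make_moons
    from sklearn.neighbors import KNeighborsClassifier

    # Two overlapping classes sampled from the same (unknown) distribution.
    X_train, y_train = make_moons(n_samples=100, noise=0.3, random_state=0)
    X_new, y_new = make_moons(n_samples=1000, noise=0.3, random_state=1)

    # 1-NN assigns every training point its own stored label,
    # so the training error is exactly zero.
    memorizer = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
    print("training accuracy:", memorizer.score(X_train, y_train))   # 1.0
    print("accuracy on newly sampled data:", memorizer.score(X_new, y_new))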

This case here is still using a decision boundary that can be described as a line, but still you can see that this is probably not such a great solution.

Although it classifies your training data set very accurately, you may have trouble if you take a classification boundary like this one and sample data anew, because then it may simply be inaccurate.

So this is a decision boundary that introduces a high variation, a high variance, and this is a decision boundary that introduces a high bias.

So it has very few parameters.

If you sample the data again, you are likely to observe a similar decision boundary.

So the variance in parameters is not that great.

It's very rigid, but it introduces a lot of error.

So this one has a high bias, so this has a high error, and this one has a high variance.

So the error on the training data set is zero in this case, but still the decision boundary is not that trustworthy because you simply have very high variations, and probably what ...
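To make the bias-variance contrast concrete, here is a minimal sketch under illustrative assumptions (two-moons data, a logistic regression as the rigid "line" and an unpruned decision tree as the flexible boundary; none of this is the lecture's own example). It refits each model on freshly sampled training sets and measures how often the predicted class of fixed query points flips between refits, a rough proxy for the variance of the decision boundary.

    import numpy as np
    from sklearn.datasets import make_moons
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier

    # Fixed grid of query points on which the refitted boundaries are compared.
    grid = np.random.RandomState(42).uniform(-2.0, 3.0, size=(2000, 2))

    def refit_predictions(make_model, seed):
        # Sample a new training set from the same distribution and refit.
        X, y = make_moons(n_samples=100, noise=0.3, random_state=seed)
        return make_model().fit(X, y).predict(grid)

    for name, make_model in [
        ("rigid (line)", LogisticRegression),
        ("flexible (deep tree)", DecisionTreeClassifier),
    ]:
        preds = [refit_predictions(make_model, seed) for seed in range(10)]
        # Fraction of query points whose predicted class changes between refits.
        disagreement = np.mean([np.mean(p != preds[0]) for p in preds[1:]])
        print(name, "disagreement across resampled fits:", round(disagreement, 3))

The rigid model should show a much smaller disagreement across resamples (low variance, at the cost of some training error), while the flexible one changes substantially every time the data is sampled anew.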

Part of a video series:

Accessible via: Open Access

Duration: 01:13:55 min

Recording date: 2019-05-23

Uploaded on: 2019-05-23 21:39:04

Language: en-US

Deep Learning (DL) has attracted much interest in a wide range of applications such as image recognition, speech recognition and artificial intelligence, both from academia and industry. This lecture introduces the core elements of neural networks and deep learning; it comprises:

  • (multilayer) perceptron, backpropagation, fully connected neural networks

  • loss functions and optimization strategies

  • convolutional neural networks (CNNs)

  • activation functions

  • regularization strategies

  • common practices for training and evaluating neural networks

  • visualization of networks and results

  • common architectures, such as LeNet, Alexnet, VGG, GoogleNet

  • recurrent neural networks (RNN, TBPTT, LSTM, GRU)

  • deep reinforcement learning

  • unsupervised learning (autoencoder, RBM, DBM, VAE)

  • generative adversarial networks (GANs)

  • weakly supervised learning

  • applications of deep learning (segmentation, object detection, speech recognition, ...)
