Welcome back, everybody, to our Deep Learning lecture; today we want to talk about architectures.
Today is the lecture where we actually go deep.
So previously we essentially learned all the ingredients and today we really want to look into the deep architectures.
And note that we are talking about architectures whose main application is classification.
So this is a perceptual task where you try to identify something from the underlying data.
And you have no prior knowledge whatsoever of how to construct the network or how this perceptual task can best be solved.
Okay, so today we first want to talk about the early architectures, which essentially gave rise to the really deep ones.
And then we want to talk about the really popular deeper models.
And at the end we want to give an outlook: is there also a way to learn the architecture itself?
Because these designs are essentially heuristics, nice ideas that people came up with and that work really well.
And there is a new architecture being published almost every day.
So remember that this lecture cannot describe all of the possible architectures.
And two days after the lecture it may already be outdated because a new architecture has shown up.
So what we want to show in this lecture is essentially the main ingredients for building deep models.
But there are really many, many papers on different architectures appearing.
And obviously there are also many degrees of freedom in how to do it.
And as you don't have any prior knowledge about the task, you essentially have to guess.
You have to find some way to do that.
And one possible way of tackling that is shown at the end of this lecture, where we actually want to look into learning architectures.
And obviously this is even an order of magnitude more complex than just training the models.
So today we are already using a cluster to train models for a week or a month.
But if you now imagine that you start learning the architectures, this has to include the entire training process for essentially every architecture sample that you want to evaluate.
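To get a feeling for this cost, here is a back-of-the-envelope sketch in Python. The function name `search_cost_days` and all numbers (100 candidate architectures, one week of training each, 10 parallel workers) are made-up assumptions for illustration, not figures from the lecture, and the sketch assumes the naive case where every candidate is trained to completion.

```python
def search_cost_days(num_candidates, train_days_per_candidate, parallel_workers=1):
    """Naive architecture search: train every candidate to completion.

    Returns (total compute days, wall-clock days). All inputs are
    hypothetical values chosen by the caller.
    """
    total_compute_days = num_candidates * train_days_per_candidate
    # Ceiling division: number of rounds needed to get through all candidates.
    rounds = -(-num_candidates // parallel_workers)
    wall_clock_days = rounds * train_days_per_candidate
    return total_compute_days, wall_clock_days


# Assumed example: 100 candidates, 7 days each, 10 machines in parallel.
compute_days, wall_days = search_cost_days(100, 7, parallel_workers=10)
print(compute_days, wall_days)  # 700 70
```

So even with these modest assumed numbers, a single search costs 700 compute days instead of the 7 days needed for one model.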
Okay, so let's look at the typical data sets that are used for object recognition.
One very popular one is ImageNet. You've seen that already.
It has more than 14 million images.
And the subsets that are typically used in the Large Scale Visual Recognition Challenges have about a thousand classes.
So you can see some examples here.
There are classes such as airplane and automobile, and you have to assign a class to every image.
And these are natural images of varying size.
There is also a simplified, smaller data set, the CIFAR data set, which comes as CIFAR-10 and CIFAR-100 with 10 or 100 classes.
And this only has 50,000 training samples and 10,000 test samples.
And those images are essentially thumbnails.
So they are 32 by 32 pixels, and you can use them to train networks more quickly and get an estimate of how well a particular network performs on this task.
So they are very small.
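Just how small this is can be sketched with a short calculation. The image counts and the 32 by 32 resolution are the ones given above; that each image has 3 color channels stored as one byte per value is an assumption of this sketch, and `dataset_bytes` is a hypothetical helper name.

```python
def dataset_bytes(num_images, height=32, width=32, channels=3, bytes_per_value=1):
    """Raw, uncompressed pixel storage for a set of equally sized images.

    channels=3 and bytes_per_value=1 (8-bit RGB) are assumptions.
    """
    return num_images * height * width * channels * bytes_per_value


train_bytes = dataset_bytes(50_000)
test_bytes = dataset_bytes(10_000)
print(train_bytes / 1e6, "MB for training,", test_bytes / 1e6, "MB for testing")
```

Under these assumptions the whole training set is roughly 150 MB of raw pixels, which easily fits into memory and is one reason why experiments on it are quick.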
Okay, so let's look at some early architectures.
And one of the networks that made it pretty far is LeNet.
And you see LeNet-5 here, which was already published in 1998.
And this one is not so deep.
So you have essentially convolutions, feature maps, then subsampling, again convolutions, feature maps that are pooled.
And then you have a fully connected layer here at the end.
So what you want to demonstrate with such a network is that you have some hierarchy of convolutions.
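The alternation of convolution and subsampling can be traced as a sequence of feature-map shapes. The following is a minimal sketch: the layer names, the 5x5 kernels, the 2x2 subsampling, and the feature-map counts (6, 16, 120) are taken from the commonly cited LeNet-5 configuration with a 32 by 32 input, not from this lecture, so treat them as assumptions. `conv_out` and `trace_lenet5` are hypothetical helper names.

```python
def conv_out(size, kernel, stride=1):
    """Output width/height of a convolution or pooling without padding."""
    return (size - kernel) // stride + 1


def trace_lenet5(input_size=32):
    """Return (layer name, feature maps, height, width) for each layer."""
    # (name, kernel size, stride, number of feature maps) -- assumed values.
    layers = [
        ("C1 conv 5x5", 5, 1, 6),
        ("S2 pool 2x2", 2, 2, 6),
        ("C3 conv 5x5", 5, 1, 16),
        ("S4 pool 2x2", 2, 2, 16),
        ("C5 conv 5x5", 5, 1, 120),
    ]
    size = input_size
    shapes = []
    for name, kernel, stride, maps in layers:
        size = conv_out(size, kernel, stride)
        shapes.append((name, maps, size, size))
    return shapes


for name, maps, height, width in trace_lenet5():
    print(f"{name}: {maps} feature maps of {height}x{width}")
```

In this sketch the spatial size shrinks 32, 28, 14, 10, 5, 1, so after the last convolution only a feature vector is left, which is what the fully connected part then classifies.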
And you use these pooling layers in order to increase the receptive field of your classifier.
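How much pooling helps the receptive field can be sketched with the standard recurrence: each layer enlarges the receptive field by (kernel - 1) times the accumulated stride of all layers before it. The layer lists below (5x5 convolutions, 2x2 pooling with stride 2) are assumptions chosen to resemble a LeNet-style stack, and `receptive_field` is a hypothetical helper name.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one output unit.

    layers: list of (kernel_size, stride) tuples, applied in order.
    """
    field, jump = 1, 1  # jump = distance in input pixels between neighboring units
    for kernel, stride in layers:
        field += (kernel - 1) * jump
        jump *= stride
    return field


# Three 5x5 convolutions with 2x2 pooling in between (assumed stack).
with_pooling = [(5, 1), (2, 2), (5, 1), (2, 2), (5, 1)]
# The same three convolutions without any pooling.
without_pooling = [(5, 1), (5, 1), (5, 1)]

print(receptive_field(with_pooling), receptive_field(without_pooling))  # 32 13
```

With the two pooling steps the final units see 32 input pixels in each direction, compared with 13 without pooling.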
And so this was already quite successful.
So it already had convolutions for spatial features.
And we have marked those key features here, some in orange and some with normal bullets.
And orange indicates that the technique is still used in current architectures.
And this is also the highlight of this network: it introduced those convolutions for spatial features.
It had subsampling using average pooling.
Average pooling is not so popular anymore.
As the nonlinearity it still used tanh.
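As a reminder of what subsampling by average pooling computes, here is a minimal pure-Python sketch. It assumes non-overlapping 2x2 windows with stride 2 on a single feature map given as a list of rows; the function name `average_pool_2x2` and the example values are made up for illustration.

```python
def average_pool_2x2(feature_map):
    """Average non-overlapping 2x2 windows of a 2D list.

    A trailing odd row or column is dropped.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            window_sum = (feature_map[r][c] + feature_map[r][c + 1]
                          + feature_map[r + 1][c] + feature_map[r + 1][c + 1])
            pooled_row.append(window_sum / 4)
        pooled.append(pooled_row)
    return pooled


example = [
    [1, 3, 0, 2],
    [5, 7, 4, 6],
    [0, 0, 8, 8],
    [2, 2, 8, 8],
]
print(average_pool_2x2(example))  # [[4.0, 3.0], [1.0, 8.0]]
```

Each output value is the mean of four inputs, so the feature map is halved in both directions.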
Access: open access
Duration: 01:07:27
Recording date: 2018-05-23
Uploaded: 2018-05-23 15:39:03
Language: en-US