Welcome back everybody.
In today's lecture we will discuss different architectures for deep learning.
So this is now the part where we really start looking into the deep and very deep networks
and I hope you will enjoy this lecture.
So we will discuss today the early architectures; the overall structure of today's lecture
will be slightly historical.
We'll look into the early architectures, then into the deeper models, discuss some of the
problems that may emerge if you go deeper, and in the end we will also look into how
to generate architectures more or less automatically.
In many cases you of course need training data, so let's name a few data sets that are very
typically used in the development and evaluation of architectures.
First, of course, the ImageNet data set, which has already popped up on several occasions in
this lecture.
It has a thousand classes and roughly 14 million images, it has also been used in subsets for
the ImageNet Large Scale Visual Recognition Challenge, and it contains natural images of
varying size.
What's also popular are the CIFAR-10 and CIFAR-100 data sets, which contain either 10 or
100 classes, and you can see that they are considerably smaller.
The reason you sometimes want to work with smaller data sets is of course that training is
much quicker: if you want to evaluate different ideas, you can start with a smaller data set
before you go to the full one.
What's particularly interesting about this data set is that it has very small images, so if
you have some ideas on general classification systems, it provides 32 by 32 images.
So there are different data sets for different purposes, and we will also see towards the
end of this lecture that these data sets are now being solved at a level that essentially
approaches the label quality.
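To make those sizes concrete, here is a minimal sketch of loading CIFAR-10 (using torchvision as an assumption; the lecture does not prescribe a particular framework or data path):

import torchvision
import torchvision.transforms as transforms

# Hypothetical setup: download CIFAR-10 into ./data and inspect its size.
transform = transforms.ToTensor()
train_set = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

print(len(train_set), len(test_set))  # 50000 10000 -- much smaller than ImageNet
image, label = train_set[0]
print(image.shape)  # torch.Size([3, 32, 32]) -- the very small 32 by 32 RGB images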
Okay, so let's look a bit into the early architectures.
One of the first ones that we want to mention is LeNet, which was already published in 1998,
and one of the key features introduced in this network are convolutions for the spatial
features. You may see this figure, or a lot of figures derived from it, quite often, because
it introduces those convolutional layers with weight sharing, essentially building a
circulant matrix, and this is the main feature that is still used today.
The other things that have been implemented in LeNet are not so commonly used anymore:
it uses sub-sampling with average pooling, and the nonlinearity is the tangens hyperbolicus
(tanh), whereas we now typically use rectified linear units.
Then there is a sparse connectivity between the early layers here and in the end you have
a multilayer perceptron as the classifier.
So this mimics very typical feature extraction approaches, where you have convolutions,
then the sub-sampling, and in the end the fully connected layers.
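As a rough illustration, such a LeNet-style stack of convolutions, average pooling, tanh nonlinearities, and a fully connected classifier could be sketched in PyTorch as follows (the framework and the exact layer sizes are assumptions for a 32 by 32 input, not the original implementation):

import torch
import torch.nn as nn

class LeNetStyle(nn.Module):
    # Sketch of a LeNet-style network: convolutions with weight sharing,
    # average pooling for sub-sampling, tanh nonlinearities, and a
    # multilayer perceptron as the classifier.
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),   # 32x32 -> 28x28
            nn.Tanh(),
            nn.AvgPool2d(2),                  # 28x28 -> 14x14
            nn.Conv2d(6, 16, kernel_size=5),  # 14x14 -> 10x10
            nn.Tanh(),
            nn.AvgPool2d(2),                  # 10x10 -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120), nn.Tanh(),
            nn.Linear(120, 84), nn.Tanh(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

out = LeNetStyle()(torch.randn(4, 1, 32, 32))  # a batch of four 32x32 grayscale images
print(out.shape)  # torch.Size([4, 10])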
Still, this network is a kind of landmark network because it was the first one that
introduced those convolution and pooling operations.
So we like to mention this paper here, and note that the bullets have coloring: what is
shown in orange is the kind of idea or technique that is still being used quite frequently.
You can see that these networks often introduce many ideas when an architecture is first
suggested, but one of the features is key for its good performance, and we try to highlight
that feature in orange.
So LeNet is not really a deep network.
It only has a couple of layers stacked on top of each other, so you could argue it's not a
very deep network.