Welcome back to deep learning and we are back to a new session where we want to talk about
a couple of exciting topics.
So let's see what I've got for you.
So today I want to start discussing different architectures and in particular in the first
couple of videos I want to talk a bit about the early architectures, the things that we've
seen in the very early days of deep learning and we will follow them by looking into deeper
models in later videos and in the end we want to talk about learning architectures.
A lot of what we'll see in the next couple of slides and videos has of course been developed
for image recognition and object detection tasks.
And in particular two datasets are very important for these kinds of tasks.
This is the ImageNet dataset, which you find in Reference 11.
It has more than 14 million images, and subsets of it with 1,000 classes have
been used for the ImageNet Large Scale Visual Recognition Challenges (ILSVRC).
It contains natural images of varying size so a lot of these images have actually been
downloaded from the internet.
There are also smaller datasets if you don't want to train with millions of images
right away.
Also very important are the CIFAR datasets, CIFAR-10 and CIFAR-100, which have
10 and 100 classes respectively, and there we only have 50k training and 10k testing images.
The images have a reduced size of 32 by 32 pixels, so you can very quickly explore the
pros and cons of different architectures, and with these smaller datasets training also
doesn't take so long.
So this is also a very common dataset if you want to evaluate your architecture.
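As a small sketch, assuming you use PyTorch with the torchvision package (the dataset is of course available through many other loaders as well), you can download CIFAR-10 and verify these numbers yourself:

```python
# Minimal sketch: load CIFAR-10 via torchvision and check the sizes quoted above
# (50k training / 10k testing images, 32 by 32 pixels, 10 classes).
import torchvision

train_set = torchvision.datasets.CIFAR10(root="./data", train=True, download=True)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False, download=True)

print(len(train_set), len(test_set))  # 50000 10000
print(train_set[0][0].size)           # (32, 32) -- each sample is a small RGB image
print(len(train_set.classes))         # 10
```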
Okay, so based on these different datasets we then want to go ahead and look into the
early architectures, and I think one of the most important ones is LeNet, which was published
in 1998 in Reference 9, and you can see this is essentially the convolutional
neural network as we have been discussing so far.
It has been used, for example, for letter recognition.
We have the convolutional layers with trainable kernels, then pooling, another
set of convolutional layers and another pooling operation, and towards the end we
go into fully connected layers that gradually reduce the dimensionality; at the very end
we have the output layer that corresponds to the number of classes.
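To make this layer sequence concrete, here is a rough LeNet-style sketch in PyTorch. This is not the original implementation: the filter counts follow the 1998 paper, but details such as the exact placement of the non-linearity and the dense (instead of sparse) connectivity between S2 and C3 are simplifications, and the class name LeNetStyle is chosen purely for illustration.

```python
import torch
import torch.nn as nn

# LeNet-style sketch: convolution -> pooling -> convolution -> pooling ->
# fully connected layers that gradually reduce -> output layer (one unit per class).
class LeNetStyle(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),    # C1: trainable 5x5 kernels
            nn.Tanh(),                         # tangens hyperbolicus as non-linearity
            nn.AvgPool2d(2),                   # S2: sub-sampling by average pooling
            nn.Conv2d(6, 16, kernel_size=5),   # C3 (densely connected here for simplicity)
            nn.Tanh(),
            nn.AvgPool2d(2),                   # S4
        )
        self.classifier = nn.Sequential(       # multilayer perceptron as the final classifier
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120),
            nn.Tanh(),
            nn.Linear(120, 84),
            nn.Tanh(),
            nn.Linear(84, num_classes),        # output layer matches the number of classes
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# A 32x32 single-channel input yields one score per class:
print(LeNetStyle()(torch.randn(1, 1, 32, 32)).shape)  # torch.Size([1, 10])
```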
So this is a very typical CNN type of architecture, and this kind of approach has been used in
many papers and has inspired a lot of work.
For every architecture here we list key features, and you can see that most of the bullets are
in grey.
That means that most of these features did not survive, but of course what survived here
was convolution for spatial features.
This is the main idea that is still prevalent.
All the other things did not survive: the sub-sampling using average pooling, the
tangens hyperbolicus (tanh) that was still used as the non-linearity, and the fact that it is a not-so-deep model.
Then it had sparse connectivity between the S2 and C3 layers, as you see here in the figure.
That is also not that common anymore.
The multilayer perceptron as the final classifier is also something that we no longer see,
because it has been replaced, for example, by fully convolutional networks, which are a much better approach.
Also, the sequence of convolution, pooling, and non-linearity is kind of fixed, and today we
would do that in a much better way.
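Just to illustrate what such a "better way" typically looks like today (this is a common modern pattern, not something prescribed by any specific architecture): a comparable block is usually written with a ReLU non-linearity and max pooling after the convolution.

```python
import torch.nn as nn

# One common modern ordering of the block: convolution -> ReLU -> max pooling.
modern_block = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5, padding=2),  # padding keeps the spatial size
    nn.ReLU(),                                  # ReLU instead of tanh
    nn.MaxPool2d(2),                            # max pooling instead of average pooling
)
```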
But of course this architecture is fundamental for many of the further developments and I
think it is really important that we are also listing it here.
The next milestone that I want to talk about in this video is AlexNet.