16 - Deep Learning - Activations, Convolutions, and Pooling Part 4 [ID:16886]

Welcome back to deep learning. So today I want to talk to you about the actual pooling implementation. Pooling layers are an essential step in many deep networks. The main idea behind them is to reduce the dimensionality across the spatial domain. Here we see a small example where we summarize the information in the green, blue, yellow, and red rectangles to only one value each: a two-by-two input has to be mapped onto a single value.
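As a minimal sketch of this step (my own NumPy illustration, not code from the lecture), one can split the feature map into non-overlapping two-by-two blocks and summarize each block with a single number; here I simply take the maximum of each block, and the choice of summary statistic is discussed next.

```python
import numpy as np

# A toy 4x4 feature map.
x = np.array([[1, 3, 2, 0],
              [4, 6, 1, 1],
              [0, 2, 5, 7],
              [1, 1, 8, 2]], dtype=float)

# Split into non-overlapping 2x2 blocks: axes (block_row, block_col, row, col).
blocks = x.reshape(2, 2, 2, 2).transpose(0, 2, 1, 3)

# Each 2x2 block is summarized by a single value (here: its maximum).
pooled = blocks.max(axis=(2, 3))
print(pooled)   # -> [[6. 2.]
                #     [2. 8.]]
```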

Now this of course reduces the number of parameters required, it introduces some hierarchy, it allows you to work with spatial abstraction, and it reduces computational cost and overfitting. But we need some basic assumptions here. One of the assumptions is that the features are hierarchically structured: by pooling we reduce the output size and introduce a hierarchy that should be intrinsically present in the signal. We talked about this before, the eyes that are composed of edges and lines, and then faces that are composed of eyes, nose, and so on. This structure has to be present in order to make pooling a sensible operation to encode into your network.

So here you see pooling of a three-by-three layer, and here we choose max pooling. In max pooling, only the highest number in the receptive field is actually propagated into the output. Obviously, we can also work with striding, and the stride typically equals the neighborhood size such that we get a reduced output dimension.
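A generic forward pass with window size k and stride s (with s = k, as in the lecture) could look like the following NumPy sketch; the function name and the toy input are my own assumptions.

```python
import numpy as np

def max_pool2d_forward(x, k=3, s=3):
    """Max pooling over a single-channel map x with window k and stride s."""
    h, w = x.shape
    out_h = (h - k) // s + 1          # reduced output dimensions
    out_w = (w - k) // s + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Only the highest value in the receptive field is propagated.
            out[i, j] = x[i * s:i * s + k, j * s:j * s + k].max()
    return out

x = np.arange(36, dtype=float).reshape(6, 6)
print(max_pool2d_forward(x, k=3, s=3))   # -> [[14. 17.]
                                         #     [32. 35.]]
```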

One problem here, of course, is that the maximum propagation adds an additional non-linearity, and therefore we also have to think about how to resolve this step in the gradient procedure. What we do is essentially again introduce a subgradient concept, where we simply propagate the gradient into the cell that has produced the maximum output. So you could say the winner takes it all.
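The corresponding backward step can be sketched by remembering where the maximum was in each window and routing the incoming error only to that cell; this is my own illustrative code, not the lecture's.

```python
import numpy as np

def max_pool2d_backward(x, grad_out, k=2, s=2):
    """Route grad_out back to the positions that produced the maxima."""
    grad_x = np.zeros_like(x)
    out_h, out_w = grad_out.shape
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * s:i * s + k, j * s:j * s + k]
            # Position of the winning cell inside the window.
            r, c = np.unravel_index(np.argmax(window), window.shape)
            # Subgradient: only the winner receives the error signal.
            grad_x[i * s + r, j * s + c] += grad_out[i, j]
    return grad_x

x = np.array([[1., 3., 2., 0.],
              [4., 6., 1., 1.],
              [0., 2., 5., 7.],
              [1., 1., 8., 2.]])
g = np.ones((2, 2))                  # pretend upstream gradient
print(max_pool2d_backward(x, g))     # non-zero only at the maxima
```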

Now, an alternative to this is average pooling, where we simply compute the average over the neighborhood. It does not consistently perform better than max pooling. In the backpropagation path, the error is then simply shared in equal parts and backpropagated to the respective units.
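Again as a hedged NumPy sketch (the names are mine): the forward pass takes the mean of each window, and the backward pass distributes the incoming error in equal shares of 1/(k·k) over the window.

```python
import numpy as np

def avg_pool2d_forward(x, k=2, s=2):
    out_h, out_w = (x.shape[0] - k) // s + 1, (x.shape[1] - k) // s + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i * s:i * s + k, j * s:j * s + k].mean()
    return out

def avg_pool2d_backward(x_shape, grad_out, k=2, s=2):
    grad_x = np.zeros(x_shape)
    for i in range(grad_out.shape[0]):
        for j in range(grad_out.shape[1]):
            # The error is shared equally among the k*k input cells.
            grad_x[i * s:i * s + k, j * s:j * s + k] += grad_out[i, j] / (k * k)
    return grad_x

x = np.arange(16, dtype=float).reshape(4, 4)
y = avg_pool2d_forward(x)                             # [[ 2.5  4.5], [10.5 12.5]]
print(avg_pool2d_backward(x.shape, np.ones_like(y)))  # every input cell gets 0.25
```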

There are many more pooling strategies, like fractional max pooling, Lp pooling, stochastic pooling, spatial pyramid pooling, and generalized pooling; there is a whole set of different strategies here. One alternative that we already talked about is strided convolution.

This became really popular because you then do not have to encode the max pooling as an additional step, and you reduce the amount of computation. Typically, people now use strided convolutions with s greater than 1 in order to implement convolution and pooling at the same time.
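As a small PyTorch sketch (my own example, with arbitrary channel counts): a convolution with stride 2 produces the same output resolution as a stride-1 convolution followed by two-by-two max pooling, so the downsampling is folded into the convolution itself.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 16, 32, 32)           # batch, channels, height, width

# Convolution followed by a separate pooling step ...
conv_then_pool = nn.Sequential(
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.MaxPool2d(kernel_size=2, stride=2),
)

# ... versus a single strided convolution (s > 1) that downsamples directly.
strided_conv = nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1)

print(conv_then_pool(x).shape)           # torch.Size([1, 32, 16, 16])
print(strided_conv(x).shape)             # torch.Size([1, 32, 16, 16])
```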

So let's recap how our convolutional networks are doing. We talked about the convolutions producing feature maps, the pooling reducing the size of the respective feature maps, then again convolutions, again pooling, until we end up at an abstract representation, and then we had these fully connected layers in order to do the classification.
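To make the recap concrete, here is a minimal PyTorch sketch of that classic pipeline; the specific layer sizes and the 32x32 input are arbitrary choices of mine, not the lecture's.

```python
import torch
import torch.nn as nn

# Classic pattern: convolution -> pooling -> convolution -> pooling -> fully connected.
classic_cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                       # 32x32 -> 16x16
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                       # 16x16 -> 8x8
    nn.Flatten(),                          # abstract representation, flattened
    nn.Linear(32 * 8 * 8, 10),             # fully connected classifier
)

x = torch.randn(1, 3, 32, 32)
print(classic_cnn(x).shape)                # torch.Size([1, 10])
```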

Well, actually, we can kick out this last block, because we have seen that if we reformat the final feature maps into the channel direction, we can replace the fully connected layers with a one-by-one convolution and just apply that in order to get our final classification. So we reduce the number of building blocks further, and we do not even need fully connected layers here anymore. Everything then becomes fully convolutional, and we can express essentially the entire chain of operations by convolutions and pooling steps.
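This replacement can be sketched as follows (my own code, with hypothetical sizes): the final features are reformatted into the channel direction, and a one-by-one convolution carrying the same weights computes exactly what the fully connected layer computed.

```python
import torch
import torch.nn as nn

features = torch.randn(1, 32, 8, 8)        # output of the last pooling stage

# Fully connected classifier on the flattened features ...
fc = nn.Linear(32 * 8 * 8, 10)
y_fc = fc(features.flatten(start_dim=1))

# ... is equivalent to reformatting into the channel direction
# and applying a one-by-one convolution with the same weights.
conv1x1 = nn.Conv2d(32 * 8 * 8, 10, kernel_size=1)
conv1x1.weight.data = fc.weight.data.view(10, 32 * 8 * 8, 1, 1)
conv1x1.bias.data = fc.bias.data
y_conv = conv1x1(features.reshape(1, 32 * 8 * 8, 1, 1))

print(torch.allclose(y_fc, y_conv.flatten(start_dim=1)))   # True
```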

The nice thing about using the one-by-one convolutions is that if you combine them with something that is called global average pooling, then you can essentially also process input images of arbitrary size. So the …
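A hedged sketch of that combination (again my own code with arbitrary layer sizes): only convolutions and pooling, a one-by-one convolution mapping the feature channels to class scores, and global average pooling over whatever spatial extent remains, so the same network accepts differently sized inputs.

```python
import torch
import torch.nn as nn

fully_conv = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 10, kernel_size=1),      # 1x1 convolution: per-location class scores
    nn.AdaptiveAvgPool2d(1),               # global average pooling over the spatial domain
    nn.Flatten(),                          # (N, 10) class scores
)

# The same network handles input images of different sizes.
print(fully_conv(torch.randn(1, 3, 32, 32)).shape)   # torch.Size([1, 10])
print(fully_conv(torch.randn(1, 3, 64, 48)).shape)   # torch.Size([1, 10])
```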

Part of a video series
Accessible via: Open Access
Duration: 00:08:48 min
Recording date: 2020-05-30
Uploaded on: 2020-05-30 20:16:49
Language: en-US


This video presents max and average pooling, introduces the concept of fully convolutional networks, and hints at how this is used to build deep networks.


Further Reading:
A gentle Introduction to Deep Learning

Tags

Perceptron, Introduction, artificial intelligence, deep learning, machine learning, pattern recognition