4 - Deep Learning [ID:9073]

Welcome back everybody to our deep learning lecture. Just to put everything into context: so far we've seen that deep learning has changed our research quite a bit, and a lot of things are now possible. Classification tasks and multi-class problems that people considered to be extremely hard can now be solved with rather high accuracy. We wanted to look into that in a bit more detail, so we started looking into the background.

And we started with the perceptron, which is essentially just a single neuron, and we've seen that the perceptron is not able to solve problems that are not linearly separable, such as the XOR problem.
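As a quick reminder (the notation here is generic, not necessarily the exact symbols used on the slides), the perceptron computes a single linear decision

\[
\hat{y} = \operatorname{sign}\!\left(\mathbf{w}^{\top}\mathbf{x} + b\right),
\]

so its decision boundary is a single hyperplane, and no single hyperplane can separate the four XOR points correctly.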

But then we've seen that if we expand this into layers, we can start modeling much more complex functions.

And we've seen the so-called universal approximation theorem, which tells us that a neural network with just one hidden layer is already able to approximate essentially any continuous function. So this is a very powerful result. But the theorem only gives an error bound, and this epsilon may be rather large if we don't have enough neurons in the hidden layer. And unfortunately, we don't know how many neurons we actually need.
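For reference, one common way of writing the statement (this particular formulation is my own paraphrase, not taken verbatim from the slides) is that for any continuous f on a compact set and any epsilon > 0 there exist a finite width N, weights and biases, and coefficients such that

\[
\left| f(\mathbf{x}) - \sum_{i=1}^{N} v_i\, \sigma\!\left(\mathbf{w}_i^{\top}\mathbf{x} + b_i\right) \right| < \epsilon
\quad \text{for all } \mathbf{x} \text{ in that set},
\]

where sigma is a suitable non-constant activation function. The theorem guarantees that such an N exists, but it does not tell us how large N has to be.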

So we've then seen that some problems are rather hard to model with just a single hidden layer. But expanding to a second layer already increases the modeling capability quite a bit, and we could, for example, map any decision tree onto a two-layer neural network. So this essentially gave us a motivation for building deeper networks.

And then we've also seen that when we start building those deeper networks, we also have to think about the optimization. In particular, we train with the backpropagation algorithm, and sometimes this can be numerically unstable. There may be problems like vanishing or exploding gradients, which we will also talk about a little today.
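To recall where these instabilities come from (a generic view of a network with D layers and intermediate activations x^(d); the notation is my own shorthand): backpropagation multiplies the Jacobians of all layers along the way,

\[
\frac{\partial L}{\partial \mathbf{x}^{(0)}}
= \frac{\partial L}{\partial \mathbf{x}^{(D)}}
\prod_{d=D}^{1} \frac{\partial \mathbf{x}^{(d)}}{\partial \mathbf{x}^{(d-1)}},
\]

so if the norms of these factors are consistently smaller than one, the product shrinks exponentially with depth (vanishing gradients), and if they are consistently larger than one, it grows exponentially (exploding gradients).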

In particular, we've seen different methods for optimizing this gradient descent and for selecting the step size. There are rather clever algorithms that automatically adjust the step sizes for individual parameters, depending on how strongly they vary. And we've seen that we can use momentum to take previous directions of our gradient descent into account, such that we can dampen oscillations in the descent procedure.
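As a small illustration of the momentum idea (a minimal NumPy sketch of plain momentum updates, not the exact variant or hyperparameters from the lecture):

import numpy as np

def sgd_momentum(grad_fn, w, lr=0.01, beta=0.9, steps=100):
    # The velocity accumulates previous gradient directions and damps oscillations.
    v = np.zeros_like(w)
    for _ in range(steps):
        g = grad_fn(w)          # gradient of the loss at the current weights
        v = beta * v - lr * g   # blend the previous direction with the new gradient
        w = w + v               # step along the accumulated velocity
    return w

# Toy example: a badly conditioned quadratic 0.5*(w1^2 + 50*w2^2), where plain
# gradient descent with the same step size would oscillate along the steep axis.
grad = lambda w: np.array([1.0, 50.0]) * w
print(sgd_momentum(grad, np.array([1.0, 1.0])))

Per-parameter methods such as Adagrad or Adam would additionally rescale each component of the step by an estimate of that parameter's gradient magnitude, which is presumably the kind of automatic adjustment mentioned above.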

So now we want to talk a bit more about the details of this optimization process. In particular, we want to talk about activation functions, because so far we've only had a very coarse look at the different activation functions, and today we want to go into more detail about them.

And we want to talk about convolutional neural networks and how we can actually model convolution, a very powerful operation, in such a neural network. And you will see that it's actually not that hard to model ...
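To give a rough idea of the operation the upcoming lectures build on, here is a minimal NumPy sketch of a single-channel 2D convolution as it is typically used in CNNs (technically a cross-correlation, with no padding and stride 1; this is my own illustration, not code from the lecture):

import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over the image and compute a weighted sum at each position.
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 3x3 vertical-edge filter applied to a random 8x8 "image".
img = np.random.rand(8, 8)
k = np.array([[1., 0., -1.],
              [1., 0., -1.],
              [1., 0., -1.]])
print(conv2d(img, k).shape)  # (6, 6) feature map

The key point is that the same small set of kernel weights is reused at every spatial position, which is why convolution can be expressed as a heavily weight-shared neural network layer.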

Part of a video series
Access: open access
Duration: 01:19:17
Recording date: 2018-05-02
Uploaded: 2018-05-02 17:59:21
Language: en-US
Tags: reconstruction, energy, deep, spatial, rule, feature, pattern, exercise, layer, activation, problem, analytic, forward, recognition, learning, functions, function, classification