
Welcome back to deep learning. In today's lecture we want to talk about activations and convolutional neural networks. We've split this up into several parts, and the first one will be about classical activation functions. Later we will talk about convolutional neural networks, pooling, and the like. So let's start with activation functions. You can see that activation functions go back to a biological motivation. Remember that everything we've done so far was also, in some way, motivated by its biological counterpart.

We see that biological neurons are connected to other neurons via synapses. This way they can actually communicate with each other. The axons carry a myelin sheath that insulates them electrically, so they can transmit signals to other cells. When they communicate, they do not simply pass on everything they receive; they have a selective mechanism. A stimulus alone does not suffice to generate an output signal: the total signal must exceed a threshold, and only then is an action potential triggered. Afterwards the cell repolarizes and returns to the resting state. Interestingly, it does not matter how strongly the cell is activated; it always fires the same action potential and then returns to its resting state.

The actual biological activation is even more complicated. You have different axons, and they are connected to the synapses of other neurons. Along the way they are covered with Schwann cells so that they can deliver the action potential towards the next synapse. There are also ion channels that stabilize the entire electrical process and bring the system back into equilibrium after the activation pulse.

So what we can see is that the knowledge essentially lies in the connections between the neurons. We have both inhibitory and excitatory connections. The synapses anatomically enforce feed-forward processing, so this is very similar to what we have seen so far. However, those connections can run in any direction, so they can also form cycles, and you get entire networks of neurons connected via different axons in order to form different cognitive functions. Crucial is the sum of activations: only if the sum of activations exceeds the threshold will the neuron actually produce an activation. These activations are electric spikes of a fixed intensity, and the whole system is also time dependent. Hence, the neurons also encode information over time. It is not just a single event that passes through; the whole process runs at a certain firing frequency, and this enables processing over time.
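As an aside, the "sum of activations above a threshold" mechanism can be sketched in a few lines of Python; the weights, threshold, and spike amplitude below are made-up illustration values, not something from the lecture.

    import numpy as np

    def threshold_unit(inputs, weights, threshold=1.0, spike=1.0):
        # Weighted sum of incoming activations (excitatory: w > 0, inhibitory: w < 0).
        total = np.dot(weights, inputs)
        # All-or-nothing response: if the sum exceeds the threshold, the unit
        # always emits the same spike, regardless of how large the sum is.
        return spike if total > threshold else 0.0

    # Hypothetical example with three incoming connections, one of them inhibitory.
    x = np.array([0.9, 0.4, 0.7])
    w = np.array([0.8, 0.6, -0.5])
    print(threshold_unit(x, w))  # 0.0 here, since the weighted sum 0.61 stays below the threshold

Note that, unlike the biological neuron, this sketch has no notion of time or firing frequency.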

Now, the activations we have used in artificial neural networks so far were nonlinear activation functions, mainly motivated by universal function approximation. If we don't have the nonlinearities, we can't get a powerful network; without them we would just end up with a matrix multiplication. Compared to biology, we could use the sign function to model the all-or-nothing response. Generally, our activations have no time component; at best, this could be modeled by the strength of the activation. Of course, the sign function is also mathematically undesirable, because its derivative is zero everywhere except at zero, where it is infinite. So it is absolutely not suited for backpropagation.
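Written out (standard notation, not shown explicitly in the video), the sign function and its derivative in the distributional sense are

    \operatorname{sign}(x) =
    \begin{cases}
      +1, & x > 0 \\
       0, & x = 0 \\
      -1, & x < 0
    \end{cases}
    \qquad
    \frac{\mathrm{d}}{\mathrm{d}x}\operatorname{sign}(x) = 2\,\delta(x),

i.e. the derivative is zero for every x ≠ 0 and a Dirac impulse at x = 0, so no useful gradient can flow backwards through such a unit.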

Hence, we have been using the sigmoid function, because we can compute an analytic derivative. Now the question is: can we do better? So let's look at some activation functions. The simplest one we can think of is a linear activation, where we just reproduce the input, possibly scaled by some parameter alpha, as the output. If we do so, the derivative is simply alpha. It is very simple, and it would render the entire optimization into a convex problem. But if we don't introduce any nonlinearity, we are essentially stuck with matrix multiplications, so we only list it here for completeness. It would not allow us to build deep neural networks as we know them.
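To make the "stuck with matrix multiplications" point concrete, here is a small numpy sketch (my own illustration, with arbitrary shapes and a made-up alpha) showing that two stacked linear layers collapse into a single one:

    import numpy as np

    rng = np.random.default_rng(0)

    # Two layers with a linear activation f(x) = alpha * x.
    alpha = 0.5
    W1 = rng.normal(size=(4, 3))
    W2 = rng.normal(size=(2, 4))
    x = rng.normal(size=3)

    h = alpha * (W1 @ x)   # first layer; its Jacobian w.r.t. x is alpha * W1
    y = alpha * (W2 @ h)   # second layer

    # The identical mapping written as one matrix multiplication:
    W_collapsed = alpha ** 2 * (W2 @ W1)
    print(np.allclose(y, W_collapsed @ x))  # True -- the extra layer adds no expressive power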

Now, the sigmoid function was the first one we started with. It essentially saturates towards one and zero, so it has a probabilistic interpretation of the output, which is very nice. However, it saturates for x going towards very large or very small values. You can see here that the derivative already becomes very small at around three.
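For reference, the sigmoid and its analytic derivative (standard formulas, not quoted verbatim from the video) are

    \sigma(x) = \frac{1}{1 + e^{-x}}, \qquad
    \sigma'(x) = \sigma(x)\bigl(1 - \sigma(x)\bigr),

so, for example, σ'(0) = 0.25 but σ'(3) ≈ 0.045 and σ'(5) ≈ 0.007: in the saturated regions the gradient is nearly zero, which is exactly the issue the plot of the derivative illustrates.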


Deep Learning - Activations, Convolutions, and Pooling Part 1

This video presents the biological background of activation functions and the classical choices that were used for neural networks.


Further Reading:
A gentle Introduction to Deep Learning
