Let's assume that the solution we found before was not good enough.
Is the only option then to give up, or can we do something else?
The something else you can do is to go in the direction of deep feedforward neural networks.
So why deep?
At first it looks like a contradiction: you know that with one hidden layer you can, in principle, do everything. So why should I take several hidden layers to come to an output?
The fact that one hidden layer can in principle do everything does not mean that this is the optimal solution for your input-output problem.
It is, however, more complicated to learn, because in the forward pass you go from an input vector to something else, then to something else, and to something else again, which means the relationship between input and output becomes very indirect.
So what can you do so that this indirectness, in the forward and in the backward pass, does not show up as crazy solutions at the end?
What you have to do is rearrange your deep neural network in such a way that every hidden layer knows: I have to contribute something good to the final solution.
The way to do this, and it is an architectural analogy to the boosting algorithm of Freund and Schapire, is the following. Going from the inputs through one hidden layer to an output is the normal feedforward neural network; in principle it can do everything.
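As a concrete reference point, here is a minimal sketch of that baseline in PyTorch; the lecture does not prescribe any framework, and the names and sizes (n_in, n_hidden, n_out) are placeholders I chose, not values from the lecture.

```python
import torch
import torch.nn as nn

class OneHiddenLayerNet(nn.Module):
    """Baseline: input -> one nonlinear hidden layer -> linear output."""
    def __init__(self, n_in, n_hidden, n_out):
        super().__init__()
        self.hidden = nn.Linear(n_in, n_hidden)   # input -> hidden layer
        self.out = nn.Linear(n_hidden, n_out)     # hidden layer -> output

    def forward(self, x):
        # a single wide nonlinear hidden layer is what the universal
        # approximation statement refers to
        return self.out(torch.tanh(self.hidden(x)))

net = OneHiddenLayerNet(n_in=4, n_hidden=16, n_out=1)
print(net(torch.randn(8, 4)).shape)   # torch.Size([8, 1])
```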
But now let's take another hidden layer that sits above the first one. So that we do not lose information, we also make a direct connection from the input to this next hidden layer.
And so that this new layer learns something as well, we present the target value a second time at its output. But how can I couple the different layers so that they really help each other?
The way to do so is to say: the first hidden layer has more or less learned the solution, and this solution is handed as an offset to the next stage. The output of the second stage is then a sum of the value coming from the first stage and the value coming from the new hidden layer.
In other words, the output generated here is a superposition of the offset coming from below and the new information coming from the new layer, and hopefully this is able to give us a better solution than the original output.
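A minimal sketch of this coupling, again in PyTorch and under the assumption that the combination is a plain sum; the class and attribute names (ResidualStage, head) are my own, not from the lecture.

```python
import torch
import torch.nn as nn

class ResidualStage(nn.Module):
    """One additional stage: it sees the raw input via a direct connection
    and adds its own correction on top of the offset from the stage below."""
    def __init__(self, n_in, n_hidden, n_out):
        super().__init__()
        self.hidden = nn.Linear(n_in, n_hidden)
        self.head = nn.Linear(n_hidden, n_out)

    def forward(self, x, offset):
        # output = offset coming from below + new information from this stage
        return offset + self.head(torch.tanh(self.hidden(x)))
```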
So the idea is that, step by step, the higher levels only have to learn the residual error; they do not have to focus on the same thing that the level below has already done. The second stage does not see the same deviation between output and target, because its output is already better than the one below.
At the first stage, the output is only the computation of the first hidden layer, and output minus target gives you an error; at the second stage the output should be better, and the residual between output and target is only a small error, which is propagated back to give you, step by step, a more refined explanation of the final output.
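In symbols (my own notation, not shown in the lecture): with input $x$ and target $y$, the first stage produces $\hat{y}_1 = f_1(x)$ and sees the error $y - \hat{y}_1$; the second stage produces $\hat{y}_2 = \hat{y}_1 + f_2(x)$, so the residual $y - \hat{y}_2$ it passes on is already smaller, and each further stage $f_k$ only has to model the residual $y - \hat{y}_{k-1}$ left over by the stages below it.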
So why is this a better solution than our universal approximation statement? With one very large hidden layer you can explain everything, but then you really have to have a very large hidden layer to satisfy the universal approximation theorem.
And if you have a very large hidden layer and something changes in one part of it, then in parallel something has to change elsewhere to keep everything consistent, so that in superposition your output is still explained. Here, in contrast, you have the chance to say: okay, let's try it, and if it is not good enough, let's hand the responsibility to the next layer, and the next, and so on. This is a more sequential type of learning.
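One way to read this sequential idea in code is a boosting-like, stage-wise training loop, sketched here with the ResidualStage from above; the toy data, loss, and hyperparameters are made up for illustration, and the lecture may equally well mean end-to-end training with auxiliary targets at every stage.

```python
import torch

# toy data and sizes, invented for illustration
n_in, n_hidden, n_out, num_stages = 4, 16, 1, 3
x = torch.randn(256, n_in)
y = torch.sin(x.sum(dim=1, keepdim=True))

offset = torch.zeros(x.shape[0], n_out)      # before the first stage nothing is explained
stages = []
for _ in range(num_stages):
    stage = ResidualStage(n_in, n_hidden, n_out)
    opt = torch.optim.SGD(stage.parameters(), lr=0.05)
    for _ in range(200):
        opt.zero_grad()
        pred = stage(x, offset)              # offset + new correction from this stage
        loss = ((pred - y) ** 2).mean()      # full target, but only a small residual remains
        loss.backward()
        opt.step()
    offset = stage(x, offset).detach()       # freeze what has been explained so far
    stages.append(stage)
```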
Now here you see such a thing in the software: the input goes to all the different hidden layers, all the hidden layers produce their outputs, the outputs are stacked on top of each other, and then you have the final output.
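One possible reading of that software picture, sketched again in PyTorch (the actual software shown in the lecture is not reproduced here), is a network in which the same input feeds every hidden layer and the per-layer contributions are summed into the final output:

```python
import torch
import torch.nn as nn

class StackedResidualNet(nn.Module):
    """The input feeds every hidden layer; the final output is the
    superposition (sum) of all per-layer contributions."""
    def __init__(self, n_in, n_hidden, n_out, num_stages):
        super().__init__()
        self.hiddens = nn.ModuleList(nn.Linear(n_in, n_hidden) for _ in range(num_stages))
        self.heads = nn.ModuleList(nn.Linear(n_hidden, n_out) for _ in range(num_stages))

    def forward(self, x):
        out = torch.zeros(x.shape[0], self.heads[0].out_features, device=x.device)
        for hidden, head in zip(self.hiddens, self.heads):
            out = out + head(torch.tanh(hidden(x)))   # stack each stage's output on top
        return out

net = StackedResidualNet(n_in=4, n_hidden=16, n_out=1, num_stages=3)
print(net(torch.randn(8, 4)).shape)   # torch.Size([8, 1])
```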