So now we want to give a short reminder of the lecture from the winter semester, where
the focus was feedforward neural networks, which means in the next few minutes we will
speak about feedforward neural networks.
And if we do so, the summary of the mathematics for feedforward neural networks is what
I have put on this slide here.
On the one side we want to study complex systems with many variables; on the other side
we want to have nonlinearities in them.
And neural networks are a function class that is organized differently from the Taylor
expansion approach: in a Taylor expansion you always have a sum of more and more
complicated terms, whereas here you have a concatenation of functions.
We have linear algebra to describe the interaction of the variables, then a nonlinear
function, and then the outcome vector is fed into the next matrix calculation, and so on.
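A compact way to write down this layer-by-layer structure, using generic symbols W_k for the matrices, b_k for the constant vectors, and sigma for the elementwise nonlinearity (this is generic notation, not necessarily the slide's own):

```latex
h^{(0)} = x, \qquad
h^{(k)} = \sigma\!\left(W_k\, h^{(k-1)} + b_k\right), \quad k = 1, \dots, L-1, \qquad
y = W_L\, h^{(L-1)} + b_L
```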
So the trick is that you need the nonlinearities in between, because otherwise you could
multiply out all these matrices and you would have only linear algebra.
So linear algebra alone is not able to move beyond linear algebra, but if you have
the intermediate nonlinearities, then you are in a different range of complexity.
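To spell out that collapse argument with the same generic symbols:

```latex
W_2\,(W_1 x) = (W_2 W_1)\,x = \widetilde{W} x
\quad\text{(two matrices without a nonlinearity collapse to one single matrix),}
\qquad
W_2\,\sigma(W_1 x) \neq \widetilde{W} x \ \text{in general.}
```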
And in 1989, Hornik, Stinchcombe, and White could prove that if you have only two rounds
of linear algebra with one nonlinearity in between, that is enough to have a general-purpose
approximation model for continuous functions on a compact domain.
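Written as a formula, the approximators in that 1989 result have roughly this one-hidden-layer form (again generic notation, not the slide's):

```latex
f(x) \;\approx\; W_2\,\sigma\!\left(W_1 x + b_1\right) + b_2,
\qquad f \ \text{continuous on a compact domain } K \subset \mathbb{R}^n .
```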
Now, that's the design part of the neural network description.
What is the algorithmic part of it?
You have matrices in between here, and so you need a way to identify the parameters in
these matrices.
And this is the error backpropagation algorithm, which I want to explain in the following way.
Let's simplify the equation by neglecting the constant vectors, so that you have a
simplified concatenation of linear algebra with nonlinearities in between.
So let's take the simplest case here: one matrix, the nonlinearity, another matrix,
and then the output vector.
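With the constant vectors neglected, this most simple case reads:

```latex
y = W_2\,\sigma\!\left(W_1 x\right)
```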
And such things you can always describe in the form of an architecture.
And this architectural way of thinking is even more important in the actual lecture in
this summer term.
But let's go on.
Whenever I have a vector, I show it in the form of an ellipsoid or a circle: here, here,
here.
Whenever I have a matrix multiplication, I show it as a line, so as an arrow between
the ellipsoids.
Input vector, matrix multiplication, the outcome is a vector, and the nonlinearity is
applied element by element to the elements coming out of the linear algebra, which means
it is a vector computation.
So therefore it sits inside this ellipsoid here.
And then you have the matrix multiplication a second time, and then you have an output
vector.
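Here is a minimal numerical sketch of that forward pass in Python; the shapes, the tanh nonlinearity, and the variable names are illustrative choices, not taken from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two weight matrices with random entries; the constant vectors are neglected.
W1 = rng.standard_normal((8, 4))   # first matrix: 4 input values -> 8 hidden values
W2 = rng.standard_normal((3, 8))   # second matrix: 8 hidden values -> 3 output values

def sigma(z):
    # Elementwise nonlinearity applied to the vector coming out of the linear algebra.
    return np.tanh(z)

def forward(x):
    h = sigma(W1 @ x)   # matrix multiplication, then elementwise nonlinearity
    y = W2 @ h          # second matrix multiplication gives the output vector
    return y, h

x = rng.standard_normal(4)   # input vector
y, h = forward(x)
print(y)                     # output vector computed from the random matrices
```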
And if you have random numbers in W1 and W2 here, then if you feed in an input, the
resulting output will have nothing to do with the output data that you want to have on
this side.
So therefore you have a deviation between output and target.
And therefore you have to take this information and compute it back through the whole
network architecture, to show how much you have to change the parameters inside this
matrix and inside this matrix here.
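Continuing the sketch above, this is one standard way to compute that deviation back through both matrices; the squared-error loss and the tanh derivative are assumptions of the sketch, not something fixed by the lecture:

```python
def backward(x, t):
    """Gradients of the squared error 0.5 * ||y - t||^2 with respect to W1 and W2."""
    y, h = forward(x)
    delta_out = y - t                        # deviation between output and target
    grad_W2 = np.outer(delta_out, h)         # how much to change the second matrix
    # Send the deviation back through W2 and through the tanh nonlinearity.
    delta_hidden = (W2.T @ delta_out) * (1.0 - h**2)
    grad_W1 = np.outer(delta_hidden, x)      # how much to change the first matrix
    return grad_W1, grad_W2

t = np.zeros(3)              # a placeholder target vector
gW1, gW2 = backward(x, t)
W1 -= 0.01 * gW1             # one small gradient-descent step on each matrix
W2 -= 0.01 * gW2
```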