So for instance, when dealing with problems of data classification and supervised learning,
we have shown how one can use residual neural networks in a given dimension d in order to
make the information evolve, layer by layer, and classify the data successfully, so that
eventually the trained network can be used for generalization, that is, to classify new,
unseen data. As we have seen, the perspective we adopted was that of nonlinear dynamical
systems, either continuous or discrete in time, governed by a sigmoid nonlinearity and
regulated through time-dependent parameters, the controls. These controls take into account
the need to shift the separating interface of the network in the ambient space so as to
classify better, to choose the cutting hyperplanes that determine which region of the space
moves and which one stays frozen, and also to fix the direction along which the moving region
is transported. And this was done in dimension d, with a given width for the network.
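To make that perspective concrete, here is a minimal sketch, with the step size h and the controls W_k, A_k, b_k named by me for illustration rather than taken from the lecture, of one forward pass of such a residual network viewed as an explicit Euler discretization of the controlled dynamics x'(t) = W(t) σ(A(t)x(t) + b(t)):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def resnet_forward(x, W, A, b, h=0.1):
    """Discrete-time controlled dynamics x_{k+1} = x_k + h * W_k @ sigmoid(A_k @ x_k + b_k).

    x    : (d,) state (one data point in dimension d)
    W, A : lists of (d, d) control matrices, one per layer
    b    : list of (d,) bias vectors, one per layer
    """
    for Wk, Ak, bk in zip(W, A, b):
        # A_k x + b_k = 0 is the cutting hyperplane: on one side the sigmoid is
        # nearly 0 (frozen region), on the other it is active and the point is
        # pushed in the direction encoded by W_k (moving region).
        x = x + h * Wk @ sigmoid(Ak @ x + bk)
    return x

# toy usage: 5 layers in dimension d = 2, random controls
rng = np.random.default_rng(0)
d, L = 2, 5
W = [rng.standard_normal((d, d)) for _ in range(L)]
A = [rng.standard_normal((d, d)) for _ in range(L)]
b = [rng.standard_normal(d) for _ in range(L)]
print(resnet_forward(np.array([1.0, -0.5]), W, A, b))
```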
But you see that this idea can easily be extrapolated also to the context of neural networks.
Of course, it is a very general idea and the details have to be worked out later; the analysis
might even be complicated. Indeed, the analysis of the perfectly matched layers that were
originally introduced for Maxwell's equations and for elasticity, where one is actually dealing
with a system of equations, is not simple. But the idea is quite generic.
As we said, given any system inside this box, any process, any continuous medium evolving inside
this green box, we can always set up a thin layer around it. In this layer the equation can be
tuned almost arbitrarily because, after all, once we solve the equation in the enlarged domain,
we neglect everything that has been added artificially and focus only on the equation and the
solution inside the original domain. Even though we have added an artificial term, that term is
only active on the exterior layer, so by restricting the solution to the original domain we
preserve the true equation. As I said, the analysis still needs to be justified, but
computationally this is extremely useful and not so hard to implement, and one quickly realizes
the impact that such a damping mechanism has on the behavior of the solutions.
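To see that impact numerically, here is a rough sketch, with grid sizes, damping strength and layer width chosen by me for illustration, of the 1D wave equation with a crude absorbing sponge layer: viscous damping active only outside the original interval, which is simpler than a genuine perfectly matched layer but conveys the same mechanism.

```python
import numpy as np

# wave equation u_tt + sigma(x) u_t = u_xx on an extended interval:
# the "true" domain is [0, 1]; the exterior layer occupies [-0.3, 0) and
# (1, 1.3], where sigma > 0 absorbs outgoing waves.
dx, dt = 1e-3, 5e-4                              # grid sizes (CFL: dt/dx < 1)
x = np.arange(-0.3, 1.3 + dx, dx)
sigma = np.where((x < 0) | (x > 1), 50.0, 0.0)   # damping only in the exterior layer

# initial data: a bump centered in the original domain, zero initial velocity
u_prev = np.exp(-200 * (x - 0.5) ** 2)
u = u_prev.copy()

c2 = (dt / dx) ** 2
for n in range(4000):
    lap = np.zeros_like(u)
    lap[1:-1] = u[2:] - 2 * u[1:-1] + u[:-2]
    # leapfrog step with the damping term treated semi-implicitly
    u_next = (2 * u - (1 - 0.5 * dt * sigma) * u_prev + c2 * lap) / (1 + 0.5 * dt * sigma)
    u_next[0] = u_next[-1] = 0.0                 # Dirichlet ends of the *extended* domain
    u_prev, u = u, u_next

# restrict to the original domain [0, 1]: this is the solution we keep
interior = u[(x >= 0) & (x <= 1)]
print("max amplitude left in [0, 1]:", np.abs(interior).max())
```

Running the same loop with sigma set to zero everywhere makes the bump bounce back and forth off the artificial walls indefinitely, which is precisely the spurious reflection the exterior layer is meant to suppress.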
So this idea could also be implemented in the context of deep neural networks. How would you do
it? If I asked you: can you extrapolate this construction, which originates in the wave equation,
or wave-like equations, surrounded by an exterior layer of damping material, to the context of
neural networks, in which we move from one layer to the next so as to generate a map that, out
of a cloud of very mixed data, is able to place the data corresponding to each label well
separated in the arrival space?
How would you extend this idea to that setting? Note that the phenomenon could be similar when
we advance, marching in k, which is the pseudo-time of our deep neural network, or in the real
time in case we decide to turn it into an ordinary differential equation and work with neural
differential equations. The phenomena could indeed be similar.
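As a companion sketch of that continuous-time option, again with controls and values chosen by me for illustration rather than taken from the lecture, the same dynamics can be marched in real time as a neural ODE, here with piecewise-constant controls integrated by scipy:

```python
import numpy as np
from scipy.integrate import solve_ivp

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Continuous-time counterpart of the residual network: a neural ODE
#   x'(t) = W(t) sigmoid(A(t) x(t) + b(t)),   t in [0, T],
# with controls taken piecewise constant in t (one value per "layer").
rng = np.random.default_rng(2)
d, L, T = 2, 5, 1.0
W = rng.standard_normal((L, d, d))
A = rng.standard_normal((L, d, d))
b = rng.standard_normal((L, d))

def rhs(t, x):
    k = min(int(t / T * L), L - 1)   # which piecewise-constant control is active at time t
    return W[k] @ sigmoid(A[k] @ x + b[k])

sol = solve_ivp(rhs, (0.0, T), np.array([1.0, -0.5]), rtol=1e-8)
print(sol.y[:, -1])                  # state at the final (real) time T
```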
For instance, we know that the wave equation oscillates inside the domain, but when the waves
reach an artificial wall they bounce back, generating spurious phenomena. That is why we decided
to continue the wave equation outside, let the waves get into the exterior frame, and add the
damping there so that they never come back. Of course, this idea can also be implemented in the
context of neural networks.
You could always say: no problem, I add one extra layer here and one extra layer there; I make
the network a bit wider. I design the coefficients for the wide neural network, but then
eventually I project back onto the original width. In this way, the fact that I am allowed to
add an extra layer above and below allows me, for instance, to introduce dissipative effects
there that will also have an impact on how the original components evolve.
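One naive way to make this concrete, and this is my own sketch under assumptions rather than a construction given in the lecture, is to pad the state with a few extra components, run the residual dynamics in the wider space while damping only those extra components, and project back onto the original d coordinates at the end:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def augmented_forward(x, params, d_extra=2, gamma=5.0, h=0.1):
    """Hypothetical width-augmented residual network.

    The d original components are padded with d_extra zeros (the 'exterior
    layer' of the state). Each step applies the residual update in the wider
    space and then damps only the extra components, mimicking the exterior
    damping layer of the wave equation. At the end we project back onto the
    original d coordinates.
    """
    d = x.shape[0]
    z = np.concatenate([x, np.zeros(d_extra)])   # widen the state
    damp = np.ones(d + d_extra)
    damp[d:] = np.exp(-gamma * h)                # dissipation on extra components only
    for Wk, Ak, bk in params:                    # controls designed for the wide network
        z = z + h * Wk @ sigmoid(Ak @ z + bk)
        z = damp * z
    return z[:d]                                 # project back to the original width

# toy usage: dimension d = 2 widened to 4, random controls for 5 layers
rng = np.random.default_rng(1)
d, d_extra, L = 2, 2, 5
params = [(rng.standard_normal((d + d_extra, d + d_extra)),
           rng.standard_normal((d + d_extra, d + d_extra)),
           rng.standard_normal(d + d_extra)) for _ in range(L)]
print(augmented_forward(np.array([1.0, -0.5]), params, d_extra=d_extra))
```

The factor exp(-gamma * h) plays the role of the viscous term of the exterior layer: it acts only on the padded components, so the information carried by the original d coordinates is kept while whatever leaks into the extra ones is progressively absorbed.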
S10: Momentum Neural ODE and Wave equation with viscous damping
Date: July 2024
Course: Control and Machine Learning
Lecturer: Prof. Enrique Zuazua
_
Check all details at: https://dcn.nat.fau.eu/course-control-machine-learning-zuazua/
TOPICS
S01: Introduction to Control Theory
S02: Introduction: Calculus of Variations, Controllability and Optimal Design
S03: Introduction: Optimization and Perspectives
S04: Finite-dimensional Control Systems (1)
S05: Finite-dimensional Control Systems (2) and Gradient-descent methods (1)
S06: Gradient-descent methods (2), Duality algorithms, and Controllability (1)
S07: Controllability (2)
S08: Neural transport equations and infinite-dimensional control systems
S09: Wave equation control systems
S10: Momentum Neural ODE and Wave equation with viscous damping
S11: Heat and wave equations: Control systems and Turnpike principle (1)
S12: Turnpike principle (2), Deep Neural and Collective-dynamics
_
Check all details at: https://dcn.nat.fau.eu/course-control-machine-learning-zuazua/