7 - Course: Control and Machine Learning [ID:53630]

Okay, so good morning to all of you again.

The goal of today's lecture, continuing with what we started yesterday, is to explain how one can obtain this classical result, or results alike, on universal approximation, for which, as you may remember, we gave a proof in one of the first lectures based on the Hahn-Banach theorem. That was actually Cybenko's original proof of universal approximation: the fact that a combination of translated, dilated, and scaled sigmoids is able to generate an approximation of any function. You recall that the proof was obtained from Hahn-Banach, but the goal here is to explain how a similar conclusion can be achieved in a different manner, in the context of control systems.
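For context, the expressions involved in Cybenko's theorem are finite sums of translated, dilated, and scaled sigmoids of the form

    G(x) = \sum_{j=1}^{m} c_j \, \sigma( a_j \cdot x + b_j ),

and the theorem states that, for a sigmoidal activation \sigma, such sums are dense in the space of continuous functions on a compact set with respect to the uniform norm.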

As we said yesterday, the distinctive feature of the neural differential equations we are considering these days is that the nonlinearity of the equation is given by these kinds of sigmoid functions, which are not very typical in the context of mechanical systems, where we rather encounter polynomials, trigonometric functions, and so on. We will fix our attention in particular on this sigmoid function, the ReLU, which is globally Lipschitz but not smooth.

As we said, the most prototypical, simplest problem we could consider is the one in which we are simply trying to classify data. We said, well, we can reformulate that as a simultaneous control problem. Why do we say this is a simultaneous control problem? Well, rather than considering a one-layer neural network, as in the original work of Cybenko, let us consider a neural network with multiple layers and let us build it in an incremental manner.

This is what is called a residual neural network, in which we move from one configuration of the data to a different one. From layer k to layer k plus one we generate a discrete dynamical system that, starting from the initial configuration, maps the data into a new one in which we expect that they will be classified properly according to their labels. So we generate a discrete dynamical system in which, when h is a small parameter, each step is simply a small perturbation of the identity operator built out of the sigmoid function. But because we are allowed to choose the parameters a, b, and w, we enjoy all the possibilities that the sigmoid function allows: in the case of the ReLU, it is able to freeze half of the space while moving the other half linearly, in the direction we wish.
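In symbols, the residual step just described can be written (the exact placement of the parameters a_k, b_k, w_k may differ slightly from the slides) as

    x_{k+1} = x_k + h \, w_k \, \sigma( a_k \cdot x_k + b_k ), \qquad \sigma(s) = \max\{ s, 0 \},

where h is the step size and k indexes the layers. Since the ReLU vanishes on the half-space where a_k \cdot x + b_k \le 0 and is affine on its complement, each step indeed freezes one half of the space while moving the other.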

Then, in order to establish an even clearer link with the theory of differential equations, their dynamics and their control, we said that if in this time-discrete neural network we assume that h is small, then we are close to the regime of the neural differential equations written here.
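Schematically, and with the same notation, the neural differential equation referred to here reads

    \dot{x}(t) = w(t) \, \sigma( a(t) \cdot x(t) + b(t) ), \qquad t \in (0, T), \qquad x(0) = x_0,

so that the residual network above is recovered as its forward-Euler discretization with step size h.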

You see now that this is a classical non-autonomous Cauchy problem: it is non-autonomous because the nonlinearity depends not only on the state x but also on the time variable t. Moreover, sigma, even when we take the ReLU, is a Lipschitz function, so there is no problem in applying the Cauchy-Lipschitz theorem to obtain existence and uniqueness of the solution of the Cauchy problem. And contrary to the linear case, where we look at the classical controllability problem, in which you give me an initial datum and a target and I am supposed to build the control going from one to the other, in such a way that the control changes whenever you change the initial datum and the target, here we are facing a huge simultaneous control problem: I am supposed to build these differential equations so that, whenever I take the N different initial data that are to be classified and consider them as initial data of this neural differential equation, the control achieves the simultaneous goal of driving each of them to its corresponding destination.
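In other words, and stated schematically: given N distinct initial data x_1, ..., x_N with prescribed targets y_1, ..., y_N, the task is to find one single control (a(t), b(t), w(t)) on (0, T) such that the solutions of

    \dot{x}_i(t) = w(t) \, \sigma( a(t) \cdot x_i(t) + b(t) ), \qquad x_i(0) = x_i, \qquad i = 1, \dots, N,

satisfy x_i(T) = y_i (or reach the region associated with the label of x_i) for every i, with the same control driving all trajectories simultaneously.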

And as I said, in the context of the neural differential equations, where the time-discrete dynamics becomes a time-continuous dynamics, the controls b(t), a(t), and w(t) now depend continuously on time. In particular, we consider controls which are in L1, L2, or L-infinity with respect to time. As I said before, because sigma is globally Lipschitz, this problem is well posed. And this is the point of view that...

Part of a chapter: S07 Controllability (2)

Accessible via: Open Access

Duration: 02:50:06 min

Recording date: 2024-07-07

Uploaded on: 2024-08-07 23:33:50

Language: en-US

S07: Controllability (2)

Date: July 2024
Course: Control and Machine Learning
Lecturer: Prof. Enrique Zuazua

_

Check all details at: https://dcn.nat.fau.eu/course-control-machine-learning-zuazua/

TOPICS

S01: Introduction to Control Theory

S02: Introduction: Calculus of Variations, Controllability and Optimal Design

S03: Introduction: Optimization and Perspectives

S04: Finite-dimensional Control Systems (1)

S05: Finite-dimensional Control Systems (2) and Gradient-descent methods (1)

S06: Gradient-descent methods (2), Duality algorithms, and Controllability (1)

S07: Controllability (2)

S08: Neural transport equations and infinite-dimensional control systems

S09: Wave equation control systems

S10: Momentum Neural ODE and Wave equation with viscous damping

S11: Heat and wave equations: Control systems and Turnpike principle (1)

S12: Turnpike principle (2), Deep Neural and Collective-dynamics


Tags

FAU control mathematics machine learning Mathematik Applied Mathematics Turnpike control theory FAU MoD FAU DCN-AvH Chair for Dynamics, Control, Machine Learning and Numerics (AvH Professorship)