14 - Neural Differential Equations, Control and Machine Learning [ID:32784]

Okay, thank you. Thank you, Daniel, for this very kind invitation.

So I prepared this material more as a possible basis for discussion than as a lecture, actually. I myself try to learn about so many things that I have probably missed for too many years. So let me first

remind you of some very basic facts about control, and I will do it from a historical perspective, a little bit, because I like this quote very much; probably you have seen me say the same thing at some other meetings.

There is a beautiful book by Hairer and Wanner, Analysis by Its History.

This is Hairer the father, right.

So this book is a kind of calculus book where they explain how calculus was developed, from ancient times to nowadays: how people were developing all kinds of geometric and analytical

methods, for instance in order to compute, say, decimals of the number pi. It is an amazing book, even with illustrations taken from the original documents. And on the front page, or rather in the introduction to the book,

they make the following statement, with which I identify very much, because I remember my first day as a bachelor student at the faculty in Bilbao, back in '79.

I remember the professor of the analysis course, I mean calculus, started the lecture talking about Dedekind cuts, and immediately after the lecture I called my older brother saying, Alberto, I think I made the wrong choice

and I should change university, because I did not understand a single word; it had nothing to do with the kind of mathematics I was expecting at university, which I imagined to be close to the classical calculus and combinatorial

problems we were used to doing in high school. And then the book explains why that happens. It says: well, we usually teach analysis from the present towards the past, in a backward sense. So we introduce the theory of sets, logic and so on,

to later explain what the limit is, of course the epsilon-delta limit; and then once you have the limit you have continuity and derivatives, and the chapter on integration comes last.

And then the book explains why all this feels so unnatural the first time you see it presented that way: because this is the opposite of the order in which mathematics was developed. Of course, integrals

came much before derivatives, and derivatives came much before there was a rigorous definition of limits. Okay, so in that sense, recently

I also found a very interesting article by Alexander Fradkov, from 2020, where he talks about the early history of machine learning.

And there you see how, at least from the perspective of Fradkov,

there are other possible readings of what the origin of machine learning was; but you see that, indeed, the initial steps of machine learning are intimately related to control theory.

The term cybernetics was introduced by the French physicist Ampère long ago, and was forgotten for a while; then Norbert Wiener wrote his celebrated book, and he defined cybernetics as the science

of control and communication in the animal and the machine. And there you see two interesting associations: one is control and communication, meaning actuators and sensors, and the other is animals and machines.

Of course, when we talk about cybernetics nowadays, we might think of some kind of robot. From a control perspective, just to give you a very brief idea of how problems can look and what kind of answers one can expect from a mathematical

perspective, this is probably one of the most fundamental and easiest results to explain: the case of a linear finite-dimensional dynamical system. So you have a vector x of unknowns, x_1 up to x_n, where n is the number of components

of the system, and n may be very large; it could be, for instance, the temperature in every office of a large building on our university campus.

Okay, so it could be a very large vector, and there is a dynamics in the building: the spontaneous transfer and diffusion of heat within the building.

So we are regulating this multi-component temperature vector through some heating devices, some controls u(t). And normally, of course, the number of controls will be much smaller than the number of components of the system; it would not be very

ambitious to try to control the system using as many components for the control as there are components in the system. So typically the control u will have a small number of components, m, with m much less than the number of components n

of the system. It could even be that you have just one control with which to steer a system of dimension 1000. And controlling means simply whether you can drive the system, with a suitable choice of a

time-dependent control, from an initial configuration x^0 to a final configuration: it could be, say, 19 degrees in every room of the building, a constant, steady vector as a target. But in general,

the problem of control is just whether you are able to drive the system from any initial configuration to any terminal one.
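In formulas, a minimal sketch of the setting being described, using standard notation for this kind of problem:

```latex
\dot{x}(t) = A\,x(t) + B\,u(t), \quad t \in (0,T), \qquad x(0) = x^0,
```

where the state x(t) in R^n collects the n components (the office temperatures), A is an n-by-n matrix modeling the internal dynamics (the heat exchange), B is an n-by-m matrix locating the m actuators, and u(t) in R^m is the control. Controllability in time T asks: for every pair x^0, x^1 in R^n, is there a control u such that x(T) = x^1?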

And the answer that Kalman gave, back in the 1950s, was: you can do that if and only if the rank of the block-column matrix [B, AB, ..., A^{n-1}B] is equal to n.

So it's a purely algebraic problem.

If the answer is yes, that is, if the rank of this matrix is n, then you can control the system, and you can do it in any time. Of course, if the time is very small, it will take controls of very large size;

if the time is longer, you can expect the controls to be milder.
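Since the test is purely algebraic, it is easy to check numerically. Here is a minimal sketch (the function name kalman_rank and the toy matrices are illustrative choices of mine, not from the talk), assuming NumPy:

```python
import numpy as np

def kalman_rank(A, B):
    """Rank of the Kalman controllability matrix [B, AB, ..., A^(n-1)B]."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])  # next block: A times the previous one
    K = np.hstack(blocks)              # n x (n*m) controllability matrix
    return np.linalg.matrix_rank(K)

# Toy example: a 2-dimensional system with a single scalar control,
# so B is one column and [B, AB] must be a nonsingular 2 x 2 matrix.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
print(kalman_rank(A, B) == A.shape[0])  # True: the pair (A, B) is controllable
```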

Whether you can do it or not depends only on this purely algebraic problem. So you see, for instance, when you are using just one control, u(t) is a scalar: it is just the intensity of some actuator; you can only regulate the intensity of one actuator, and u(t)

is just a scalar function. So capital B is a column vector, and then, A being an n-by-n matrix, AB is also another column vector. So you have to choose this control device B in a strategic manner, because for the rank of this matrix to be n,

this matrix, in that extreme case in which you have only one control, is an n-by-n matrix, so you have only one chance: the determinant of this big matrix has to be different from zero.

But when you have, for instance, two controls, then B has two columns, so this matrix has 2n columns, and therefore you have many more chances, because the rank being n simply means that,

choosing n out of these 2n columns, you are able to make the rank full. So what is this doing? What is this B, AB doing? Well, in fact,

here we are representing what actually happens in control systems, right.

This is what we do all the time: we are doing and undoing the same kind of operations, but we do it in a smart manner, so that the arrival point is not necessarily the same.

I mean, what in many cases would be considered an error is, from the control perspective, precisely the opportunity we try to exploit in order to drive the system from an initial configuration to a final one.

And the proof can easily be seen through the variation of constants formula. You ask: what is the solution of the ODE out of x^0, driven by the matrix A with a control u? By the variation of constants formula,

the first term is the free dynamics; there is nothing you can do about it.

Given x^0 and given the dynamics A, it will lead you to some point. But you are lucky enough to be able to manipulate the control.

When you expand the matrix exponential in the variation of constants formula as a power series, you will see the powers of A appearing.

And then, because of these powers of A, you use the Cayley-Hamilton theorem: you know by Cayley-Hamilton that all powers of A, from the power n on, are linear combinations of the lower powers.

And this means that in this integral term, effectively only the powers up to n-1 appear. And this is why, in the Kalman rank condition, things can be written that way: you could add more terms here, but you know by

Cayley-Hamilton that you are not really adding to the rank. So whatever you can do with the system, you can see it in this finite-dimensional matrix.
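Written out in standard notation, the argument sketched here reads:

```latex
x(T) = e^{TA} x^0 + \int_0^T e^{(T-s)A} B\, u(s)\, ds,
\qquad
e^{(T-s)A} = \sum_{k \ge 0} \frac{(T-s)^k}{k!}\, A^k .
```

By Cayley-Hamilton, A satisfies its own characteristic polynomial, so A^n, and hence every higher power, is a linear combination of I, A, ..., A^{n-1}. Therefore every reachable increment x(T) - e^{TA} x^0 lies in the column space of [B, AB, ..., A^{n-1}B], and that column space is all of R^n exactly when the Kalman rank condition holds.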

This is explained, for instance, in the beautiful book by Sontag, one of the classics in control theory.

He refers to this Nelson car. I don't know who Nelson was, but this is a very simplified car, just a rectangle, described by a four-parameter system.

You have two parameters for the center of gravity, one for the angle of the main axis, and one for the angle of the front wheels with respect to the main axis. So this is a four-dimensional system, and you have only two controls.
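A standard kinematic model consistent with this description is the following (this particular parametrization, with unit wheelbase, is my assumption for illustration; Sontag's book states its own version):

```latex
\dot{x} = u_1 \cos\theta, \qquad
\dot{y} = u_1 \sin\theta, \qquad
\dot{\theta} = u_1 \tan\varphi, \qquad
\dot{\varphi} = u_2,
```

with state (x, y, theta, phi): the position of the car, the angle of the main axis, and the angle of the front wheels relative to that axis; the two controls are u_1 (driving speed) and u_2 (steering rate). Note that this system is nonlinear, so the Kalman rank condition above does not apply directly; the example illustrates how few controls can still steer a higher-dimensional state.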

Part of a video series:

Accessible via: Open Access

Duration: 00:57:53 min

Recording date: 2021-05-11

Uploaded on: 2021-05-13 13:37:50

Language: en-US

Enrique Zuazua (FAU Erlangen-Nürnberg) on "Neural Differential Equations, Control and Machine Learning"

We discuss Neural Ordinary Differential Equations (NODEs) from a control theoretical perspective to address some of the main challenges in Machine Learning and, in particular, data classification and Universal Approximation. More precisely, we adopt the perspective of the simultaneous control of systems of NODEs. For instance, in the context of classification, each item to be classified corresponds to a different initial datum for the Cauchy problem of the NODE. And all the solutions corresponding to the data under consideration need to be driven to the corresponding targets by means of the same control. We present a genuinely nonlinear and constructive method, which allows us to estimate the complexity of the control strategies we develop. The very nonlinear nature of the activation functions governing the nonlinear dynamics of the NODEs under consideration plays a key role. It allows deforming half of the phase space while the other half remains invariant, a property that classical models in mechanics do not fulfill. This very property allows us to build elementary controls inducing specific dynamics and transformations whose concatenation, along with properly chosen hyperplanes, allows us to achieve our goals in finitely many steps. We also present the counterparts in the context of the control of neural transport equations, establishing a link between optimal transport and deep neural networks.

This is joint work with Domènec Ruiz-Balet.
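The key property mentioned in the abstract is easy to see in a toy computation: with a ReLU activation, the vector field vanishes on one half-space, which therefore stays invariant while the other half is deformed. A minimal sketch (the toy parameters and function names are my own, not from the paper):

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)

def flow_step(x, a, b, w, dt=0.1):
    """One explicit Euler step of the toy NODE field x' = w * relu(a.x + b)."""
    return x + dt * w * relu(a @ x + b)

a = np.array([1.0, 0.0])  # hyperplane normal: with b = 0, splits the plane at x_1 = 0
b = 0.0
w = np.array([0.0, 1.0])  # push points upward, but only on the active half-space

left  = np.array([-1.0, 0.5])  # a.x + b <= 0: the field vanishes here
right = np.array([ 2.0, 0.5])  # a.x + b  > 0: this point gets moved
print(flow_step(left, a, b, w))   # [-1.   0.5] -> invariant half
print(flow_step(right, a, b, w))  # [ 2.   0.7] -> deformed half
```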
