In the last lecture, we introduced an important idea, the concept of turnpike. Let me remind you that it emphasizes the fact that when the time horizon is very long, very often the optimal strategy is essentially of a steady-state nature: when we are moving from one place to another, when we are controlling a system from an initial state to a final destination over a long time horizon, the best we can do is to go from the initial datum to the optimal steady state, stay at the steady state, and then jump off again. It is as if you go from your university in Changchun to the Academy of Sciences in Beijing: probably what you will do is take the fast train, which takes two or three hours, and then the control will consist of moving from your office to the train station in Changchun. So there is an initial arc which is active, this is what we are highlighting here, an initial arc in which the control system has to be very active to get there in a short time; the steady control will be the fast train, where you sit for three hours while the train drives you close to the final destination, not exactly to the final destination, because you will arrive at the Beijing train station and then you will jump off. This was the key observation we made.
In practical applications, what this means is the following: if you are working on long time intervals, computing the control over the whole long time interval would be very expensive. Remember that, in order to compute an optimal control by applying a gradient descent algorithm, we have to implement an iterative numerical methodology in which we first solve the adjoint equation, take the control that the adjoint equation gives, plug it into the state equation, then solve the forward state equation, check the terminal condition, and compute the residual, meaning the distance of the solution to the target that you want to minimize or drive to zero. With this residual we adjust the data of the adjoint equation, solve the backward adjoint equation again, plug the result into the forward equation, and so on. Therefore, at each step of the gradient iteration, you are solving the adjoint equation and the state equation fully over the whole time horizon.
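As an illustration, here is a minimal sketch, in Python, of this adjoint-based gradient iteration for a toy linear-quadratic problem; the matrices A and B, the target, the penalty parameter beta, the step size and the iteration count are hypothetical choices made for this example, not data from the lecture.

```python
import numpy as np

# Minimal sketch of the adjoint-based gradient iteration for a toy
# linear-quadratic problem (all data below are illustrative):
#   minimize  J(u) = 1/2 int_0^T |u(t)|^2 dt + beta/2 |x(T) - x_target|^2
#   subject to  x'(t) = A x(t) + B u(t),  x(0) = x0.

A = np.array([[0.0, 1.0], [-1.0, 0.0]])   # state matrix (example)
B = np.array([[0.0], [1.0]])              # control matrix (example)
x0 = np.array([1.0, 0.0])                 # initial datum
x_target = np.array([0.0, 0.0])           # final target
T, N = 10.0, 1000                         # time horizon and number of steps
dt = T / N
beta, step, iters = 10.0, 0.01, 500       # penalty, gradient step, iterations

u = np.zeros((N, 1))                      # initial guess for the control

for k in range(iters):
    # 1) Solve the forward state equation with the current control (explicit Euler).
    x = np.zeros((N + 1, 2))
    x[0] = x0
    for n in range(N):
        x[n + 1] = x[n] + dt * (A @ x[n] + B @ u[n])

    # 2) Check the terminal condition: the residual is the distance to the target.
    residual = x[N] - x_target

    # 3) Solve the backward adjoint equation p' = -A^T p with p(T) = beta * residual.
    p = np.zeros((N + 1, 2))
    p[N] = beta * residual
    for n in range(N, 0, -1):
        p[n - 1] = p[n] + dt * (A.T @ p[n])

    # 4) The gradient of J with respect to u is u + B^T p; descend along it.
    u -= step * (u + p[:N] @ B)

print("distance to target after", iters, "iterations:", np.linalg.norm(residual))
```

Note that each iteration solves both the forward and the backward equation over the whole horizon [0, T], which is precisely the cost that becomes prohibitive on long horizons.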
When the time horizon is very long, this becomes computationally very expensive, and in particular it leads to substantial errors: numerical algorithms converge, but the error estimates for the convergence of the numerical scheme are global in time and grow when the time interval is very long, which obliges you, on long time horizons, to take an extremely small time step.
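To see why, recall the classical global error estimate for, say, the explicit Euler scheme applied to x' = f(x), with f Lipschitz of constant L (a generic illustration, not a bound stated in the lecture):

\[
\max_{0 \le n \le N} |x(t_n) - x_n| \;\le\; C \,\Delta t\,\big(e^{L T} - 1\big),
\]

where the constant C depends on the solution but not on the time step. To keep a fixed accuracy over [0, T], the time step \(\Delta t\) must therefore shrink as the horizon T grows, and with it the number of steps of each forward or backward solve increases.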
But if the time horizon is long and the time step in the numerical algorithm is very, very small, then solving each of these equations, the backward adjoint and the forward state, has a very substantial computational cost. In practice this can even be impossible for, say, the large-dimensional control systems appearing in applications.
This is why the philosophy that emerges out of the turnpike principle is that, rather than controlling the system actively in a time-dependent manner, you first compute the steady-state optimal control, in which time has been chopped off. You implement this steady optimal control on most of the time interval, and then you simply solve two short-time controllability problems, at the beginning and at the end, just to make sure that you link the initial state to the steady state and, eventually, the steady state to the final target.
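To fix notation, here is one schematic linear-quadratic setting in which this decomposition can be written down; the system x' = Ax + Bu, the running target x_d and the quadratic costs are illustrative choices, not data fixed in the lecture:

\[
\min_{u}\; J_T(u) = \frac{1}{2}\int_0^T \big(|u(t)|^2 + |x(t) - x_d|^2\big)\,dt,
\qquad x'(t) = A x(t) + B u(t), \quad x(0) = x_0,
\]
\[
\min_{u_s}\; J_s(u_s) = \frac{1}{2}\big(|u_s|^2 + |x_s - x_d|^2\big),
\qquad A x_s + B u_s = 0.
\]

The turnpike strategy keeps the steady pair (x_s, u_s) on most of [0, T], together with two short transient controls: one steering x_0 towards x_s near t = 0, and one steering x_s towards the final target near t = T.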
So this is the turnpike philosophy that emerges out of this analysis. Of course, this only makes sense when time is very long, because, as I said, when time is very short you need to act in a very singular manner, with either very large or oscillatory controls, or with Dirac delta controls, impulsive controls, and there is no room to take such a turnpike path. But oftentimes in applications, as I said, the time we are given is long, long for the process under consideration, and therefore this turnpike strategy is the most convenient one.
Now, when are the controls obtained through the turnpike principle optimal? The important remark we made was that these controls are typically optimal when the cost functional penalizes both the control, requiring it to be of small norm, and the whole trajectory.
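For such functionals, penalizing both the control and the state along the whole trajectory, the turnpike property typically holds in a quantitative, exponential form: under suitable controllability or stabilizability assumptions on the system, there exist constants C, \(\mu > 0\), independent of T, such that

\[
|x(t) - x_s| + |u(t) - u_s| \;\le\; C\big(e^{-\mu t} + e^{-\mu (T - t)}\big), \qquad t \in [0, T],
\]

so the optimal trajectory and control remain exponentially close to the steady ones except in two short boundary layers near t = 0 and t = T, which is exactly what makes the three-phase strategy described above nearly optimal.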
And as I said, in principle you could argue that the whole trajectory should not be penalized, because you say: okay, I wanted to go from Changchun to Beijing,
S12: Turnpike principle (2), Deep Neural and Collective-dynamics
Date: July 2024
Course: Control and Machine Learning
Lecturer: Prof. Enrique Zuazua
_
Check all details at: https://dcn.nat.fau.eu/course-control-machine-learning-zuazua/
TOPICS
S01: Introduction to Control Theory
S02: Introduction: Calculus of Variations, Controllability and Optimal Design
S03: Introduction: Optimization and Perspectives
S04: Finite-dimensional Control Systems (1)
S05: Finite-dimensional Control Systems (2) and Gradient-descent methods (1)
S06: Gradient-descent methods (2), Duality algorithms, and Controllability (1)
S07: Controllability (2)
S08: Neural transport equations and infinite-dimensional control systems
S09: Wave equation control systems
S10: Momentum Neural ODE and Wave equation with viscous damping
S11: Heat and wave equations: Control systems and Turnpike principle (1)
S12: Turnpike principle (2), Deep Neural and Collective-dynamics