Hi, we have to follow up with something that we have not forgotten but delayed, which is the proof of Theorem 3.4, which was, just as a reminder, our main result about the minimum norm solution. So let A be an m times n matrix, and write A in terms of its SVD, A = U Sigma V^T. Then the minimum norm solution u_mn(f) of the equation Au = f is given by u_mn(f) = A^+ f, where A^+ is the pseudoinverse, defined as A^+ = V Sigma^+ U^T. Here Sigma^+ is the diagonal matrix in which we invert the diagonal elements of Sigma as long as they are still positive and put zeros afterwards; strictly speaking that gives a square diagonal matrix, and we make it fit the corresponding dimensions by padding with zeros, either by appending zero columns or zero rows.
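To make this concrete, here is a small NumPy sketch; the example matrix, its dimensions, and the tolerance are made up purely for illustration, and np.linalg.pinv computes the same pseudoinverse in a single call.

    import numpy as np

    # Hypothetical example: a 4x3 matrix A and data vector f, chosen only for illustration.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 3))
    f = rng.standard_normal(4)

    # Full SVD: A = U @ diag(s) @ Vt, with U of size m x m and Vt of size n x n.
    U, s, Vt = np.linalg.svd(A, full_matrices=True)

    # Sigma^+: invert the singular values that are still positive (above a tolerance),
    # put zeros afterwards, and pad to size n x m so that V @ Sigma^+ @ U^T fits.
    tol = max(A.shape) * np.finfo(float).eps * s.max()
    s_plus = np.where(s > tol, 1.0 / np.maximum(s, tol), 0.0)
    Sigma_plus = np.zeros((A.shape[1], A.shape[0]))
    np.fill_diagonal(Sigma_plus, s_plus)

    A_plus = Vt.T @ Sigma_plus @ U.T   # pseudoinverse A^+ = V Sigma^+ U^T
    u_mn = A_plus @ f                  # minimum norm solution u_mn(f) = A^+ f

    # Sanity check against NumPy's built-in pseudoinverse.
    assert np.allclose(A_plus, np.linalg.pinv(A))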
At this point, if you don't remember what the minimum norm solution is, just pause and go back one or two weeks to the lecture where we discussed it. The main idea is that solving Au = f exactly may not always be possible, so maybe we can't find a u which exactly matches the data, but we can look for least squares solutions, and these are not unique; out of this set of least squares solutions we pick the element u with minimum norm, and that is called the minimum norm solution. So it is the least squares approximation to a solution, and among all such approximations it has minimum norm. The statement of the theorem is that we can explicitly calculate this minimum norm solution in the way written above.
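In symbols, the definition and the claim of the theorem read:

\[
u_{mn}(f) \;=\; \arg\min \big\{\, \|u\| \;:\; u \text{ minimizes } \|Au - f\|^{2} \,\big\},
\qquad
u_{mn}(f) \;=\; A^{+} f \;=\; V \Sigma^{+} U^{T} f .
\]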
The proof goes like this. First we write the matrix V in terms of its columns v_1, ..., v_n, where the v_i are an orthonormal basis. This has to be true because V is an orthogonal matrix, and an orthogonal matrix always has the property that its columns and its rows are orthonormal vectors. Because the v_i are a basis of R^n, every u can be written as a sum of coefficients times v_i. The idea here is to make this ansatz: write u in terms of the basis vectors v_i and look for the correct coefficients alpha_i instead, so looking for the right vector u amounts to finding the correct combination coefficients alpha_i, and this is equivalent because the v_i are a basis. So the goal here is: find alpha_i such that u is the minimum norm solution.
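Written out, and already using the matrix form that comes up again in a moment, the ansatz is

\[
u \;=\; \sum_{i=1}^{n} \alpha_i \, v_i \;=\; V a ,
\qquad
a = (\alpha_1, \dots, \alpha_n)^{T} .
\]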
The next step is to write g = U^T f. This is basically nothing but looking at the data term and making it nicer: we can always write the data f in this form, because U is an invertible matrix, and U^T also is an invertible matrix, so we don't break anything, we just multiply the data by a suitable invertible matrix. You can think of it as passing to another basis representation; it doesn't really matter, it's just a quicker way of writing the data, so nothing really happens, it's just a reformulation of the same thing.
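In formulas:

\[
g \;:=\; U^{T} f ,
\qquad\text{equivalently}\qquad
f \;=\; U g ,
\quad\text{since } U U^{T} = U^{T} U = I .
\]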
Now, what is a minimum norm solution? First we have to find alpha_i such that u is a least squares solution, and then among those we try to find the unique choice with minimum norm. So let's first look at the first requisite, which is that we want u to minimize the misfit ||Au - f||^2. Here we can write A as U Sigma V^T, and u in the ansatz form as the sum of the alpha_i times v_i, which we can also write as V times a vector a, where a is the vector with entries alpha_1 to alpha_n. So we move the discussion from finding the correct vector u to finding the right coefficient vector a; this is kind of like a basis transformation: u is maybe hard to find directly, but looked at the right way, in the right choice of basis, it's enough to discuss everything in terms of the coefficients a. That's why we insert u = Va and only look for the correct coefficient vector a. For the data term, f is the same thing as U times U^T times f, because U U^T is the identity matrix, since U is an orthogonal matrix, and U^T f is g. So we can take U out from the left; inside, V^T V is also the identity and can be removed, so what remains is Sigma times a minus g, still multiplied by U from the left. You may recall that the norm of an orthogonal matrix times something is the same as the norm without the orthogonal matrix, because an orthogonal matrix is just a rotation in R^m, so it doesn't change the norm, and we can remove it. We end up with ||Sigma a - g||^2. Now everything is nicer, because it has the same form as the original misfit, only with the diagonal matrix Sigma in place of A; the whole chain of equalities is written out below.
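Collecting the steps just described into one chain of equalities:

\[
\|A u - f\|^{2}
\;=\; \big\| U \Sigma V^{T} (V a) \;-\; U U^{T} f \big\|^{2}
\;=\; \big\| U \big( \Sigma a - g \big) \big\|^{2}
\;=\; \|\Sigma a - g\|^{2} .
\]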