Lecture_07_1_Proof_MinNorm

Hi, we have to follow up with something that we have, well, not forgotten but delayed, which is the proof of Theorem 3.4, which was, just as a reminder, our main result about the minimum norm solution. So let A be an m times n matrix, and write A in terms of its SVD, A = U Sigma V transposed. Then the minimum norm solution u_mn of the equation A u = f is given by u_mn = A plus f, where A plus is the pseudoinverse, defined in the following way: A plus = V times Sigma plus times U transposed. Here Sigma plus is the diagonal matrix in which we invert all the diagonal elements of Sigma up to the point where they are still positive, and put zeros afterwards; that inverted diagonal block by itself is a square matrix, but we make it fit the corresponding dimensions by padding with zeros, either by appending zero columns or zero rows.
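To make this padding concrete, here is a small numpy sketch (the function name, the tolerance, and the use of numpy are my own choices for illustration, not something from the lecture) that builds Sigma plus from the singular values and assembles the pseudoinverse from the SVD:

import numpy as np

def pseudo_inverse(A, tol=1e-12):
    """V @ Sigma_plus @ U.T, built exactly as described above."""
    U, s, Vt = np.linalg.svd(A, full_matrices=True)   # A = U @ Sigma @ Vt
    m, n = A.shape
    s_plus = np.zeros_like(s)
    s_plus[s > tol] = 1.0 / s[s > tol]                # invert only the positive singular values
    Sigma_plus = np.zeros((n, m))                     # pad with zero rows/columns to fit n x m
    Sigma_plus[:len(s), :len(s)] = np.diag(s_plus)
    return Vt.T @ Sigma_plus @ U.T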

At this point, if you don't remember what the minimum norm solution is, just pause and go back one or two weeks to the lecture where we discussed it. The main idea is that solving A u = f exactly may not always be possible, so maybe we can't find a u which exactly matches the data, but we can look for least squares solutions. These are not unique, and among these least squares solutions we pick the element u with minimum norm; that is called the minimum norm solution. So it is the least squares approximation to a solution, and it has minimum norm, and the statement of the theorem is that we can explicitly calculate this minimum norm solution in this way.
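As a quick numerical sanity check of that statement (a sketch with random, rank-deficient data of my own, not an example from the lecture), the formula can be compared against numpy's built-in least squares routine, which also returns the minimum norm solution:

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3)) @ rng.standard_normal((3, 8))   # 6 x 8, rank 3
f = rng.standard_normal(6)

u_formula = np.linalg.pinv(A) @ f                   # A+ f = V Sigma+ U^T f
u_lstsq = np.linalg.lstsq(A, f, rcond=None)[0]      # minimum norm least squares solution
print(np.allclose(u_formula, u_lstsq))              # True, up to rounding error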

The proof goes like this. First we write the matrix V in terms of its columns v_1, ..., v_n, where the v_i form an orthonormal basis; this has to be true because V is an orthogonal matrix, and an orthogonal matrix always has the property that its columns and its rows are orthonormal vectors. Because the v_i are a basis of R^n, every u can be written as a sum of coefficients times v_i, and the idea here is to make this ansatz: write u in terms of the basis vectors v_i and look for the correct coefficients alpha_i instead. Looking for the right vector u amounts to finding the correct combination coefficients alpha_i; this is equivalent because the v_i are a basis. So the goal here is: find alpha_i such that u is the minimum norm solution.
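Written out, the ansatz is

\[
u = \sum_{i=1}^{n} \alpha_i v_i = V a, \qquad a = (\alpha_1, \dots, \alpha_n)^\top,
\]

and since V is orthogonal, the coefficients are recovered by a = V^\top u, so nothing is lost by switching from u to a.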

The next step is to write g = U transposed f. This is basically nothing but looking at this term and making it nicer: we can always write the data f in this form, because U is an invertible matrix and U transposed is also an invertible matrix, so we don't break anything, we just multiply the data by a suitable invertible matrix. You can think of it as passing to another basis representation; it doesn't really matter, it's just a shorter way of writing the data, so nothing really happens, it's just a reformulation of the same thing.
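In formulas, this step is just

\[
g := U^\top f, \qquad f = U U^\top f = U g,
\]

so the data f and the transformed data g carry exactly the same information.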

Now, what is a minimum norm solution? First we have to find alpha_i such that it is a least squares solution, and then we try to find the unique choice among those such that it has minimum norm. So let's first look at the first requirement, which is that we want u to minimize the misfit, the norm of A u minus f, squared. This we can write out: A is U times Sigma times V transposed, and u is in the ansatz form, which we can also write as the matrix V times a vector a, where a is the vector with entries alpha_1 to alpha_n. So we move the discussion from finding the correct vector u to finding the right coefficient vector a; this is kind of like a basis transformation. The vector u is maybe hard to find, but looking at it the right way, so looking at the right choice of basis, it's enough to discuss everything in terms of the coefficients; that's why we do this, we insert u = V a and only look for the correct coefficients a. For the other term, f is the same thing as U times U transposed times f, because U times U transposed is the identity matrix, because U is an orthogonal matrix, and U transposed f is g. So we can take U out from the left: we get U times (Sigma V transposed times V a minus g), and V transposed times V we can remove, so inside it's just Sigma times a minus g. And you may recall that the norm of an orthogonal matrix times a vector is the same as with the orthogonal matrix removed, because multiplying by an orthogonal matrix is just a rotation, so it doesn't change the norm; we can just remove it, and we end up with Sigma times a minus g, in the norm squared.
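Collected into one chain, the computation we just did reads

\[
\|A u - f\|^2 = \|U \Sigma V^\top V a - U U^\top f\|^2 = \|U(\Sigma a - g)\|^2 = \|\Sigma a - g\|^2,
\]

using V^\top V = I for the first step and the fact that the orthogonal matrix U preserves the norm for the last step.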

Now everything is nicer, because it has the same form as what
