Okay, so thank you very much for the invitation, and I would also like to welcome everyone to the One World Seminar. It's a good way not to travel too much. Okay, so there is a representer theorem in the title. You all know what machine learning and inverse problems are, and a representer theorem really refers to a way of representing the solution of a certain class of optimization problems.
Okay, so now, to set the stage, we have the variational formulation of inverse problems, which is the standard way of formulating this kind of problem mathematically. So you have some unknown object that is seen through an integral operator (a microscope, an MRI); you have sources of noise, this produces blurring, and then the problem is: given those measurements, which are usually discrete, how can we recover the unknown, here for instance the concentration of your source in 3D? And so the usual way is to state that as an optimization problem, and what you are doing is enforcing consistency between, let's say, the simulated measurements of your reconstruction and your actual measurements y. And of course, because the problem is usually ill-posed, you are also imposing some regularization, for example by putting a penalty on the L1 or L2 norm of some operator applied to your signal.
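To fix ideas, one generic way of writing this kind of functional is sketched below; the notation is only illustrative, with H standing for the forward operator, y for the discrete noisy measurements, L for a regularization operator, and λ > 0 for a trade-off weight:

\[
  \min_{f} \; \| y - \mathrm{H} f \|_2^2 \;+\; \lambda \, \| \mathrm{L} f \|_p^p,
  \qquad p \in \{1, 2\},
\]

where the first term enforces consistency with the measurements and the second is the regularization penalty.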
Okay, so that's inverse problems. But in fact learning (there was machine learning in the title) is also a linear inverse problem, but an infinite-dimensional one. And here the situation is similar: you are given a series of data points, so the x_m are some data points and the y_m are the outcomes that you associate with a given pattern. Now you want to find a function from R^N to R such that the function applied to a data point x_m predicts the corresponding y_m. Again, this is in principle an ill-posed problem, and so early on people reformulated it using regularization theory. So again you introduce an energy functional, for example the L2 norm of some operator, such as the Laplacian, applied to your function f, and you try to satisfy the data-consistency constraint, namely that f evaluated at the data points should be reasonably close to the corresponding y_m, while minimizing this regularization term. And you can also reframe this as a constrained least-squares fit, using Lagrange multipliers, and you end up with this kind of problem.
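In symbols, and only as a sketch of the kind of functional being described (the quadratic data term and the choice of regularization operator L, for instance the Laplacian, are illustrative), this reads

\[
  \min_{f} \; \sum_{m=1}^{M} \bigl( y_m - f(x_m) \bigr)^2 \;+\; \lambda \, \| \mathrm{L} f \|_2^2,
\]

with the first term enforcing consistency with the data and the second penalizing undesirable (for example, rough) solutions.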
And now, if you solve that kind of problem using the theory of reproducing kernel Hilbert spaces, you find, remarkably, that the solution is a kernel estimator, and this is very much the foundation of all of classical machine learning. And here is this famous representer theorem for machine learning; what does it tell us? It tells us that the minimizer of that functional over a certain Hilbert space H (and I remind you that this is an infinite-dimensional problem, because f is just a function in that Hilbert space) is, rather remarkably, a linear combination of kernels. Okay, so those kernels are these functions r_H, which depend on x, the free variable, and on x_m, the location of your data points.
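Written out, and hedging on the exact constants and normalization, the statement is that the minimizer admits an expansion of the form

\[
  f^{\star}(x) \;=\; \sum_{m=1}^{M} a_m \, r_{\mathcal{H}}(x, x_m),
\]

so the infinite-dimensional problem reduces to finding the M coefficients a_m; for the quadratic data term and quadratic regularization above, these are obtained by solving a finite M-by-M linear system built from the Gram matrix with entries r_H(x_m, x_k).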
And now, what is this r_H? It is the so-called reproducing kernel of a reproducing kernel Hilbert space. Such spaces are characterized by the fact that there exists a single kernel, a map from R^d × R^d into R (with x in R^d), with two properties: first, if you fix one of the variables, the resulting function of the other variable lies in the Hilbert space; second, the reproducing property, namely that if you fix the kernel at a particular location, leave the other variable free, and take the inner product with a function f, this samples the function at that location.
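As a reminder of the standard definition (this is classical material, stated here only for reference), the two properties can be written as

\[
  r_{\mathcal{H}}(\cdot, x_0) \in \mathcal{H} \quad \text{for every } x_0 \in \mathbb{R}^d,
  \qquad
  \bigl\langle f, \, r_{\mathcal{H}}(\cdot, x_0) \bigr\rangle_{\mathcal{H}} = f(x_0) \quad \text{for all } f \in \mathcal{H}.
\]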
Regularization addresses the ill-posedness of the training problem in machine learning or the reconstruction of a signal from a limited number of measurements. The standard strategy consists in augmenting the original cost functional by an energy that penalizes solutions with undesirable behaviour. In this presentation, I will present a general representer theorem that characterizes the solutions of a remarkably broad class of optimization problems in Banach spaces and helps us understand the effect of regularization. I will then use the theorem to retrieve some classical characterizations such as the celebrated representer theorem of machine learning for RKHS, Tikhonov regularization, representer theorems for sparsity promoting functionals, as well as a few new ones, including a result for deep neural networks.