Yes, so I've designed it to be slow-paced, so you can essentially interrupt me at any
time. So this talk is going to be an invitation to explore the role of kernel methods in learning
and solving differential equations. Now before I get into the main topic of this talk, I will
give a short reminder on scalar-valued kernels to make sure that everyone is on the same page.
I will do so in the setting of the following interpolation problem in which you try to
approximate an unknown target function f dagger mapping some input space x to the real line, given
that f dagger of capital X is equal to capital Y. Here I write capital X and capital Y for the n-dimensional vectors whose entries
are the input and output data points, and I write f dagger of capital X for the n-dimensional vector
whose entries are the images of the input data points under f dagger.
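In symbols, writing $\mathcal{X}$ for the input space (the notation here is a reconstruction, since the slides are not reproduced), the problem is to approximate
\[
f^\dagger \colon \mathcal{X} \to \mathbb{R}, \qquad \text{given } f^\dagger(X) = Y,
\]
where $X = (x_1,\dots,x_n) \in \mathcal{X}^n$, $Y = (y_1,\dots,y_n) \in \mathbb{R}^n$, and $f^\dagger(X) := \big(f^\dagger(x_1),\dots,f^\dagger(x_n)\big)$.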
Now there are four equivalent ways to describe kernel-based solutions to this problem
and these are based on the following four equivalent ways of defining such kernels.
The first one is to define a scalar-valued kernel as a function k mapping the input space x times x
to the real line such that, for all integers m and all points x1, ..., xm in the input space,
the m-by-m matrix with the following entries is symmetric and positive semi-definite.
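Spelled out, the condition on the slide is presumably the standard one: for all $m \in \mathbb{N}$, all points $x_1,\dots,x_m \in \mathcal{X}$, and all coefficients $c \in \mathbb{R}^m$,
\[
\Theta_{i,j} = k(x_i, x_j), \qquad \Theta = \Theta^{T}, \qquad c^{T} \Theta\, c = \sum_{i,j=1}^{m} c_i c_j\, k(x_i, x_j) \ge 0 .
\]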
Now equivalently, k is a scalar-valued kernel if and only if there exists a Hilbert space F,
known as a feature space, and a function psi mapping x to F, known as a feature map, such that k of x, x prime
is equal to the inner product in the space F between psi of x and psi of x prime.
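That is, k admits a factorization through the feature space (written here in the standard form):
\[
k(x, x') = \big\langle \psi(x), \psi(x') \big\rangle_{\mathcal{F}}, \qquad \psi \colon \mathcal{X} \to \mathcal{F}.
\]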
Now this is equivalent to the existence of a Hilbert space H of functions mapping the input space x
to the real line such that, for every f in H, f of x is equal to the inner product in the space H between
f and the function defined by the kernel k supported at the point x. The norm associated
with the inner product in H is known as the RKHS norm associated with the kernel k,
and we will also write it as this k norm for ease of notation.
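Written out, this is the usual reproducing property of the RKHS $\mathcal{H}$ (my notation):
\[
f(x) = \big\langle f, k(x, \cdot) \big\rangle_{\mathcal{H}} \quad \text{for all } f \in \mathcal{H},\ x \in \mathcal{X}, \qquad \|f\|_{k} := \sqrt{\langle f, f \rangle_{\mathcal{H}}}.
\]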
Now this is equivalent to the existence of a centered Gaussian process xi such that k is the covariance function of that Gaussian
process. Recall that xi is a function mapping the input space x to a Gaussian space, which is a Hilbert space of
scalar-valued Gaussian random variables, and that the covariance function of xi is the function
obtained by computing the covariance between xi at x and xi at x prime for all x and x prime.
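In symbols, with $\xi$ denoting the centered Gaussian process:
\[
k(x, x') = \operatorname{Cov}\big(\xi(x), \xi(x')\big) = \mathbb{E}\big[\xi(x)\,\xi(x')\big] \quad \text{for all } x, x' \in \mathcal{X}.
\]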
Now these four equivalent definitions lead to four equivalent solutions to our interpolation
problem. The first one is to approximate the target function f dagger with the following
function f, where k is our given kernel, k of capital X, capital X is the n-by-n matrix with
the following entries, and k of little x, capital X is the n-dimensional vector with the following entries.
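The formula referred to on the slide is presumably the standard kernel interpolant
\[
f(x) = k(x, X)\, k(X, X)^{-1} Y, \qquad k(X, X)_{i,j} = k(x_i, x_j), \qquad k(x, X)_i = k(x, x_i).
\]
As an illustrative sketch in Python/NumPy (not from the talk; the Gaussian kernel and lengthscale are assumptions made purely for the example):

```python
import numpy as np

def gaussian_kernel(x, xp, lengthscale=0.2):
    # Illustrative kernel choice: k(x, x') = exp(-|x - x'|^2 / (2 l^2)).
    return np.exp(-((x - xp) ** 2) / (2.0 * lengthscale ** 2))

def kernel_interpolant(X, Y, kernel):
    # Returns the map x -> k(x, X) k(X, X)^{-1} Y, the kernel interpolant of the data (X, Y).
    K = kernel(X[:, None], X[None, :])        # n-by-n matrix with entries k(x_i, x_j)
    coeffs = np.linalg.solve(K, Y)            # k(X, X)^{-1} Y
    return lambda x: kernel(np.atleast_1d(x)[:, None], X[None, :]) @ coeffs

# Toy usage: interpolate samples of a target function.
X = np.linspace(0.0, 1.0, 8)
Y = np.sin(2.0 * np.pi * X)
f = kernel_interpolant(X, Y, gaussian_kernel)
print(np.max(np.abs(f(X) - Y)))               # ~0 up to round-off: f satisfies f(X) = Y
```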
The second one is to approximate the target function with the following inner product involving
the feature map psi associated with the kernel k and some coefficient c living in a feature space
F, identified by minimizing its norm subject to the interpolation constraints. Therefore you can
think of kernel interpolation, also known as kriging, as linear interpolation in feature space:
the main idea is to first map the data to a Hilbert space via a possibly non-linear
feature map and then to linearly approximate the target function in that feature space.
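In formulas (standard form, since the slide is not shown), the approximation is
\[
f(x) = \big\langle \psi(x), \bar{c} \big\rangle_{\mathcal{F}}, \qquad \bar{c} = \operatorname*{arg\,min}_{c \in \mathcal{F}} \ \|c\|_{\mathcal{F}} \ \text{ subject to } \ \big\langle \psi(x_i), c \big\rangle_{\mathcal{F}} = y_i, \quad i = 1,\dots,n .
\]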
The third one is to approximate the target function with the minimizer of the following problem,
in which the norm to be minimized is the RKHS norm defined by the kernel k. Recall that, since
these definitions are equivalent, this norm also defines the kernel k.
This approach is also known as optimal recovery in numerical approximation,
and it can be traced back to the work of Micchelli and Rivlin.
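The variational problem in question is presumably the usual minimum-RKHS-norm interpolation
\[
f = \operatorname*{arg\,min}_{g \in \mathcal{H}} \ \|g\|_{k} \ \text{ subject to } \ g(x_i) = y_i, \quad i = 1,\dots,n,
\]
whose minimizer coincides with the interpolant $k(x, X)\, k(X, X)^{-1} Y$ above.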
The fourth one is to approximate the target function by the expectation of the Gaussian process
with covariance function k conditioned on the interpolation constraints.
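In other words (standard Gaussian conditioning, stated here for completeness):
\[
f(x) = \mathbb{E}\big[\xi(x) \,\big|\, \xi(X) = Y\big] = k(x, X)\, k(X, X)^{-1} Y,
\]
which recovers the same interpolant as the three previous formulations.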
Okay, so why are kernel methods relevant to numerical approximation? Well, as observed by
Sard, Larkin, Diaconis, and many others who have investigated the intriguing similarities between Gaussian
process regression and numerical approximation, most numerical approximation methods are actually
kernel interpolation methods. For example, the cardinal splines of Schoenberg are optimal recovery
splines, that is, kernel interpolants obtained from pointwise measurements and
defined by polynomial kernels. This is also true for the polyharmonic splines of Harder, Desmarais,
and Duchon, which can be identified as optimal recovery splines. To describe this, consider the