So, hello everybody. Thank you very much, Daniel and Leon, for the kind invitation and the kind introduction. As Leon was saying, I am talking about learning with energy-based models, so let's start.
Let's assume we are given a pair of input and output variables. We will call y the input: it could be a noisy image, a set of stereo images, the images of an image sequence for computing optical flow, measurements in k-space, and so on. And x we will call the output: for example, the denoised image, the disparity image, the flow vectors, and so on.
The basic idea of energy-based models is very simple, and I guess all of you know how it works: the task of inferring x from the given data y is defined by means of an energy minimization, or optimization, approach. You say: my output x-hat is the argument that minimizes a certain energy E, which is defined on the pair of input variables y and output variables x. What the energy does is assign a value to each combination of x and y, and then of course everything goes into the design of the energy to make it meaningful. There are different algorithms that can find a minimizer; sometimes the minimizer is unique, sometimes it is not, and if the energy is non-convex we are already happy if we find a stationary point. So this heavily depends on how the energy is built: it could be discrete (later we will show an example where we do learning with a discrete energy, where x is a discrete variable), it could be convex or non-convex, smooth or non-smooth, and so on.
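As a compact reference, the inference rule just described can be written as follows (the formula is implied by the talk rather than shown in it):

```latex
\hat{x} = \operatorname*{arg\,min}_{x} \; E(x, y)
```

And here is a minimal sketch of one way such a minimizer can be computed for a differentiable energy; the function names, step size, and the quadratic toy energy are illustrative assumptions, not from the talk:

```python
import numpy as np

def minimize_energy(grad_E, y, x0, tau=0.1, n_iters=500):
    """Plain gradient descent on x -> E(x, y).
    For a non-convex energy this only finds a stationary point,
    exactly as noted above."""
    x = x0.copy()
    for _ in range(n_iters):
        x = x - tau * grad_E(x, y)
    return x

# Toy example: E(x, y) = 0.5 * ||x - y||^2, so grad_E(x, y) = x - y
y = np.array([1.0, -2.0, 3.0])
x_hat = minimize_energy(lambda x, y: x - y, y, x0=np.zeros_like(y))  # approaches y
```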
So what are the main properties, or the main interesting properties, of energy-based models that would make them preferable over, say, standard feed-forward networks or other machine learning algorithms? First of all, energies can be handcrafted based on first principles. Everybody knows the famous total variation, which I will cover a little bit later. This is a typical handcrafted energy, a handcrafted regularizer, which assigns a low energy if the total amount of edges in the image is small and a higher energy if the total amount of edges is larger. And this is physics-inspired, because we know we have objects of different colors, and by taking a picture of a scene we get discontinuities between the colors exactly at object edges.
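For reference, one common way to write a total-variation regularized energy for denoising (the ROF model); the quadratic data term and the weight lambda are standard choices, stated here as an assumption rather than taken from the talk:

```latex
E(x, y) = \frac{1}{2} \sum_{i} (x_i - y_i)^2 + \lambda \sum_{i} \big\| (\nabla x)_i \big\|_2
```

The second sum is the discrete total variation of x: it is small for images with few edges and large for images with many edges, which is exactly the handcrafted behavior described above.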
But energies can also be learned from data, using supervised, self-supervised, or unsupervised learning principles, and this is basically the main focus of my talk today: how we can learn better energies from data. Energies can also allow for multiple solutions: if you have a non-convex energy, there can be multiple minimizers, and sometimes this is even preferable and occurs in nature. For example, if you have ever played around with soap bubbles or soap films: soap films are known to minimize surface area, and depending on how you handle the soap film you will end up in different local minima. So it is not always a principle of nature to really find the global minimum of an energy.
Then, solutions are usually characterized by optimality conditions, and the optimality conditions can tell you what the solution looks like: for example, that it is piecewise constant, piecewise smooth, or piecewise polynomial. So you can typically read off properties of your solutions from the energy by taking a look at the optimality conditions. Moreover, the energy by its very definition provides a quality measure for a particular candidate solution: sometimes an energy contains a least-squares term, for example, so you can directly relate the value of the energy to a mean squared error.
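As a reference (my notation, not shown in the talk): for a convex energy the minimizer is characterized by the first-order optimality condition, which for the TV energy sketched above specializes as

```latex
0 \in \partial_x E(\hat{x}, y)
\quad\Longrightarrow\quad
0 \in \hat{x} - y + \lambda \, \partial \mathrm{TV}(\hat{x})
```

and the structure of the TV subdifferential is what makes the solutions piecewise constant, which is exactly the kind of property one can read off such conditions.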
More than that, there is also a direct link to statistical modeling and Bayesian inference: by simply defining a density function p(x|y) as the exponential of the negative energy, i.e. a Gibbs distribution, there is a direct correspondence between how we design an energy and what this corresponds to in Bayesian modeling. And once you have done this, it is quite straightforward to generate samples from the distribution, which is also important in applications: if you would like to compute statistics over samples with low energies, for example, you can use Langevin or Hamiltonian Monte Carlo sampling strategies.
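For reference, the Gibbs distribution just mentioned, together with a minimal sketch of unadjusted Langevin sampling from it; the function names, the step size tau, and the Gaussian toy energy are illustrative assumptions, not from the talk:

```latex
p(x \mid y) = \frac{\exp\!\big(-E(x, y)\big)}{Z(y)},
\qquad
Z(y) = \int \exp\!\big(-E(x, y)\big)\, dx
```

```python
import numpy as np

def langevin_sample(grad_E, y, x0, tau=1e-3, n_steps=1000, rng=None):
    """Unadjusted Langevin algorithm: produces an approximate sample
    from p(x|y) proportional to exp(-E(x, y)), given grad_x E."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - tau * grad_E(x, y) + np.sqrt(2.0 * tau) * noise
    return x

# Toy example: E(x, y) = 0.5 * ||x - y||^2 yields samples roughly ~ N(y, I)
y = np.array([1.0, -2.0, 3.0])
sample = langevin_sample(lambda x, y: x - y, y, x0=np.zeros_like(y))
```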
It is also very easy to introduce additional latent variables: for example, you can define your energy E(x, y) as the minimum over a latent variable z of an energy defined in terms of the three variables. In the famous Mumford-Shah functional, for instance, the edges are encoded via a phase field in the Ambrosio-Tortorelli formulation.
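For reference (my reconstruction; the transcript breaks off here), the latent-variable construction and one common form of the Ambrosio-Tortorelli approximation of the Mumford-Shah functional, with z the phase field and alpha, beta, epsilon illustrative weights:

```latex
E(x, y) = \min_{z} \tilde{E}(x, y, z),
\qquad
\tilde{E}(x, y, z)
= \frac{1}{2} \sum_i (x_i - y_i)^2
+ \alpha \sum_i z_i^2 \, \big\|(\nabla x)_i\big\|_2^2
+ \beta \sum_i \Big( \varepsilon \, \big\|(\nabla z)_i\big\|_2^2 + \frac{(1 - z_i)^2}{4 \varepsilon} \Big)
```

Here z drops towards zero near edges, switching off the smoothness penalty there, which is the phase-field encoding of the edge set.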
Thomas Pock on "Learning with energy-based models":
In this talk, I will show how to use learning techniques to significantly improve energy-based models. I will start by showing that even for the simplest models, such as total variation, one can greatly improve the accuracy of the numerical approximation by learning the "best" discretization within a class of consistent discretizations. Then I will move on to more expressive models and show how they can be learned in order to give state-of-the-art performance for image reconstruction problems such as denoising, super-resolution, MRI, and CT. Finally, I will show how energy-based models for image labeling, such as Markov random fields, can be used in the framework of deep learning.