So, hello everybody. Thank you very much, Daniel and Leon, for the kind invitation and the kind introduction. As Leon was saying, I am talking about learning with energy-based models, so let's start.
Let's assume we are given a pair of input and output variables. We will call y the input: it could be a noisy image, a set of stereo images, the images of an image sequence for computing optical flow, measurements in k-space, and so on. And x we will call the output: for example, the denoised image, the disparity image, the flow vectors, and so on.
The basic idea of energy-based models is very simple, and I guess all of you know how it works: the task of inferring x from the given data y is defined by means of an energy minimization, or optimization, approach. You say: my output x-hat is the argument that minimizes a certain energy E, which is defined on the pair of input variables y and output variables x. What the energy does is assign a value to each combination of x and y, and then of course everything goes into the design of the energy to make it meaningful. There are different algorithms that can find a minimizer; sometimes the minimizer is unique, sometimes it is not, and if the energy is non-convex we are already happy if we find a stationary point. So this heavily depends on how the energy is built: it could be discrete (later we will show an example where we do learning with a discrete energy, where x is a discrete variable), it could be convex or non-convex, smooth or non-smooth, and so on.
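As a compact reference, the inference rule just described can be written as follows (the formula is implied by the talk rather than shown in it):

```latex
\hat{x} = \operatorname*{arg\,min}_{x} \; E(x, y)
```

And here is a minimal sketch of one way such a minimizer can be computed for a differentiable energy; the function names, step size, and the quadratic toy energy are illustrative assumptions, not from the talk:

```python
import numpy as np

def minimize_energy(grad_E, y, x0, tau=0.1, n_iters=500):
    """Plain gradient descent on x -> E(x, y).
    For a non-convex energy this only finds a stationary point,
    exactly as noted above."""
    x = x0.copy()
    for _ in range(n_iters):
        x = x - tau * grad_E(x, y)
    return x

# Toy example: E(x, y) = 0.5 * ||x - y||^2, so grad_E(x, y) = x - y
y = np.array([1.0, -2.0, 3.0])
x_hat = minimize_energy(lambda x, y: x - y, y, x0=np.zeros_like(y))  # approaches y
```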
So what are the main properties, or the main interesting properties, of energy-based models that would make them preferable over, say, standard feed-forward networks or other machine learning algorithms? First of all, energies can be handcrafted based on first principles. Everybody knows the famous total variation, which I will cover a little bit later. This is a typical handcrafted energy, a handcrafted regularizer, which assigns a low energy if the total amount of edges in the image is small and a higher energy if the total amount of edges is larger. And this is physics-inspired, because we know we have objects of different colors, and by taking a picture of a scene we get discontinuities between the colors exactly at object edges.
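For reference, one common way to write a total-variation regularized energy for denoising (the ROF model); the quadratic data term and the weight lambda are standard choices, stated here as an assumption rather than taken from the talk:

```latex
E(x, y) = \frac{1}{2} \sum_{i} (x_i - y_i)^2 + \lambda \sum_{i} \big\| (\nabla x)_i \big\|_2
```

The second sum is the discrete total variation of x: it is small for images with few edges and large for images with many edges, which is exactly the handcrafted behavior described above.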
But energies can also be learned from data, using supervised, self-supervised, or unsupervised learning principles, and this is basically the main focus of my talk today: how we can learn better energies from data. Energies can also allow for multiple solutions: if you have a non-convex energy, there can be multiple minimizers, and sometimes this is even preferable and occurs in nature. For example, if you have ever played around with soap bubbles or soap films: soap films are known to minimize surface area, and depending on how you handle the soap film you will end up in different local minima. So it is not always a principle of nature to really find the global minimum of an energy.
Then, solutions are usually characterized by optimality conditions, and the optimality conditions can tell you what the solution looks like: for example, that it is piecewise constant, piecewise smooth, or piecewise polynomial. So you can typically read off properties of your solutions from the energy by taking a look at the optimality conditions. Moreover, the energy by its very definition provides a quality measure for a particular candidate solution: sometimes an energy contains a least-squares term, for example, so you can directly relate the value of the energy to a mean squared error.
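As a reference (my notation, not shown in the talk): for a convex energy the minimizer is characterized by the first-order optimality condition, which for the TV energy sketched above specializes as

```latex
0 \in \partial_x E(\hat{x}, y)
\quad\Longrightarrow\quad
0 \in \hat{x} - y + \lambda \, \partial \mathrm{TV}(\hat{x})
```

and the structure of the TV subdifferential is what makes the solutions piecewise constant, which is exactly the kind of property one can read off such conditions.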
More than that, there is also a direct link to statistical modeling and Bayesian inference: by simply defining a density function p(x|y) as the exponential of the negative energy, i.e. a Gibbs distribution, there is a direct correspondence between how we design an energy and what this corresponds to in Bayesian modeling. And once you have done this, it is quite straightforward to generate samples from the distribution, which is also important in applications: if you would like to compute statistics over samples with low energies, for example, you can use Langevin or Hamiltonian Monte Carlo sampling strategies.
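For reference, the Gibbs distribution just mentioned, together with a minimal sketch of unadjusted Langevin sampling from it; the function names, the step size tau, and the Gaussian toy energy are illustrative assumptions, not from the talk:

```latex
p(x \mid y) = \frac{\exp\!\big(-E(x, y)\big)}{Z(y)},
\qquad
Z(y) = \int \exp\!\big(-E(x, y)\big)\, dx
```

```python
import numpy as np

def langevin_sample(grad_E, y, x0, tau=1e-3, n_steps=1000, rng=None):
    """Unadjusted Langevin algorithm: produces an approximate sample
    from p(x|y) proportional to exp(-E(x, y)), given grad_x E."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - tau * grad_E(x, y) + np.sqrt(2.0 * tau) * noise
    return x

# Toy example: E(x, y) = 0.5 * ||x - y||^2 yields samples roughly ~ N(y, I)
y = np.array([1.0, -2.0, 3.0])
sample = langevin_sample(lambda x, y: x - y, y, x0=np.zeros_like(y))
```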
It is also very easy to introduce additional latent variables: for example, you can define your energy E(x, y) as the minimum over a latent variable z of an energy defined in terms of the three variables. In the famous Mumford-Shah functional, for instance, the edges are encoded via a phase field in the Ambrosio-Tortorelli formulation.
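For reference (my reconstruction; the transcript breaks off here), the latent-variable construction and one common form of the Ambrosio-Tortorelli approximation of the Mumford-Shah functional, with z the phase field and alpha, beta, epsilon illustrative weights:

```latex
E(x, y) = \min_{z} \tilde{E}(x, y, z),
\qquad
\tilde{E}(x, y, z)
= \frac{1}{2} \sum_i (x_i - y_i)^2
+ \alpha \sum_i z_i^2 \, \big\|(\nabla x)_i\big\|_2^2
+ \beta \sum_i \Big( \varepsilon \, \big\|(\nabla z)_i\big\|_2^2 + \frac{(1 - z_i)^2}{4 \varepsilon} \Big)
```

Here z drops towards zero near edges, switching off the smoothness penalty there, which is the phase-field encoding of the edge set.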
Thomas Pock on "Learning with energy-based models":
In this talk, I will show how to use learning techniques to significantly improve energy-based models. I will start by showing that even for the simplest models, such as total variation, one can greatly improve the accuracy of the numerical approximation by learning the "best" discretization within a class of consistent discretizations. Then I will move on to more expressive models and show how they can be learned in order to give state-of-the-art performance for image reconstruction problems such as denoising, super-resolution, MRI, and CT. Finally, I will show how energy-based models for image labeling, such as Markov random fields, can be used in the framework of deep learning.