Okay, thanks a lot, Martin.
It's a great pleasure to be here.
It's a new experience for me.
It's the first time I'm doing this.
I hope everything will go well.
So I wanted to do some kind of overview of optimal transport from a computational perspective.
And also highlight some recent work that we have been doing about trying to
scale optimal transport to high dimensions, both in terms of computation time and also
in terms of statistical performance.
It's joint work with a lot of people, in particular, of course, with Marco Cuturi,
who is my main collaborator on this line of ideas.
And it's also in large part the PhD work of Aude Genevay, at least for the machine learning
part.
All right, before starting, just some shameless advertisement for this book.
So feel free to go online to retrieve the PDF.
It details everything that I will say today and much more.
And it's totally free and open source.
And you also have Python codes to go with it.
All right, so first a bit of motivation.
Why optimal transport?
First, actually, why try to model problems with probability distributions, histograms,
and so on?
I think it's fairly obvious that in most of the image processing field, there is some opportunity
for using this type of model.
You can think about modeling the colors or modeling features of images, and also shapes
or neuroscience data.
I think it really makes sense to try to manipulate some kind of density of points or density
of electrical activity, for instance, in the brain.
And I would say more recently, in a high dimensional setup, in both supervised learning and unsupervised
learning, there is an interest in introducing some tools to manipulate densities.
For instance, in natural language processing, there has been a breakthrough in the last few years,
I would say, that tries to model text data sets using word embeddings.
So you see the text as a point cloud in a very high dimensional space for doing supervised
learning, for instance.
And also, for instance, for generative models of images, there is this idea of manipulating the
images as a big point cloud in a very high dimensional space.
Okay, so in all these types of problems, the key question you're asking is how to fit
some prior model, some parametric model, let's call it alpha, depending on a parameter
theta, to a huge point cloud.
It's a basic problem in image processing, shape matching, and generative models, where you
try to model the data using a deep net.
And really the goal is to fit the model, to make the model as close as possible
to the point cloud.
So you can frame this question as trying to minimize some kind of discrepancy, capital
D, between your model and your data.
It's pretty natural, and I would say many problems can be framed like this.
And the key question here is: what should be the capital D that you want to use to address
this efficiently?
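As a rough illustration of this setup (a minimal sketch, not code from the talk or from the book): the model alpha_theta and the data are two point clouds, D(alpha_theta, data) is some discrepancy between them, and fitting means searching for the theta that makes D small. In this sketch D is taken to be the transport cost of an entropic plan computed with a few Sinkhorn iterations, which is one possible choice among many; the parametric family (a simple shift by theta) is likewise only a toy example.

```python
# Minimal sketch: evaluating a discrepancy D between a parametric point cloud
# alpha_theta and a data point cloud beta, here with D chosen (for illustration)
# as the transport cost of an entropic plan computed by Sinkhorn iterations.
import numpy as np

def sinkhorn_cost(x, y, eps=0.5, n_iter=200):
    """Transport cost of the entropic plan between two uniform point clouds x and y."""
    n, m = x.shape[0], y.shape[0]
    C = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)  # squared-distance cost matrix
    K = np.exp(-C / eps)                                        # Gibbs kernel
    a, b = np.ones(n) / n, np.ones(m) / m                       # uniform marginal weights
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iter):                                     # Sinkhorn fixed-point iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]                             # entropic transport plan
    return np.sum(P * C)

# Toy fitting problem: alpha_theta is a point cloud shifted by theta; scanning theta
# shows that the discrepancy is smallest near the shift used to generate the data.
rng = np.random.default_rng(0)
beta = rng.normal(size=(200, 2)) + np.array([1.0, 0.0])         # data point cloud
base = rng.normal(size=(200, 2))                                # model samples before shifting
for theta in [0.0, 0.5, 1.0, 1.5]:
    alpha_theta = base + np.array([theta, 0.0])
    print(theta, sinkhorn_cost(alpha_theta, beta))
```

In practice one would of course minimize over theta with a gradient-based method rather than a grid scan, but the template is the same: pick a discrepancy D and push the model point cloud toward the data point cloud.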
Gabriel Peyre
Scaling Optimal Transport for High dimensional Learning
Optimal transport (OT) has recently gained a lot of interest in machine learning. It is a natural tool to compare probability distributions in a geometrically faithful way. It finds applications in both supervised learning (using geometric loss functions) and unsupervised learning (to perform generative model fitting). OT is however plagued by the curse of dimensionality, since it might require a number of samples which grows exponentially with the dimension. In this talk, I will review entropic regularization methods which define geometric loss functions approximating OT with a better sample complexity. More information and references can be found on the website of our book, Computational Optimal Transport.