Thank you very much, Leon, for the kind introduction and the invitation to speak.
Exactly. So I want to talk about structure preserving deep learning. And before I go into
the topic, let me quickly mention that this is joint work with a number of collaborators:
from the UK, Martin Benning, Christian Etmann, Carola-Bibiane Schoenlieb and Ferdia Sherry;
also with collaborators from Norway, Elena Celledoni and Brynjulf Owren, and from New Zealand,
Robert McLachlan. All right. Okay, so what is this talk about? I mean, this seminar series is all about deep learning,
so I suppose my opening statement would probably be true for almost every talk that you will be
hearing here, but it is about deep learning with guarantees. And so in our work, we focus
on a couple of key points,
which are, for instance, to develop neural networks which are stable, which are invertible,
but which may also have other properties like equivariance and invariance. Another key
property is that the training problem, for instance, should have solutions, and so on. I will talk
in a bit more detail about what these things even mean in the context of deep learning.
In this talk today, I will cover two topics. On the one hand, I will talk about
deep learning and differential equations, what their connection is, and how optimal
control appears naturally when you do supervised learning. I will also talk about stability
of neural networks and deep limits, which means: as the number of layers goes to infinity,
what happens to the network and what happens to the solution of the training process?
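To give a feel for the connection between residual networks and differential equations mentioned here, a minimal sketch of my own follows (illustrative only, not code from the talk or the papers; the tanh vector field and the layer sizes are assumptions). A residual block of the form x + h*f(x, W, b) is exactly one forward Euler step of the ODE dx/dt = f(x(t), theta(t)), and the deep limit corresponds to refining the step size h as the number of layers grows.

```python
# Minimal sketch (illustrative, not from the talk): a residual network read as
# forward Euler steps of an ODE  dx/dt = f(x(t), theta(t)).
import numpy as np

def f(x, W, b):
    """Vector field of one layer: tanh of an affine map (an assumption)."""
    return np.tanh(W @ x + b)

def resnet_forward(x0, weights, biases, h=0.1):
    """Propagate x0 through residual blocks x <- x + h * f(x, W, b),
    i.e. one explicit Euler step per layer."""
    x = x0
    for W, b in zip(weights, biases):
        x = x + h * f(x, W, b)
    return x

# Example: 10 layers acting on 2-dimensional features.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((2, 2)) for _ in range(10)]
biases = [rng.standard_normal(2) for _ in range(10)]
print(resnet_forward(np.array([1.0, -0.5]), weights, biases))
```

In this view, supervised learning of the weights becomes an optimal control problem for the parameters of the ODE, which is the perspective taken in the papers mentioned below.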
The other topic of the talk is structure preserving deep neural networks. So, having neural networks
which have guarantees, which are provably equivariant or invertible. And again, I will
talk about what this means in the context of neural networks and what the pros and cons are.
This talk is based on essentially three papers, mostly the first paper that
you see here, which is a preprint and will probably be published very soon. It is
a review paper about structure preserving deep learning, and that is what the
talk is mainly based on. Then there are two other papers that I will draw from: a paper
by myself and collaborators on deep learning as optimal control problems,
published in 2019, and a similar paper by Eldad Haber and Lars Ruthotto from 2018.
All right. And basically, all the references that you see in this talk are also
referred to in these three papers. So if you want to go into more detail, you can
check them there. Okay. So what is this all about? I guess you all know about deep learning, but let
me say a few words about it so that you can see my point of view on deep
learning. If you, for instance, want to do classification with deep learning, then one
strategy would be to use a deep neural network to transform your data and then to do linear
classification. And why is this necessary? Well, there might be some data, as you can see here
at the top left, where one class of data is red and the other class is blue. And you can
clearly see that you can't really separate the data linearly in the original domain, as the data is.
And so what you can do is try to nonlinearly transform the data.
That's what you can see here in the middle: now the two classes become linearly separable,
as you can see with the black line. And this corresponds to a nonlinear classification
in the original domain, as you can see at the top right. So by putting together a nonlinear
transform and a linear classification, you get a nonlinear classification in the original
domain. And now all the blue dots are correctly classified as blue and the
red dots are correctly classified as red.
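As a toy version of this picture, here is a minimal sketch of my own (the two synthetic classes and the hand-picked feature map are assumptions made for illustration; in deep learning the transform is learned rather than chosen by hand): data that cannot be separated by a line in the original two-dimensional domain becomes linearly separable after a nonlinear transform.

```python
# Minimal sketch (illustrative): a fixed nonlinear feature map makes
# two classes linearly separable that are not separable in the original 2-D domain.
import numpy as np

rng = np.random.default_rng(0)

# Red class: points near the origin; blue class: points on a surrounding circle.
red = 0.5 * rng.standard_normal((100, 2))
blue = rng.standard_normal((100, 2))
blue = 2.0 * blue / np.linalg.norm(blue, axis=1, keepdims=True)

def transform(x):
    """Nonlinear feature map: append the squared radius as a third feature."""
    return np.column_stack([x, (x ** 2).sum(axis=1)])

def classify(x, threshold=2.0):
    """Linear rule in the transformed space: threshold the third feature."""
    return transform(x)[:, 2] > threshold   # True = blue, False = red

print("blue correctly classified:", classify(blue).mean())
print("red correctly classified:", (~classify(red)).mean())
```

A linear threshold in the transformed space, here on the squared radius, corresponds to a nonlinear (circular) decision boundary back in the original domain.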
Okay, so this nonlinear transformation you can do with a deep neural network. And often you see a picture of the deep neural network with circles
and arrows: you have an input layer, and this is transformed into some hidden layers
a few times, and then there is the output layer. The input layer could be your data; if
you want to classify, let's say, cats and dogs, the input layer could be an image of a
dog or a cat. This is then transformed, and the output layer, which in this case
would be a one-dimensional label, would just be a scalar or some other probability vector
that says: is this a dog or is this a cat? And so this has been invented a long, long time ago ...
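For concreteness, here is a minimal sketch of my own of that circles-and-arrows picture (the layer sizes, the ReLU activations and the two-class softmax output are assumptions for illustration): an input layer holding the flattened image, a few hidden layers, and an output layer that returns a probability vector over the two classes.

```python
# Minimal sketch (illustrative): a small feedforward classifier with an input
# layer (flattened image), two hidden layers, and a two-class softmax output.
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def forward(image, layers):
    """Apply affine maps with ReLU on the hidden layers, softmax at the output."""
    x = image.ravel()                   # input layer: the flattened image
    for W, b in layers[:-1]:
        x = np.maximum(W @ x + b, 0)    # hidden layers
    W, b = layers[-1]
    return softmax(W @ x + b)           # output layer: probability vector

rng = np.random.default_rng(0)
sizes = [28 * 28, 64, 32, 2]            # input -> two hidden layers -> two classes
layers = [(rng.standard_normal((m, n)) * 0.1, np.zeros(m))
          for n, m in zip(sizes[:-1], sizes[1:])]
print(forward(rng.standard_normal((28, 28)), layers))  # e.g. [P(cat), P(dog)]
```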
Accessible via: Open access
Duration: 01:07:37 min
Recording date: 2021-01-27
Uploaded on: 2021-02-01 12:09:02
Language: en-US
Matthias Ehrhardt on "Structure Preserving Deep Learning"
Over the past few years, deep learning has risen to the foreground as a topic of massive interest, mainly as a result of successes obtained in solving large-scale image processing tasks. There are multiple challenging mathematical problems involved in applying deep learning: most deep learning methods require the solution of hard optimisation problems, and a good understanding of the trade-off between computational effort, amount of data and model complexity is required to successfully design a deep learning approach for a given problem. A large amount of progress made in deep learning has been based on heuristic explorations, but there is a growing effort to mathematically understand the structure in existing deep learning methods and to systematically design new deep learning methods to preserve certain types of structure in deep learning. In this talk, we review a number of these directions: some deep neural networks can be understood as discretisations of
dynamical systems, neural networks can be designed to have desirable properties such as invertibility or group equivariance, and new algorithmic frameworks based on conformal Hamiltonian systems and Riemannian manifolds to solve the optimisation problems have been proposed. This is joint work with Martin Benning, Elena Celledoni, Christian Etmann, Robert McLachlan, Brynjulf Owren, Carola-Bibiane Schoenlieb and Ferdia Sherry.