Okay, good afternoon, thank you for coming.
Today we are particularly happy: we have two lectures, the first one by Professor Holger
Rauhut. Then we have a coffee break in the department, and then there will
be the second lecture, a colloquium lecture of the department, by Professor
Rauhut. I think you all know Holger Rauhut; he is a professor at LMU Munich. Before that, until
basically one year ago or so, he was a professor in Aachen. Before that he got his Ph.D.
in Bavaria, at the Technical University of Munich, some 20 years back. He is very well known
in different areas. Probably you all know his book on sparse recovery or, as one says,
compressive sensing. More recently he has been very much engaged in the analysis
of machine learning and artificial intelligence algorithms, and this is the main topic of
his chair nowadays, and also the topic of this lecture, which will hopefully explain
the main reasons behind the celebrated implicit bias phenomenon in machine
learning. Thank you for joining.
Thanks very much for the kind invitation and introduction. This is about the implicit bias
phenomenon in deep learning, but let's first go one step back. I guess we have all seen
the tremendous success of deep learning or artificial intelligence in general, and deep
learning is basically the key methodology that makes these breakthroughs work, like
image recognition, sound recognition, analysis of social networks, ChatGPT, all these kinds
of things. The basic construction of these systems uses a lot of
mathematics, and this seems to work, but so far we haven't gotten to a situation
where we can really say we fully understand what's going on and why actually this is working,
so I will try to explain some of these phenomena, but we are still at the very beginning here.
Yeah, so as a mathematician, one would ask oneself, like okay, this is sort of constructed using
mathematics, so we should also be able to understand something using mathematics, and in particular,
can we actually prove something about deep learning? Now, there are several mathematical aspects
here. One is optimization: machine learning usually uses some kind of training data in order to adapt
a certain system to the real world. We do not do this by thinking about how the world works
and then figuring out principles, as in physics, but rather by example: we give the system a lot of
examples, and the system has to figure out the patterns. The key methodology to fit these models
to data is by setting up an optimization problem and then solving it, but in particular, in deep learning,
this all leads to non-convex optimization problems, which are notoriously hard to understand, and so the question is
can we provide some understanding? Now, the next item here is generalization properties, so the crucial question is
then: if we feed enough data into the system, how does it perform on new data?
So we want to predict something, but we only have seen certain amounts of data and not seen every possible configuration
or possible situation, so the question is how do we perform on unseen data? Then there are questions of approximation theory,
stability properties, or designing certain networks for a specific task, but I will focus on these first two aspects,
optimization and generalization. One particular phenomenon I talk about is this implicit bias phenomenon,
and this will highlight also the role of sparsity or more generally networks of low complexity.
Okay, so what is a neural network? I guess most of you should have seen something like that.
So a network is built up of layers, and these layers are simple affine functions composed with a non-linearity which acts component-wise.
And so what we have to do, we have to adapt these matrices and these offsets,
such that the whole network, which is a composition of these layers, works well for certain tasks.
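As a minimal sketch of this construction (assuming NumPy; the layer sizes, the ReLU nonlinearity, and all names here are illustrative, not the specific network from the lecture):

```python
import numpy as np

def relu(z):
    # the non-linearity, applied component-wise
    return np.maximum(z, 0.0)

def forward(x, weights, biases):
    """Evaluate a feed-forward network: a composition of layers,
    each an affine map (matrix plus offset) followed by the non-linearity.
    """
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = relu(W @ a + b)          # affine map, then component-wise non-linearity
    # the last layer is typically purely affine
    return weights[-1] @ a + biases[-1]

# a tiny 3 -> 4 -> 2 network with random parameters (the objects to be adapted)
rng = np.random.default_rng(0)
weights = [rng.standard_normal((4, 3)), rng.standard_normal((2, 4))]
biases = [rng.standard_normal(4), rng.standard_normal(2)]
y = forward(rng.standard_normal(3), weights, biases)
print(y.shape)  # (2,)
```

Training then means adjusting the entries of `weights` and `biases` so that this composition performs well on the given task.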
Okay, so this is just one simple example. There are all kinds of networks; you can impose structure on these matrices and so on.
I will not go much into different architectures, but you can imagine that there are lots of possibilities to design these networks,
and that's the art in practice: finding architectures which work well for the problem at hand.
Okay, so the training works like this: you are given input-output pairs. The x1, x2, and so on are inputs, and the y1 up to ym are labels.
And so what we want to do, we want to find a network such that for a given input, it basically reproduces the output, and hopefully does this also then on new data.
So what one usually does is set up a so-called loss functional: one starts with a loss function, which simply measures how far the output of the network on a given input is from the given label,
and then adds this up over all data points. Intuitively, if we minimize this over the parameters, we should get a neural network which, at least on these data, produces more or less the given outputs.
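A sketch of such a loss functional, with hypothetical names throughout (a squared-error loss, and a toy linear model standing in for the network):

```python
import numpy as np

def squared_loss(theta, data, model):
    """Empirical loss functional: sum over all pairs (x_i, y_i) of the
    per-sample loss || model(x_i, theta) - y_i ||^2.
    """
    return sum(np.sum((model(x, theta) - y) ** 2) for x, y in data)

# toy example: a linear "network" and two labeled data points
model = lambda x, theta: theta @ x
data = [(np.array([1.0, 0.0]), np.array([1.0])),
        (np.array([0.0, 1.0]), np.array([2.0]))]

theta = np.array([[1.0, 2.0]])
print(squared_loss(theta, data, model))  # 0.0 -- these parameters fit the data exactly
```

Minimizing `squared_loss` over `theta` is exactly the optimization problem described above; for a deep network this problem becomes non-convex.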
Okay, and so in practice, one uses rather simple algorithms, I mean first-order methods. We simply start at a certain point, the initialization, and then compute the gradient of the loss with respect to all of these matrices and offsets.
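A sketch of the resulting first-order method, plain gradient descent, on a simple quadratic stand-in for the training loss (the function names and step size are illustrative; in deep learning the gradient would be computed by backpropagation, and the loss is non-convex):

```python
import numpy as np

def gradient_descent(grad, theta0, step=0.1, n_steps=500):
    """First-order method: repeatedly step against the gradient of the loss."""
    theta = theta0
    for _ in range(n_steps):
        theta = theta - step * grad(theta)   # one first-order update
    return theta

# stand-in loss L(theta) = ||A theta - y||^2, with gradient 2 A^T (A theta - y)
A = np.array([[1.0, 0.0],
              [1.0, 1.0]])
y = np.array([1.0, 2.0])
grad = lambda th: 2.0 * A.T @ (A @ th - y)

theta = gradient_descent(grad, np.zeros(2))
print(np.round(theta, 3))  # close to the minimizer [1. 1.]
```

On the actual non-convex training loss the same iteration is run, but which of the many minimizers it selects is exactly where the implicit bias phenomenon enters.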
Presenters
Prof. Dr. Holger Rauhut
Prof. Dr. Christian Bär
Accessible via
Open access
Duration
01:50:23 min
Recording date
2024-12-03
Uploaded on
2024-12-19 23:46:04
Language
en-US
Event: FAU MoD Lecture
Event type: On-site / Online
Organized by: FAU MoD, the Research Center for Mathematics of Data at Friedrich-Alexander-Universität Erlangen-Nürnberg (Germany)
Speaker: Prof. Dr. Holger Rauhut
Affiliation: Mathematisches Institut der Universität München (Germany)
Speaker: Prof. Dr. Christian Bär
Affiliation: Institut für Mathematik, Universität Potsdam (Germany)