FAU MoD Course (4/5): From Condensation to Loss Landscape Analysis

So I think I can start now. Today is actually our second lecture that is closely related to the theoretical understanding of the condensation phenomenon. As you can see, among all five lectures, three are about condensation. That is because condensation is, on the one hand, a really highly nonlinear phenomenon, and on the other hand a highly complicated one. First of all, it is a phenomenon that you can observe during the training dynamics, so it is a dynamical phenomenon. On the other hand, if you ask why we observe these dynamics for the gradient flow over the loss landscape, there must be some structures in the loss landscape that lead you to this condensation phenomenon. And then, as a third part, if condensation really happens, it must give you a certain benefit on a certain class of targets; that is the topic I will talk more about tomorrow. So today I am focusing on the loss landscape that helps the condensation.

Yeah, condensation is what I told you about yesterday: it is a phenomenon where different neurons in the same layer have a tendency to align with one another. That is the phenomenon of condensation. It means that when neurons in the same layer condense with one another, we can cluster the neurons into different groups, so the effective number of neurons in that layer is actually less than what is actually there.
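To make this concrete, here is a toy numpy sketch of what condensation looks like operationally: train a small one-hidden-layer tanh network from a small-variance initialization and then measure how strongly the hidden neurons' weight directions align. This is my own illustration under assumed toy choices (data, width, learning rate), not code from the course.

```python
# Toy sketch: small-variance initialization and a condensation check.
# All specific choices below (data, width, gamma, lr) are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Tiny 1-D regression problem.
x = np.linspace(-1.0, 1.0, 20).reshape(-1, 1)   # inputs, shape (n, 1)
y = np.sin(np.pi * x)                            # targets, shape (n, 1)

m = 20                                           # hidden neurons
gamma = 1e-2                                     # small init scale
W = gamma * rng.standard_normal((m, 1))          # input weights
b = gamma * rng.standard_normal(m)               # biases
a = gamma * rng.standard_normal(m)               # output weights

lr = 0.05
for step in range(20000):
    h = np.tanh(x @ W.T + b)                     # hidden activations, (n, m)
    r = h @ a - y.ravel()                        # residual, (n,)
    # Gradients of the mean-squared loss 0.5 * mean(r^2).
    dz = np.outer(r, a) * (1 - h**2) / len(x)    # grad w.r.t. pre-activations
    a -= lr * (h.T @ r) / len(x)
    W -= lr * (dz.T @ x)
    b -= lr * dz.sum(axis=0)

# Condensation check: cosine similarity of the (weight, bias) direction of
# every pair of hidden neurons. Values near +/-1 mean the neurons align,
# i.e. the effective number of distinct neurons is smaller than m.
u = np.concatenate([W, b[:, None]], axis=1)
u /= np.linalg.norm(u, axis=1, keepdims=True) + 1e-12
cos = u @ u.T
print("fraction of pairs with |cos| > 0.99:",
      (np.abs(cos[np.triu_indices(m, 1)]) > 0.99).mean())
```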

As I told you in the first lecture, if we are lucky enough to see a very informative piece of that new object, then we have a strong feeling that there should be neighboring pieces we could uncover. For example, you see a very clear condensation phenomenon, and because you do not understand it, you try to uncover the different neighboring pieces in order to get a better picture of why there is condensation, why we observe this piece here: it is because it belongs to a bigger picture. Condensation is a kind of central piece that really helps you uncover more and more pieces of this object.

However, if you look at other phenomena, many of them were also very prominent in their time, particularly double descent, which was very influential in the statistics community. Yet these phenomena are limited, not because they cannot be observed in certain situations or cannot be theorized, but because they cannot help you uncover more about the real object that is there. Condensation is different, and later, through all three lectures about condensation, I hope you get a better feeling that condensation happens because there really is something there, and that we can obtain a better picture through all these works on condensation.

Now, since we mainly care about the loss landscape: what is a loss landscape? It is very simple. And no, it does not matter what the training data or the target is: for the neural network itself, if you initialize with a small variance, you always observe condensation, and the reason lies in the loss landscape. You can see that in this loss landscape the loss part, this little ℓ, is trivial, because usually we use some convex loss, for example L2, so the distance between f and y is convex, and that is nothing surprising. In which sense, then, do we say this loss landscape is non-convex? It is because this f_θ is nonlinear. If we put any linear model there, it just remains convex, and there is nothing surprising. However, if we use a neural network as the model, as a parameterization of these functions, then we arrive at a loss landscape that can be nasty, or at least non-convex. So the loss landscape is just a function: a high-dimensional function of the loss, or we can say the empirical risk, with respect to the parameter θ.
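Written out, the object being discussed is simply the empirical risk as a function of the parameters. A minimal sketch in standard notation (the symbols f_θ, ℓ, and the training set are assumed to match the discussion above):

```latex
% Empirical risk over a training set S = {(x_i, y_i)}_{i=1}^{n}:
\[
  R_S(\theta) \;=\; \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(f_\theta(x_i),\, y_i\bigr),
  \qquad \text{e.g. } \ell(f, y) = \tfrac{1}{2}\lvert f - y\rvert^{2}.
\]
% ell is convex in its first argument f, but theta -> R_S(theta) is in
% general non-convex, because theta -> f_theta is nonlinear for a neural
% network (for a linear model it would remain convex).
```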

So why do we call it a loss landscape instead of just a function? It is because we have this kind of picture in our mind: the optimization is trying to get to some minima among all these hills and obstacles, trying to reach the place with the minimum loss. This picture is really important; even though the thing is high-dimensional, we still keep a similar picture in our mind, and people also try to plot these kinds of 2D visualizations, although you know that this is never true: it is a high-dimensional function, and you are never able to use a 2D visualization to fully understand it, but people still try.
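Those 2D pictures are usually produced by slicing: fix the parameters θ* and evaluate the loss on a plane spanned by two random directions, L(α, β) = R_S(θ* + α·d1 + β·d2). Below is a self-contained numpy sketch of that recipe on a toy network; the specifics (data, width, the stand-in θ*) are my own assumptions for illustration, not the lecturer's setup.

```python
# Toy sketch: a 2-D slice of a high-dimensional loss landscape.
import numpy as np

rng = np.random.default_rng(1)

# Toy data and a tiny tanh network whose parameters live in one flat vector.
x = np.linspace(-1, 1, 20).reshape(-1, 1)
y = np.sin(np.pi * x).ravel()
m = 10                                          # hidden width

def loss(theta):
    W = theta[:m].reshape(m, 1)                 # input weights
    b = theta[m:2 * m]                          # biases
    a = theta[2 * m:]                           # output weights
    f = np.tanh(x @ W.T + b) @ a                # network output, shape (n,)
    return 0.5 * np.mean((f - y) ** 2)

theta_star = 0.5 * rng.standard_normal(3 * m)   # stand-in for trained params
d1 = rng.standard_normal(3 * m); d1 /= np.linalg.norm(d1)
d2 = rng.standard_normal(3 * m); d2 /= np.linalg.norm(d2)

# Evaluate L(alpha, beta) = loss(theta* + alpha*d1 + beta*d2) on a grid.
alphas = np.linspace(-1, 1, 41)
betas = np.linspace(-1, 1, 41)
grid = np.array([[loss(theta_star + al * d1 + be * d2) for al in alphas]
                 for be in betas])
print(grid.shape, grid.min(), grid.max())       # feed into e.g. plt.contourf
```

A faithful picture would need all 3m parameter dimensions; this slice is exactly the "never true" 2D view the lecture warns about, useful only as a rough visual aid.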

Accessible via: Open Access
Duration: 01:30:05 min
Recording date: 2025-05-07
Uploaded on: 2025-05-07 20:49:39
Language: en-US

Date: Fri. – Thu. May 2 – 8, 2025
FAU MoD Course: Towards a Mathematical Foundation of Deep Learning: From Phenomena to Theory
Session 4: From Condensation to Loss Landscape Analysis
Speaker: Prof. Dr. Yaoyu Zhang
Affiliation: Institute of Natural Sciences & School of Mathematical Sciences, Shanghai Jiao Tong University
Organizer: FAU MoD, Research Center for Mathematics of Data at FAU, Friedrich-Alexander-Universität Erlangen-Nürnberg
Overall, this course serves as a gateway to the vibrant field of deep learning theory, inspiring participants to contribute fresh perspectives to its advancement and application.
Session Titles:
1. Mysteries of Deep Learning
2. Frequency Principle/Spectral Bias
3. Condensation Phenomenon
4. From Condensation to Loss Landscape Analysis
5. From Condensation to Generalization Theory
 
