So, as far as organization goes, my name is Tobias Würfl.
I'm a PhD student at the Pattern Recognition Lab, and today I'm going to be your substitute for Professor Maier.
So, in the last lecture you heard it would be Nissan, but that was fake news; it's me today.
So, I hope the camera is actually up and running. So today we're going to talk about loss functions and optimization.
So over the last lectures you got an introduction to deep learning, and in the last lecture we pieced this introduction together into the basic concept of a complete neural network, which covers everything you will need for the first exercise.
Now that we have covered the material for the first exercise and gained a broad overview of neural networks, we will dive into specifics.
And the first and most important of these specifics is the formulation of the loss function.
So last time you learned about the softmax loss function, which is actually a combination of an activation function and a loss function.
And this time we are going to look into what other loss functions we have and how this softmax cross-entropy is actually derived.
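To make that combination concrete, here is a minimal NumPy sketch of a softmax activation followed by a cross-entropy loss; the function names and toy numbers are my own, not from the slides.

```python
import numpy as np

def softmax(logits):
    # Activation: turn raw scores into a probability distribution.
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp)

def cross_entropy(probs, true_class):
    # Loss: negative log-likelihood of the correct class.
    return -np.log(probs[true_class])

# "Softmax loss" = softmax activation followed by cross-entropy loss.
logits = np.array([2.0, 0.5, -1.0])     # raw network outputs for 3 classes
probs = softmax(logits)
loss = cross_entropy(probs, true_class=0)
print(probs, loss)
```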
After that we will have a look into optimization.
So last time you learned that we are doing gradient descent.
This time we will look in a bit more detail at what we are doing there and at whether we can do more than plain gradient descent.
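As a rough reminder of what plain gradient descent does before we go into more detail, here is a tiny sketch; the toy objective and the function name are my own choices, not from the lecture.

```python
def gradient_descent(grad_fn, w0, learning_rate=0.1, steps=100):
    # Repeatedly step against the gradient of the loss.
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad_fn(w)
    return w

# Toy example: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_star = gradient_descent(lambda w: 2 * (w - 3.0), w0=0.0)
print(w_star)  # converges close to 3.0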
So first loss functions and then optimization.
Actually we don't need a cursor.
So first off as I said we are going to talk about loss functions.
So basically the loss function depends on what you want to do.
The loss function is not something that is set in stone and that you simply have to accept; instead, you choose it according to your problem, and you can model what you want with it.
So there is no definitive choice of what is best.
Instead it is problem dependent and a matter of what you decide to model about the problem.
But there are these two broad classes of machine learning problems: regression problems and classification problems.
Also keep in mind that I emphasize problems here, because sometimes people say
they solve a certain problem with "a regression".
That is actually the other way around, because regression is the problem setting,
and there are certain methods, which we call regression methods, that solve the regression problem.
And those two problem settings regression and classification turn out to be quite related.
So the basic difference is in regression you want to estimate a continuous output.
Like on these slides, where we have on the right side a regression problem
in which you see some data points that for now have just two simple dimensions, x1 and x2.
And we try to estimate a continuous-valued variable based on those two variables,
which is the slope of the line you see.
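As a small illustration of what such a regression problem looks like in code, here is a sketch that fits a linear model to synthetic two-dimensional data with a squared-error (L2) loss; the data, the names, and the closed-form least-squares solver are my own choices, not from the slides.

```python
import numpy as np

# Toy regression: predict a continuous target y from two features x1, x2.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))                  # data points with dimensions x1, x2
true_w = np.array([1.5, -0.5])
y = X @ true_w + 0.1 * rng.normal(size=50)    # continuous-valued output

def l2_loss(w, X, y):
    # Squared-error regression loss: mean of (prediction - target)^2.
    return np.mean((X @ w - y) ** 2)

# A closed-form least-squares fit recovers the line's parameters.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w_hat, l2_loss(w_hat, X, y))
```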
Whereas in classification we want to estimate a discrete property.
Like the discrete property outlined in these images: you have, for example, those red dots and those blue dots,
and they correspond to different discrete classes.
Like my classic example is cats and dogs.
So there are only two discrete values, whereas in regression you have continuous values.
And now you still want to estimate that line, but this time it is the line which sets apart those two classes.
So the main point you want to take away here is: regression means a continuous-valued output, classification means a discrete-valued output.
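And as a counterpart to the regression sketch above, here is a minimal classification sketch with discrete class labels and a fixed separating line; the synthetic data and the chosen line are my own, not from the slides.

```python
import numpy as np

# Toy classification: two discrete classes (0 = "cat", 1 = "dog") in 2D.
rng = np.random.default_rng(1)
X0 = rng.normal(loc=[-1.0, -1.0], size=(25, 2))   # class 0 points ("red dots")
X1 = rng.normal(loc=[+1.0, +1.0], size=(25, 2))   # class 1 points ("blue dots")
X = np.vstack([X0, X1])
labels = np.array([0] * 25 + [1] * 25)            # discrete-valued targets

# A separating line w . x + b = 0; classify by which side a point falls on.
w, b = np.array([1.0, 1.0]), 0.0
predictions = (X @ w + b > 0).astype(int)
print("accuracy:", np.mean(predictions == labels))
```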
And depending on the problem setting we will have different loss functions.
And we will actually start with regression, but that comes a bit later on.
But first I want to make an important disclaimer.
So last time we told you about this softmax loss.
And we actually told you that it is a misnomer.
That softmax is actually an activation function and the loss function is cross entropy loss.
So what is going on there? How can I tell those two things apart?
What is an activation function and what is a loss function?
So basically, you wouldn't confuse the activation functions in between the layers with a loss function,
because they are just inside the neural network.
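To sketch that distinction in code: activation functions act between layers inside the network, while the loss function sits at the very end and compares the output with a target. The two-layer toy network, the ReLU activation, and the squared-error loss below are just stand-ins of my own choosing, not the lecture's exact setup.

```python
import numpy as np

def relu(x):
    # Activation function: applied between layers, inside the network.
    return np.maximum(0.0, x)

def squared_error_loss(prediction, target):
    # Loss function: applied at the very end, comparing output and target.
    return np.mean((prediction - target) ** 2)

rng = np.random.default_rng(2)
x = rng.normal(size=3)                                     # network input
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(1, 4))  # two weight layers

hidden = relu(W1 @ x)          # activation between the two layers
output = W2 @ hidden           # network output
print(squared_error_loss(output, target=np.array([1.0])))
```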