So it's 8:30, let's begin with the lecture.
Good morning everyone.
As you can see, I'm still not Professor Maya, so you have the pleasure of listening to me
again today and maybe next week.
So be excited about that.
Maybe a short recap of what we did or what we talked about last week.
So you kind of learned about the very basic elements of a fully connected neural network.
Of course, the fully connected layer, which gives it its name.
But you also heard about activation functions, for example the ReLU, the sigmoid, or
the tanh, as well as the last activation function that you would put in front of the
loss function, the softmax.
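As a quick illustration, here is a minimal NumPy sketch of these activation functions; the function names and test values are just illustrative, not code from the lecture:

```python
import numpy as np

def relu(x):
    # ReLU: max(0, x), applied elementwise
    return np.maximum(0.0, x)

def sigmoid(x):
    # Sigmoid: squashes inputs into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Tanh: squashes inputs into (-1, 1)
    return np.tanh(x)

def softmax(z):
    # Softmax over the last axis; subtracting the max is for numerical stability
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x), sigmoid(x), tanh(x), softmax(x))
```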
And you talked about the cross-entropy loss.
And in addition to that, you discussed how to actually propagate gradients throughout
the network.
So how can you then use whatever the loss function computes to kind of get gradients
to the point where you can actually update the weights?
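To make that recap concrete, here is a small hedged sketch of the softmax cross-entropy loss for a single sample and the gradient it sends back toward the weights (for softmax followed by cross-entropy, the gradient with respect to the logits is simply the predicted probabilities minus the one-hot target); the numbers are made up for illustration:

```python
import numpy as np

def softmax(z):
    # Stable softmax over the last axis
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

# Logits of one sample and its one-hot ground-truth label (illustrative values)
logits = np.array([2.0, 0.5, -1.0])
target = np.array([1.0, 0.0, 0.0])

probs = softmax(logits)

# Cross-entropy loss: -sum_k y_k * log(p_k)
loss = -np.sum(target * np.log(probs))

# Gradient of the loss w.r.t. the logits for softmax + cross-entropy: p - y
grad_logits = probs - target

# This gradient is what backpropagation then carries through the network
print(loss, grad_logits)
```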
And today we are going to elaborate on that a little bit.
Not necessarily the backpropagation, but rather we are going to talk about how we
actually define this loss.
What is this loss telling us?
How can we theoretically reason about what we are going to use as a loss function?
And in addition to that, we are going to talk about how we can then update the weights using
the gradients that we compute with backpropagation.
So you talked about gradient descent already.
And we are going to look into variants of gradient descent or how we can actually compute
the weight update that we then use in this setting.
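As a small preview, here is a hedged sketch of the basic weight update and of a mini-batch (stochastic) variant on a toy problem; the data, model, and learning rate are placeholders, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear regression problem: y = X w_true + noise (placeholder data)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=100)

lr = 0.1  # learning rate (step size)

# (Batch) gradient descent: w <- w - lr * dL/dw, with L = mean squared error
w = np.zeros(3)
for _ in range(200):
    grad = 2.0 / len(X) * X.T @ (X @ w - y)
    w = w - lr * grad

# Stochastic / mini-batch variant: same update rule, but the gradient is
# estimated on a small random subset of the data in each step
w_sgd = np.zeros(3)
for _ in range(200):
    idx = rng.choice(len(X), size=16, replace=False)
    grad = 2.0 / len(idx) * X[idx].T @ (X[idx] @ w_sgd - y[idx])
    w_sgd = w_sgd - lr * grad

print(w, w_sgd)  # both should end up close to w_true
```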
Okay, so these are the two topics, loss functions on one hand and optimization on the other.
So let's start with loss functions, but before that, very short recap.
So today we are going to look at regression problems and classification problems.
And again, classification is where we estimate a categorical variable basically.
So we have certain samples and we are trying to estimate a discrete class label for each
of them.
In contrast to that, we have regression.
So we are not trying to discriminate distinct class labels, for example, pears and apples
that we had in the first lecture, but instead we are trying to estimate the value of a continuous
function.
So you can see that here, for example, we were trying to estimate a line through our
data.
And basically, here the challenge is to estimate the parameters of the function that we are
trying to fit.
In contrast to that, in classification, we are trying to estimate the parameters of our
classification model.
So for example, here we are trying to estimate the parameters of the decision boundary between
those two classes.
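To make this contrast concrete, here is a hedged toy sketch: a least-squares line fit for the regression case and a logistic-regression decision boundary for the two-class case; all data and parameter values are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Regression: estimate slope and intercept of a line through noisy points
x = np.linspace(0, 1, 50)
y = 2.0 * x + 1.0 + 0.1 * rng.normal(size=50)
A = np.stack([x, np.ones_like(x)], axis=1)
slope, intercept = np.linalg.lstsq(A, y, rcond=None)[0]

# Classification: estimate the parameters (w, b) of a linear decision
# boundary w^T x + b = 0 between two classes via logistic regression
X = np.concatenate([rng.normal(-1.5, 1.0, size=(50, 2)),
                    rng.normal(+1.5, 1.0, size=(50, 2))])
labels = np.concatenate([np.zeros(50), np.ones(50)])

w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid of the linear score
    grad_w = X.T @ (p - labels) / len(X)      # gradient of the mean cross-entropy
    grad_b = np.mean(p - labels)
    w -= 1.0 * grad_w
    b -= 1.0 * grad_b

print(slope, intercept)  # close to 2.0 and 1.0
print(w, b)              # parameters of the estimated decision boundary
```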
What I also want to emphasize is kind of the difference between the loss function and the
last activation function in a neural network.
This is a very important distinction that we need to make here for different reasons.
So the last activation function is applied on individual samples or on each sample that
you put into your network.
So basically, it's completely independent of any kind of collection of data that you feed into the network.
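A minimal sketch of this distinction, assuming (as is common) that the softmax is applied to each sample on its own while the loss is then averaged over a whole mini-batch; shapes and values are purely illustrative:

```python
import numpy as np

def softmax(z):
    # Applied row by row: each sample's probabilities depend only on that sample
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

# A mini-batch of 4 samples with 3 classes each (illustrative logits and labels)
logits = np.array([[ 2.0, 0.1, -1.0],
                   [ 0.3, 1.2,  0.0],
                   [-0.5, 0.0,  2.5],
                   [ 1.0, 1.0,  1.0]])
labels = np.array([0, 1, 2, 0])

probs = softmax(logits)                       # last activation: per sample
per_sample_loss = -np.log(probs[np.arange(4), labels])
batch_loss = per_sample_loss.mean()           # loss: reduced over the batch

print(probs.shape, per_sample_loss.shape, batch_loss)
```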
Deep Learning (DL) has attracted much interest in a wide range of applications such as image recognition, speech recognition and artificial intelligence, both from academia and industry. This lecture introduces the core elements of neural networks and deep learning; it comprises:
- (multilayer) perceptron, backpropagation, fully connected neural networks
- loss functions and optimization strategies
- convolutional neural networks (CNNs)
- activation functions
- regularization strategies
- common practices for training and evaluating neural networks
- visualization of networks and results
- common architectures, such as LeNet, AlexNet, VGG, GoogLeNet
- recurrent neural networks (RNN, TBPTT, LSTM, GRU)
- deep reinforcement learning
- unsupervised learning (autoencoder, RBM, DBM, VAE)
- generative adversarial networks (GANs)
- weakly supervised learning
- applications of deep learning (segmentation, object detection, speech recognition, ...)